Monday, June 25, 2012

R vs Stata Non-linear least squares!


# NLS (non-linear least squares) R simulation followed by parallel Stata simulation

# Assumptions:
# For some T0 as an element of THETA = parameter space
# NLS.1 E(y|x) = m(x,T0)
# For T1 an element of the parameter space THETA
# NLS.2 E{[[m(x,T0)-m(x,T1)]^2}>0 for all T0!=T1

# NLS must have uniform convergence in probability:
# Meaning that Max {theta} |N^-1 * sum{q(wi, theta)-E(q(w,theta))}|->0

# This is to say that the difference between q and E(q) must be bounded as P->infinity  for all thetas.

#1. q(theta,wi) = theta*wi^2, so we know that for any theta the difference between it and its expected value can always be bounded.

#2. q(theta,wi) = w/theta, might be more difficult.  If theta == 0 then theta is undefined and the difference clearly cannot be bounded.

# Let's start with 10000 observations

obs_id=1:10000

set.seed(12)

# Now let's generate a variable w with 1000 random normal draws
w = rnorm(10000)

cbind(obs_id,w)[1:50,]
# The first 50 rows look good.

u = rnorm(10000)

theta = 2

y1 = theta*w^2 + u

y2 = w/theta + u

cbind(obs_id,w, y1, y2)[1:50,]
# The first 50 rows look good.
x=w
nls(y1~A*x^2, start=list(A=1))

nls(y2~x/B, start=list(B=1))

# Both these estimators work well.

# Let's try something a bit more complicated:

y3 = y1 + y2

nls(y1~A*x^2+x/B, start=list(A=1,B=1))

# R even has difficulty finding the solution to this estimator.

#############################################################
#   Now switching back to STATA
*  (same idea as above)
*************************************************************

clear
set seed 12
set obs 10000

gen w = rnormal()

gen u = rnormal()

gen theta = 2

gen y1 = theta*w^2 + u

gen y2 = w/theta + u

nl (y1 = {A=1}*w^2)
* It seems that the algorithm in Stata is not as precise as that used in R

nl (y2 = w/{B=1})
* It seems that the algorithm in Stata is not as precise as that used in R

gen y3 = y1 + y2

nl (y3 = {A=1}*w^2 + w/{B=1})
* However, the algorithm is able to solve this non-linear problem

1 comment:

  1. nlsLM from package minpack.lm solves the problem, this link explains why: http://www.r-bloggers.com/a-better-nls/.

    ReplyDelete