## Thursday, November 8, 2012

### Power Analysis with Non-Linear Least Squares: A Simulation Approach


* Typical (easy) power analysis relies upon drawing from known distributions, often with differences in means, using estimators that are unbiased and whose variances are easy to estimate.

* However, many econometric procedures are not unbiased but only consistent, and their variances can likewise be difficult to estimate.

* Under these circumstances, simulation can be a useful way to estimate power.

* Andreas Tiemann recently brought this approach to my attention, citing an interesting article by Benjamin Arnold, Daniel Hogan, John Colford, and Alan Hubbard (2011) (http://www.biomedcentral.com/1471-2288/11/94/abstract).

* This post will estimate the sample size required to hold the chance of false rejection (Type I error) to 5% while achieving a probability of correct rejection (power, i.e. one minus the Type II error rate) of at least 80%.

* I will use the example posted on Dave Giles' blog on October 30 on non-linear least squares (http://davegiles.blogspot.com/2012/10/some-properties-of-non-linear-least.html#more).

* This code is similar to code shared with me by Andreas Tiemann and Juan José Matta.

cap program drop NLLS_sim
program define NLLS_sim, rclass

clear
* The first argument is the number of observations.
set obs `1'

* I will generate x2 as a uniform
gen x2 = runiform()

* Generate the error
gen e = rnormal()

* Generate the parameters to estimate (these are the second and third arguments following the command name).
local beta1 = `2'
local beta2 = `3'

gen y = `beta1' + (`beta2'*x2)^`beta1' + e

* Run the non-linear least squares estimation.  Note: I set the initial value of beta1 to 1 because the optimizer was having difficulty converging.
nl (y = {beta1=1} + ({beta2}*x2)^{beta1})
* I don't know of a way to extract the p-values directly, but they can be recomputed from the coefficients and standard errors.
local p1 = 2 * ttail(e(df_r), abs(_b[/beta1]/_se[/beta1]))
* Return p values and significance levels at 5%
cap return scalar p1 = `p1'
cap return scalar sig1 = `p1'<.05

local p2 = 2 * ttail(e(df_r), abs(_b[/beta2]/_se[/beta2]))
cap return scalar p2 = `p2'
cap return scalar sig2 = `p2'<.05
end
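For readers without Stata, one replication of the program above can be sketched in Python. This is only a rough analogue under my own assumptions: `scipy.optimize.curve_fit` stands in for Stata's `nl`, and the function names (`one_rep`, `rejection_rate`) are mine, not from the original post. The model, y = b1 + (b2*x2)^b1 + e, and the t-based p-value mirror the Stata code.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import t as t_dist

def model(x, b1, b2):
    # Same functional form as the Stata nl command: b1 + (b2*x)^b1
    return b1 + (b2 * x) ** b1

def one_rep(n, b1, b2, rng):
    """One simulation draw: generate data, fit by NLS, return the p-value for b2."""
    x = rng.uniform(size=n)
    y = b1 + (b2 * x) ** b1 + rng.normal(size=n)
    # Start b1 at 1, echoing the post's note about convergence trouble.
    est, cov = curve_fit(model, x, y, p0=[1.0, 0.5])
    se_b2 = np.sqrt(cov[1, 1])
    # Recompute the p-value from the t-statistic, as in the Stata code.
    tstat = est[1] / se_b2
    return 2 * t_dist.sf(abs(tstat), df=n - 2)

def rejection_rate(n, b1, b2, reps=100, seed=12345):
    """Share of replications rejecting H0: b2 = 0 at the 5% level.
    Replications where the optimizer fails (e.g. a negative base raised
    to a non-integer power) are skipped, so this is only a sketch."""
    rng = np.random.default_rng(seed)
    pvals = []
    for _ in range(reps):
        try:
            pvals.append(one_rep(n, b1, b2, rng))
        except (RuntimeError, ValueError):
            continue
    return float(np.mean([p < 0.05 for p in pvals]))
```

With a true beta2 of 1 and a few hundred observations, `rejection_rate` should come out well above one half, which is the same power calculation the simulations below perform in Stata.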

* Now let's test out our program
NLLS_sim 10 1 2
return list
* Looking good.  The p-values match the regression output, and the significance indicators flag rejections correctly.

* First I would like to warn that this can take a while.

local sig = 1
local n=2

* First let's estimate the likelihood of falsely rejecting the null, increasing the sample size until the false rejection rate falls to the 5% level.
while `sig'>.05 {
local n=`n'+1

* Number of observations:
local nobs = ceil(10^(`n'/3))

simulate sig2=r(sig2), reps(100): NLLS_sim `nobs' 1 0

qui: sum sig2
local sig=r(mean)

di "We reject the null " `sig'*100 "% of the time when beta2 = 0 (N = `nobs')"
}

local sig = 0

* We can let n start at the value from the previous simulation, since there is little point in checking smaller sample sizes.

* Now let's estimate the sample size required to correctly reject the null 80% of the time.
while `sig'<.8 {

* Number of observations:
local nobs = ceil(10^(`n'/3))

simulate sig2=r(sig2), reps(100): NLLS_sim `nobs' 1 1

qui: sum sig2
local sig=r(mean)

di "We reject the null " `sig'*100 "% of the time when beta2 = 1 (N = `nobs')"
local n=`n'+1
}
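The search strategy above, growing N along the geometric grid ceil(10^(n/3)) (10, 22, 47, 100, 216, ...) until the simulated power first reaches 80%, can be sketched on its own. To keep the sketch short and fast I swap in a one-sample t-test in place of the NLS fit; the loop logic is what this illustrates, and the names (`power_at`, `find_sample_size`) are my own, not from the post.

```python
import math
import numpy as np
from scipy.stats import ttest_1samp

def power_at(n, effect, reps, rng):
    """Simulated power of a 5%-level one-sample t-test of H0: mean = 0
    when the true mean is `effect` (a stand-in for the NLS rejection check)."""
    hits = 0
    for _ in range(reps):
        sample = rng.normal(loc=effect, scale=1.0, size=n)
        if ttest_1samp(sample, 0.0).pvalue < 0.05:
            hits += 1
    return hits / reps

def find_sample_size(effect, target=0.8, reps=200, seed=42):
    """Walk up the post's geometric grid N = ceil(10^(k/3)) until the
    simulated power first reaches `target`; return that N."""
    rng = np.random.default_rng(seed)
    k = 3  # the Stata loop effectively starts at N = ceil(10^(3/3)) = 10
    while True:
        n = math.ceil(10 ** (k / 3))
        if power_at(n, effect, reps, rng) >= target:
            return n
        k += 1
```

With an effect of half a standard deviation this should stop somewhere in the 22-to-100 range of the grid; as with the Stata loops, the answer is noisy because each power estimate rests on a finite number of replications.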