## Tuesday, July 24, 2012

### Write your own estimator and bootstrap the standard errors

* Suppose you have a new estimator that you would like to use, but you do not know how to derive its standard errors analytically.

* Deriving them for a two stage quantile regression, for instance, would be extremely challenging.
clear
set obs 1000
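
* Optional, not in the original post: fix the seed so the random draws below are reproducible.
* The value is arbitrary.
set seed 20120724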

* There is an exogenous component to w which is z
gen z = rnormal()

* There is an endogenous component to w which is v
* It is drawn from a Cauchy distribution.
* See a previous post for more information on drawing from non-standard distributions.
gen v = tan(_pi*(runiform()-.5))
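
* A quick check of the draw above (a sketch, not part of the original post):
* tan(_pi*(U-.5)) with U uniform on (0,1) is the inverse CDF of a standard Cauchy,
* so the quartiles of v should be close to -1, 0 and 1.
summarize v, detail
display "p25 = " r(p25) "  p50 = " r(p50) "  p75 = " r(p75)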

gen w = z + v

* There is also a standard additive error u in the model
* u1 is a portion of the error that is independent of w
* It is drawn from a Cauchy distribution.
gen u1 = tan(_pi*(runiform()-.5))

gen u = (v+u1*3)*5

* Thus y is generated this way
gen y = -14 + 5*w + u

* In a Cauchy distribution the median is defined but the mean and variance are not.
* Thus it is inappropriate and often infeasible to use OLS or many common estimators.
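
* To illustrate (a sketch; running_mean_u1 is a name made up for this check):
* the running mean of a Cauchy draw need not settle down as observations are added,
* while the sample median reported by summarize should stay close to zero.
gen running_mean_u1 = sum(u1)/_n
list running_mean_u1 if inlist(_n, 10, 100, 500, 1000)
summarize u1, detail
drop running_mean_u1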

* Quantile regression however is theoretically justified.

* That said:
reg y w
* The command runs and the output looks reasonable at first glance.

* But just for the sake of doing the consistent thing:
qreg y w

* We can see that both estimates of the coefficient on w are heavily biased away from the true value of 5 because w is endogenous.

* We want to estimate a two stage quantile regression: first regress w on z, then regress y on the fitted values.
qreg w z

predict w_hat

qreg y w_hat
* It looks pretty good, but we still need to estimate the standard errors.
* Those reported by qreg on the predicted values ignore the sampling variation in the first stage prediction.

* We will write a short program:

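cap program drop twostageqreg
program define twostageqreg

* Drop w_hat if it is left over from a previous call
cap drop w_hat

* First stage: median regression of w on the instrument z
qreg w z
predict w_hat

* Second stage: median regression of y on the fitted values
qreg y w_hat

end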

* Now let's bootstrap
bs: twostageqreg
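
* A hedged variation on the call above, not part of the original run:
* bootstrap draws 50 replications by default, so spelling out reps() and seed()
* makes the result more stable and reproducible.
* The values 200 and 12345 are arbitrary choices for illustration.
bootstrap _b, reps(200) seed(12345): twostageqreg

* Percentile confidence intervals do not rely on a normal approximation,
* which may be worth checking with errors as heavy tailed as these.
estat bootstrap, percentile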

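* Our two stage bootstrap looks like it is working pretty well.
* The bootstrapped standard errors are smaller than those reported by the second stage qreg alone.
* This is somewhat surprising, since the bootstrap also accounts for the variation in the first stage.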