## Monday, July 9, 2012

### Write your own bootstrap command.

* A similar function is also programmed in R.  See: http://www.econometricsbysimulation.com/2012/09/program-your-own-bootstrap-command-r.html

* The bootstrap resampling technique is one of the coolest techniques that have developed in econometrics/statistics in the last 50 years (that and maximum likelihood).  Bootstrap releases the user from strong assumptions about the underlying distribution of errors.

* As far as I understand the bootstrapping assumptions follow, so long as the data you draw are reasonable draws from the underlying distribution then resampling from that sample will give you a reasonable estimate of the populations underlying parameters. Note, this method is obviously related to Baysian methods.

* B Efron formalized the technique and termonology of Bootstrap in a paper in 1979.

* First let us define a program that will do our bootstrap
cap program drop bootme
program bootme

********
* First we will set up the data that we will use as a resource to draw from

* Preserve the current data
$nq preserve syntax anything [, Reps(integer 100) Command(string) Keep(string) Noisily] * Tell stata to display what it is doing or the default is not to if "noisily'" == "" local nq quietly if "noisily'" != "" local nq * Generate a variable that counts from 0 to _N-1 nq' gen bootmatcher = _n-1 * This is the number of observations to draw from. nq' gl boot_N=_N * We will save the data file to draw from first tempfile drawfile nq' save drawfile' , replace * Draw file set up ******** * Clear the memory nq' clear * Set the number of repetitions to that specified by the user nq' set obs reps' * Keep tells the bootstrap rountine what values from the estiamtion to keep foreach v in keep' { * Creates emptly variables to hold the coefficients of interest local vtemp=strtoname("vv'") nq' gen vtemp'=. } * Create a data file that will list the results of the bootstrap tempfile returnlist nq' save returnlist' , replace * Display the bootstraps progress di as text _newline "Bootstrap replications ({result:reps'})" di "----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5" * Loop through the number of repetitions in the bootstrap forv i=1/reps' { * First clear the data clear * Set the number of observations equal to the number to draw nq' set obs$boot_N
* Generate a variable that represents random draws.
nq' gen bootmatcher = int(runiform()*\$boot_N)
* Sort the draws from 1 to _N to assist in the merge command
nq' sort bootmatcher
* Merge in the drawfile
nq' merge m:1 bootmatcher using drawfile', keep(3)

* Now run the user specified command
nq' command'

* Now load up the return list file
nq' use returnlist', clear

* For each value of the list of keep values loop through
foreach v in keep' {
* We need to make sure the variables are readable as names
local vtemp=strtoname("vv'")
* Replace each observation with the estimated value
nq' replace vtemp'=v' if _n==i'
}
nq' save returnlist', replace

* If the observation is divisible by 50 insert a space in the progress bar.
if i'/50 == int(i'/50) di ".  " i'
else di _continue "."
}
* This summarizes the results
sum
* Restores the data to the previous state
nq' restore
* Runs the command one last time for display
`command'
end
* End of the Bootme command

clear
set obs 1000

gen x = rnormal()
gen u = rnormal()*10
gen y = x + u

* This is Stata's default bootstrap command
bs _b[x] _se[x] e(F) e(ll), reps(200): reg y x

* Our bootstrap seems to work very similarly though it seems to me that our standard errors seem to be a little wider in general.  I am not sure why.
bootme 1, r(200) k(_b[x] _se[x] e(F) e(ll)) c(reg y x)