Sunday, July 15, 2012

Use bootstrapped draws for simulating results - expand method

* Use bootstrapped draws for simulating results - expand method

* This presents an alternative to the resampling method from the last post

webuse mheart0, clear

* First let's generate an observation index
gen obs_id = _n

* Suppose you want to test how well an estimator will work on data sampled from that data set.

* There are obviously many ways to do this.

* One way is to resample 1,000 draws from the data, generate a dependent variable, and then test how well your estimator works.

sum

* First we want to mark the draws, but the summary shows that bmi has some missing values.

* For our purposes we could either drop the observations for which bmi is missing or linearly impute bmi.

* Let's just impute bmi:

reg  bmi age smokes attack female hsgrad marstatus alcohol hightar

predict bmi_fill

replace bmi = bmi_fill if bmi==.

sum bmi
drop bmi_fill
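
* As a quick sanity check (a sketch, not part of the original method), we can confirm that the imputation left no gaps.
* This assumes none of the regressors above are missing, since predict returns missing for any observation with a missing regressor.

```stata
* Count observations where bmi is still missing; 0 means every gap was filled
count if missing(bmi)
di "bmi is still missing in " r(N) " observations"
```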

di "Now what we want is approximately 1,000 results (we do not need to be exact)"

di "We have " _N " observations"

* Target number of draws divided by the current number of observations
local obs_add = 1000/_N

di "So we need to add approximately " round(`obs_add') " observations per observation"

* One way to do this would be to add (or subtract) randomly more duplicate observations.

* The uniform distribution is a natural choice.  However, its expected value is 1/2, so we need to multiply by 2 to ensure that we get the right number of observations on average.
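
* To see why the factor of 2 matters, here is a small simulation sketch.
* The target of 6.5 copies per observation, the seed, and the variable name draws are illustrative assumptions.
* preserve/restore keeps the data set in memory untouched.

```stata
preserve
clear
set seed 12345
set obs 10000
local target = 6.5
* runiform() has mean 1/2, so 2*target*runiform() has mean target
gen draws = 2*`target'*runiform()
* The reported mean should sit close to the target
sum draws
restore
```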

* Note: alternative distributions might be any non-negative distribution for which you can specify the expected value.
* For example: Poisson.  This distribution is less likely to drop observations and gives a more proportional representation of the initial observations.
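
* The claim about dropped observations can be checked with a rough sketch.
* The target of 6.5, the seed, the variable names, and the rounding to whole copies are illustrative assumptions.
* rpoisson() is Stata's built-in Poisson random-number function.

```stata
preserve
clear
set seed 12345
set obs 10000
local target = 6.5
* Copies per observation under each scheme, both with expected value close to target
gen n_unif = round(2*`target'*runiform())
gen n_pois = rpoisson(`target')
* An observation would be dropped when it is assigned zero copies
count if n_unif == 0
count if n_pois == 0
* Compare the spread of the two schemes around the same mean
sum n_unif n_pois
restore
```

* The Poisson draws cluster more tightly around the target, which is what gives the initial observations a more even representation.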

* First let's drop any observations that are slated to be dropped