* Use bootstrapped draws for simulating results - expand method
* This presents an alternative method to resampling results from the last post
webuse mheart0, clear
* First let's generate an observation index
gen obs_id = _n
* Suppose you want to test how well an estimator works on data sampled from this data set.
* There are many ways to do this.
* One way is to resample roughly 1,000 observations from the data, generate a dependent variable, and test how well your estimator performs.
sum
* First we want to mark the draws, but the summary shows that bmi has missing values.
* For our purposes we could either drop the observations for which bmi is missing or linearly impute bmi.
* Let's just impute bmi:
reg bmi age smokes attack female hsgrad marstatus alcohol hightar
predict bmi_fill
replace bmi = bmi_fill if bmi==.
sum bmi
drop bmi_fill
di "Now what we want is approximately 1,000 results (we do not need to be exact)"
di "We have " _N " observations"
local obs_add = 1000/_N
di "So we need approximately " round(`obs_add') " copies of each observation"
* One way to do this is to give each observation a random number of copies (possibly zero, which drops it).
* The uniform distribution is a natural choice. However, runiform() has expected value 1/2, so we multiply by 2 so that the expected number of copies equals obs_add.
gen add = round(`obs_add'*runiform()*2)
* Note: any non-negative distribution with a known expected value could be used instead.
* For example: Poisson. With the same mean it is less likely to produce zeros (dropped observations), so the initial observations are represented more evenly.
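* A minimal sketch of the Poisson alternative (an illustration, not part of the original post).
* rpoisson(m) draws counts with mean m; the variable name add_pois is only illustrative.
* It is dropped again right away so the rest of this file is unaffected.
gen add_pois = rpoisson(`obs_add')
tab add_pois
drop add_pois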
tab add
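* As a sanity check (an addition to the original post), the mean of add should be close to obs_add.
* The two will not match exactly because add is random and rounded.
sum add
di "Target copies per observation: " `obs_add' ", realized mean: " r(mean)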
* First let's drop any observations with add equal to 0 (expand would otherwise keep each of them as a single copy)
drop if add == 0
* Now the command expand is very useful because it lets us easily duplicate observations: each observation is replaced by add copies of itself
expand add
sum
tab obs_id
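* A quick check (not in the original post) that the expanded data set has roughly 1,000 rows.
* The exact count changes from run to run because add is random.
di "After expanding we have " _N " observations"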