* The following simulation tests how well a two-stage quantile regression works. The example is somewhat contrived, but the methods should still be sound.
* In the first stage we will assume the coefficient on the instrument is constant.
* In the second stage we will assume that the coefficient on the endogenous variable w varies with the quantile of y.
* That is, we want to estimate quantile(Y|w) = w*B + u, where med(u|z) = 0.
* First stage: w = gamma0 + z*gamma1 + v
* E(w|z) = gamma0 + z*gamma1
* w_hat = gamma0_hat + z*gamma1_hat
* Second stage (roughly): quantile(Y|w_hat) = w_hat*B + u, with med(u|w_hat) = 0,
* whereas med(u|w) != 0, which is why a regression on w itself is biased.
* The properties of quantile regression are difficult to derive analytically because the estimator minimizes a non-smooth (absolute deviation) objective.
* However, we can test the properties of 2 stage quantile regression through simulation!
* Imagine we would like to estimate how good weapons are at killing zombies.
* However, you are afraid that people who own weapons might also be more militaristic and, in general, more effective at killing zombies.
* Therefore, you would like to instrument for the likelihood of owning weapons.
* You would like to know three things.
* 1. For the zombie killer at the 25th percentile, how much does owning weapons improve their ability to kill zombies?
* 2. For the median (typical) person, how much does owning weapons improve zombie killing ability?
* 3. For the zombie killer at the 75th percentile, how much does owning weapons improve their ability to kill zombies?
* Let us first generate the data
set seed 101
clear
local num_obs = 10000
set obs `num_obs'
gen fitness = runiform()
gen militarism = runiform()
* Your instrument is that some people live in areas that are more weapon friendly prior to the zombie outbreak.
gen weapon_ease = runiform()
label var weapon_ease "The ease by which people can purchase weapons in the area"
* Assume the likelihood of people being militaristic is unrelated to the area they live in (unlikely).
gen weapons = weapon_ease + militarism
gen error = 5*rnormal()
* Let's get a general estimate of the effectiveness of weapons
gen weapon_coef = 1
gen zombie_kills = fitness + militarism + weapon_coef*weapons + rnormal()
* Iterate so that each person's weapon coefficient depends on their rank in zombie_kills
forv i=1(1)10 {
    sort zombie_kills
    replace weapon_coef=4*(_n/`num_obs')
    replace zombie_kills = 5*fitness + 7*militarism + weapon_coef*weapons + error + 15
    * Note partial kills are possible because assists do not count as full kills.
}
* By construction, the true coefficient on weapons is 1 at the 25th percentile, 2 at the median, and 3 at the 75th percentile.
scatter weapon_coef zombie_kills, sort
sum weapon_coef if _n==`num_obs'*1/4
sum weapon_coef if _n==`num_obs'*2/4
sum weapon_coef if _n==`num_obs'*3/4
* Let us see how well we can recover the coefficients:
* First: militarism is unobservable
drop militarism
* Let us first try the naive regression
qreg zombie_kills fitness weapons, quantile(.25)
qreg zombie_kills fitness weapons, quantile(.50)
qreg zombie_kills fitness weapons, quantile(.75)
* We can see that at all levels weapons appear far more effective than they actually are.
* Let us try 2SQReg
reg weapons weapon_ease
* Looks like a pretty good estimate
predict weapons_hat
predict uhat, resid
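* A minimal sketch, not part of the original walkthrough: one common way to check
* instrument relevance is the first-stage F statistic on weapon_ease.
* -test- uses the stored results from the -reg- above; -predict- leaves them intact.
test weapon_ease
display "First-stage F statistic on weapon_ease: " r(F)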
qreg zombie_kills fitness weapons_hat, quantile(.25)
qreg zombie_kills fitness weapons_hat, quantile(.50)
qreg zombie_kills fitness weapons_hat, quantile(.75)
* It appears that 2SQreg, while not perfect, is much better than the naive qreg.
* Let us try a control function formulation
qreg zombie_kills fitness weapons uhat, quantile(.25)
qreg zombie_kills fitness weapons uhat, quantile(.50)
qreg zombie_kills fitness weapons uhat, quantile(.75)
* We can see that the 2SQreg and control function estimates are close to each other, and both appear to be effective methods.
* Let us do a Monte Carlo simulation of the whole thing again:
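* A hedged sketch, not part of the original walkthrough: -sqreg- fits several
* quantiles at once with bootstrapped standard errors, so one can test whether
* the weapons coefficient differs between quantiles. reps(20) is an arbitrary
* choice for speed, and the bootstrap treats uhat as given, so it ignores
* first-stage sampling error.
sqreg zombie_kills fitness weapons uhat, quantiles(.25 .5 .75) reps(20)
test [q25]weapons = [q75]weapons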
cap program drop s2qreg
program define s2qreg, rclass
clear
local num_obs = 10000
set obs `num_obs'
gen fitness = runiform()
gen militarism = runiform()
* Your instrument is that some people live in areas that are more weapon friendly prior to the zombie outbreak.
gen weapon_ease = runiform()
label var weapon_ease "The ease by which people can purchase weapons in the area"
* Assume the likelihood of people being militaristic is unrelated to the area they live in (unlikely).
gen weapons = weapon_ease + militarism
gen error = 5*rnormal()
* Let's get a general estimate of the effectiveness of weapons
gen weapon_coef = 1
gen zombie_kills = fitness + militarism + weapon_coef*weapons + rnormal()
forv i=1(1)10 {
    sort zombie_kills
    replace weapon_coef=4*(_n/`num_obs')
    replace zombie_kills = 5*fitness + 7*militarism + weapon_coef*weapons + error + 15
}
qreg zombie_kills fitness weapons, quantile(.25)
return scalar qreg25=_b[weapons]
qreg zombie_kills fitness weapons, quantile(.50)
return scalar qreg5=_b[weapons]
qreg zombie_kills fitness weapons, quantile(.75)
return scalar qreg75=_b[weapons]
reg weapons weapon_ease
predict weapons_hat
predict uhat, resid
qreg zombie_kills fitness weapons_hat, quantile(.25)
* The second-stage regressor is weapons_hat, so its coefficient is the one to return
return scalar s2qreg25=_b[weapons_hat]
qreg zombie_kills fitness weapons_hat, quantile(.50)
return scalar s2qreg5=_b[weapons_hat]
qreg zombie_kills fitness weapons_hat, quantile(.75)
return scalar s2qreg75=_b[weapons_hat]
qreg zombie_kills fitness weapons uhat, quantile(.25)
return scalar cfqreg25=_b[weapons]
qreg zombie_kills fitness weapons uhat, quantile(.50)
return scalar cfqreg5=_b[weapons]
qreg zombie_kills fitness weapons uhat, quantile(.75)
return scalar cfqreg75=_b[weapons]
end
s2qreg
return list
simulate qreg25=r(qreg25) s2qreg25=r(s2qreg25) cfqreg25=r(cfqreg25) ///
qreg5=r(qreg5) s2qreg5=r(s2qreg5) cfqreg5=r(cfqreg5) ///
qreg75=r(qreg75) s2qreg75=r(s2qreg75) cfqreg75=r(cfqreg75), ///
reps(500): s2qreg
sum
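* A rough sketch, not part of the original post: compare each estimator's average
* estimate with the true coefficients stated earlier (1, 2 and 3 at the 25th,
* 50th and 75th percentiles). The variable names are those created by -simulate-
* above; the truth* locals are introduced here for illustration only.
local truth25 = 1
local truth5  = 2
local truth75 = 3
foreach q in 25 5 75 {
    foreach est in qreg s2qreg cfqreg {
        quietly summarize `est'`q'
        display "`est'`q': mean = " %6.3f r(mean) "  bias = " %6.3f (r(mean) - `truth`q'')
    }
}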
How do you obtain ivqreg? The findit ivqreg command does not work.
This post shows several methods of constructing an ivqreg command by hand. I think if you read through it, it might make sense.