## Monday, February 4, 2013

### Potential IV Challenges even with RCTs

* You design an RCT with a single experimental binary treatment.

* But you think your instrument has the power to influence multiple endogenous response variables.

* We know we cannot use a single instrument to instrument for more than one variable jointly.

* Does it still work as a valid instrument if it affects other variables that can in turn affect our outcome?

clear
set obs 1000

gen z = rbinomial(1,.5)

gen end1 = rnormal()
gen end2 = rnormal()
* end1 and end2 are the endogenous components of x1 and x2, which will bias our results if we are not careful.

gen x1 = rnormal()+z+end1
gen x2 = rnormal()+z+end2

gen u = rnormal()*3

gen y = x1 + x2 + end1 + end2 + u

reg y x1 x2
* Clearly the OLS estimator is biased.

ivreg y (x1=z)
ivreg y (x2=z)
* Trying to use the IV individually only makes things worse.

ivreg y x1 (x2=z)
* Trying to control for x1 does not help.
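
* A short derivation of why the single-variable IV fails here. This is only a sketch based on the simulated model above; the symbol $\hat\beta_{IV}$ for the coefficient from ivreg y (x1=z) is notation introduced for this illustration. With one instrument and one instrumented regressor, the IV estimator converges to a ratio of covariances, and z moves y through both x1 and x2, each with Cov(x, z) = Var(z) = 0.25:

$$
\operatorname{plim}\,\hat\beta_{IV}
= \frac{\operatorname{Cov}(y,z)}{\operatorname{Cov}(x_1,z)}
= \frac{\operatorname{Cov}(x_1,z)+\operatorname{Cov}(x_2,z)}{\operatorname{Cov}(x_1,z)}
= \frac{0.25+0.25}{0.25}
= 2
$$

* Under these assumptions the estimate should settle near 2 rather than the true coefficient of 1, since the effect of z that runs through x2 is attributed to x1. By symmetry the same holds for ivreg y (x2=z).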

* So what do we do?

* If we control how the instrument is constructed, we should think things through carefully: if we are interested in the effect of x1 on y, the instrument must affect y only through x1, with no additional effect on x2.

* Unfortunately, this is not always feasible.

* Another consideration we should take into account is that responses to the instrument may vary across observations.

* For instance:

clear
set obs 1000

gen z = rbinomial(1,.5)

gen end1 = rnormal()
* end1 is the endogenous component of x1, which will bias our results if we are not careful.

gen g1 = rnormal()
gen grc = g1+1
label var grc "Random Coefficient on the Instrument"

gen x1 = rnormal()+z*grc+(end1)

gen u = rnormal()*3

gen brc = g1+1
label var brc "Random Coefficient on the Endogenous Variable"

gen y = x1*brc + end1 + u

reg y x1
* OLS is still biased.

ivreg y (x1=z)
* Unfortunately, so is IV.

* This is because, though our instrument is uncorrelated with our errors, the response to the instrument varies across observations, and that response may be correlated with the varying effect of the variable of interest on the outcome.

* Thus, though corr(u,z)=0 and corr(x1,z)!=0, we can still have problems.
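
* The size of the problem in this simulation can be worked out directly. This is a sketch under the assumptions of the code above, writing $g$ for grc and $b$ for brc (both equal to g1+1 here) and $\hat\beta_{IV}$ for the coefficient from ivreg y (x1=z); this shorthand is introduced only for the illustration. Because z is independent of everything else, only the terms that multiply z survive in the covariances:

$$
\operatorname{plim}\,\hat\beta_{IV}
= \frac{\operatorname{Cov}(y,z)}{\operatorname{Cov}(x_1,z)}
= \frac{E[g\,b]\,\operatorname{Var}(z)}{E[g]\,\operatorname{Var}(z)}
= \frac{E[(g_1+1)^2]}{E[g_1+1]}
= \frac{1+1}{1}
= 2
$$

* The average effect of x1 on y is $E[b] = 1$, so IV weights each observation's effect by how strongly that observation responds to the instrument. If $g$ and $b$ were uncorrelated, the ratio would reduce to $E[b]$ and the problem would disappear in this setup.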

* This particular problem is actually something that I am working on with Jeffrey Wooldridge and Yali Wang.