* One of the key assumptions of the IV estimators
* is that the rank E(Z'X)=K=# of elements of B.
* It is difficult to test the rank condition in
* general. However, following Wooldridge (2010), it
* is easy to test the rank condition when there is
* only one endogenous variable xk. Write the reduced
* form for xk as
* xk = d1 + d2*x2 + ... + dk-1*xk-1 + theta1*z1 + r
* The rank condition holds if and only if theta1 != 0.
*******************************************************
* Let's see this test in action:
clear
set obs 1000
set seed 100
gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()
* generate the unobserved factor V which contributes
* to create a correlation between xk and u.
gen V = rnormal()
* generate the instrumental variable
gen z = rnormal()
* generate xk the suspected endogenous variable.
gen xk = rnormal() + V + z
gen u = rnormal()*2 - 2*V
gen y = 1*x1 + 2*x2 + 4*x3 + xk + u
* Simulation End
reg y x1 x2 x3 xk
* We can see that the OLS coefficient on xk is biased
* (its true value is 1).
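* A quick check of where the bias comes from. This is only
* possible in a simulation, where u is observed; the corr
* call is an addition for illustration:
corr xk u, covariance
* In the population Cov(xk,u) = -2 and Var(xk) = 3, so the
* OLS coefficient on xk converges to 1 - 2/3 = .33, not 1.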
* In order to test the rank condition of the IV estimator
* we estimate
reg xk x1 x2 x3 z
* z is strongly significant, so we reject theta1 = 0 and
* conclude that the rank condition appears to hold.
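* The same hypothesis can be checked with an F test on the
* instrument. A minimal sketch; the threshold of 10 is only
* a common rule of thumb, not part of the original test:
quietly reg xk x1 x2 x3 z
test z
* With a single instrument the F statistic is the square of
* the t statistic on z. An F well above 10 is usually taken
* as a sign that the instrument is not weak.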
* Therefore we can safely use the IV estimator:
ivregress 2sls y x1 x2 x3 (xk=z)
* So the IV estimator seems to be working pretty well.
* When might we expect the rank condition to fail?
*******************************************************
* When the instrumental variable is weak (has little
* explanatory power for xk):
clear
* Set the number of observations to generate to 1000.
set obs 1000
set seed 101
* generate the instrumental variable
gen z = rnormal()
gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()
* generate the unobserved factor V which contributes
* to create a correlation between xk and u.
gen V = rnormal()
* generate xk the suspected endogenous variable.
gen xk = 2*rnormal() + V + z*.1
reg xk x1 x2 x3 z
* We cannot detect a statistically significant difference
* between theta1 and 0 (though we know one exists).
* Note that failing to reject theta1 = 0 in the sample
* does not mean that the underlying rank condition
* fails. Here theta1 = .1 in the population, so the
* rank condition holds.
* To see this, try increasing the number of
* observations.
* Generate u, which is correlated with xk through V
gen u = rnormal()*2 - 2*V
gen y = 1*x1 + 2*x2 + 4*x3 + xk + u
ivregress 2sls y x1 x2 x3 (xk=z)
* The instrument is simply too weak for the IV
* estimator to give good estimates.
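* One way to quantify the weakness is the first-stage F
* statistic. A sketch using ivregress 2sls followed by
* estat firststage (the estat call is an addition here):
quietly ivregress 2sls y x1 x2 x3 (xk=z)
estat firststage
* A small F statistic (a common rule of thumb is below 10)
* suggests a weak instrument.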
reg y x*
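* OLS is still biased here: Cov(xk,u) = -2 and
* Var(xk) = 5.01, so the coefficient on xk converges to
* about .6 rather than 1. It is much more precise than
* the IV estimator, though, and can land closer to the
* truth when the instrument is this weak.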
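*******************************************************
* A sketch of the larger-sample check suggested above:
* the same weak-instrument design with 100,000
* observations. The sample size and seed are arbitrary
* choices made for this illustration.
clear
set obs 100000
set seed 102
gen z = rnormal()
gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()
gen V = rnormal()
gen xk = 2*rnormal() + V + z*.1
reg xk x1 x2 x3 z
* The coefficient on z should now be close to .1 and
* clearly distinguishable from 0.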