## Wednesday, May 16, 2012

### Testing the Rank Condition of IV Estimators

* One of the key assumptions of IV estimators is the
* rank condition: rank E(Z'X) = K = # of elements of B.

* It is difficult to test the rank condition in
* general.  However, following Wooldridge (2010), it
* is easy to test the rank condition of instrumental
* variables when there is only one endogenous variable,
* xk, by estimating the reduced form:

* xk = d1 + d2*x2 + ... + d(k-1)*x(k-1) + theta1*z1 + r

* In order for the rank condition to hold, theta1 != 0.

*******************************************************
* Let's see this test in action:

clear
set obs 1000
set seed 100

gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()

* generate the unobserved factor V, which creates
* a correlation between xk and u.
gen V = rnormal()

* generate the instrumental variable
gen z = rnormal()

* generate xk the suspected endogenous variable.
gen xk = rnormal() + V + z

gen u = rnormal()*2 - 2*V

gen y = 1*x1 + 2*x2 + 4*x3 + xk + u

* Simulation End

reg y x1 x2 x3 xk
* We can see that the OLS coefficient on xk is biased
* (the true coefficient is 1).

* In order to test the rank condition of the IV estimator
* we estimate

reg xk x1 x2 x3 z
* z appears to be strongly significant, so we reject
* theta1 = 0 and conclude that the rank condition is
* satisfied.  We can therefore use the IV estimator:
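
* A small optional sketch (not part of the original
* walkthrough): the same check can be read as an F test
* on the excluded instrument.  With a single instrument
* the F statistic is just the square of the t statistic
* on z from the regression above.  A commonly cited rule
* of thumb (Staiger and Stock) treats a first-stage F
* below about 10 as a sign of a weak instrument; that
* threshold is a convention, not part of the rank test.
test z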

ivreg y x1 x2 x3 (xk=z)
* So the IV estimator seems to be working pretty well.

* When might we expect the rank condition to fail?

*******************************************************
* When the instrumental variable is weak (explains
* little of the variation in xk):

clear
* Set the number of observations to generate to 1000.
set obs 1000
set seed 101

* generate the instrumental variable
gen z = rnormal()

gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()

* generate the unobserved factor V, which creates
* a correlation between xk and u.
gen V = rnormal()

* generate xk the suspected endogenous variable.
gen xk = 2*rnormal() + V + z*.1

reg xk x1 x2 x3 z
* We cannot statistically distinguish theta1 from 0
* (though we know it is 0.1 by construction).

* Note: just because our rank condition test fails to
* reject does not mean that the underlying rank
* condition fails.  In this case theta1 = 0.1 != 0, so
* the rank condition holds in the population.
* To see this, try increasing the number
* of observations.

* Generate the endogeneity between xk and y
gen u = rnormal()*2 - 2*V

gen y = 1*x1 + 2*x2 + 4*x3 + xk + u

ivreg y x1 x2 x3 (xk=z)
* The instrument is simply too weak for the IV
* estimator to produce good estimates.
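
* A hedged aside, assuming Stata 10 or later: ivregress
* 2sls fits the same 2SLS model as the ivreg call above,
* and estat firststage then reports first-stage
* diagnostics (partial R-squared and the F statistic on
* the excluded instrument).  With this weak instrument
* the F statistic should be small.
ivregress 2sls y x1 x2 x3 (xk = z)
estat firststage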

reg y x*
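* Note that OLS is still biased here (the coefficient
* on xk should be near 0.6 rather than the true value
* of 1), but its estimates are far more precise than
* those from the weak-instrument IV estimator.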
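
*******************************************************
* A sketch of the suggestion made earlier to increase
* the number of observations.  The data generating
* process is the same as in the weak instrument example;
* the sample size of 100000 and the seed are arbitrary
* choices made for illustration.

clear
set obs 100000
set seed 102

gen z = rnormal()

gen x1 = rnormal()
gen x2 = rnormal()
gen x3 = rnormal()

gen V = rnormal()

* same weak first stage: theta1 = .1
gen xk = 2*rnormal() + V + z*.1

reg xk x1 x2 x3 z
* With this many observations the standard error on z
* shrinks enough that theta1 should now be clearly
* distinguishable from 0, even though the population
* relationship has not changed.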