* One of the key assumptions of the IV estimator

* is the rank condition: rank E(Z'X) = K, where K is

* the number of elements of B. The rank condition is

* difficult to test in general. However, following

* Wooldridge (2010), it is easy to test when there is

* only one endogenous variable, xk. Write the

* reduced form for xk as:

* xk = d1 + d2*x2 + ... + dk-1*xk-1 + theta1*z1 + r

* For the rank condition to hold we need theta1 != 0.

*******************************************************

* Let's see this test in action:

clear

set obs 1000

set seed 100

gen x1 = rnormal()

gen x2 = rnormal()

gen x3 = rnormal()

* Generate the unobserved factor V, which induces

* a correlation between xk and u.

gen V = rnormal()

* generate the instrumental variable

gen z = rnormal()

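* Generate xk, the suspected endogenous variable.

gen xk = rnormal() + V + z

* Generate the error u. It shares V with xk, so xk is endogenous.

gen u = rnormal()*2 - 2*V

* Generate y. The true coefficient on xk is 1.

gen y = 1*x1 + 2*x2 + 4*x3 + xk + u

* Simulation End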

reg y x1 x2 x3 xk

* The OLS coefficient on xk is biased (the true value is 1)

* because xk is correlated with u. To test the rank

* condition, we estimate the first-stage regression:

reg xk x1 x2 x3 z
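
* A minimal sketch (not in the original post): the same

* check written as an explicit F test of theta1 = 0.

* With a single instrument this F statistic equals the

* squared t statistic on z in the regression above.

test z

display "First-stage F = " r(F) ", p-value = " r(p)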

* z appears to be strongly significant. Therefore we

* conclude that the rank condition is satisfied

* and that we can safely use the IV estimator:

ivreg y x1 x2 x3 (xk=z)

* So the IV estimator seems to be working pretty well.
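
* A hedged sketch (not in the original post): the newer

* ivregress command gives the same 2SLS point estimates,

* and estat firststage then reports first-stage

* statistics for the excluded instrument z.

ivregress 2sls y x1 x2 x3 (xk = z)

estat firststage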

* When might we expect the rank condition to fail?

*******************************************************

* When the instrumental variable is weak (it explains

* little of the variation in xk):

clear

* Set the number of observations to generate to 1000.

set obs 1000

set seed 101

* generate the instrumental variable

gen z = rnormal()

gen x1 = rnormal()

gen x2 = rnormal()

gen x3 = rnormal()

* Generate the unobserved factor V, which induces

* a correlation between xk and u.

gen V = rnormal()

* Generate xk, the suspected endogenous variable.

gen xk = 2*rnormal() + V + z*.1

reg xk x1 x2 x3 z

* We cannot detect a statistically significant difference

* between theta1 and 0 (though we know one exists).

* Note that failing to reject theta1 = 0 in the sample

* does not mean that the underlying rank condition

* fails. In this case theta1 = .1 != 0, so the

* rank condition holds in the population.

* To see this, try increasing the number

* of observations.
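
* A minimal sketch of that experiment (not in the original

* post): redraw the same first-stage design with 100,000

* observations. The sample size and the scalar name p_big

* are illustrative choices. preserve/restore keeps the

* 1,000-observation data for the steps that follow, but

* the extra draws do advance the random-number stream.

preserve

clear

set obs 100000

gen z = rnormal()

gen x1 = rnormal()

gen x2 = rnormal()

gen x3 = rnormal()

gen V = rnormal()

gen xk = 2*rnormal() + V + z*.1

reg xk x1 x2 x3 z

test z

scalar p_big = r(p)

restore

* With this many observations the coefficient on z should

* be clearly distinguishable from 0.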

* Generate the endogeneity between xk and u

gen u = rnormal()*2 - 2*V

gen y = 1*x1 + 2*x2 + 4*x3 + xk + u

ivreg y x1 x2 x3 (xk=z)

* The instrument is simply too weak for the IV

* estimator to give good estimates.

reg y x*

* OLS is still biased here (the coefficient on xk converges

* to about 0.6 rather than 1), but its estimate is far more

* precise than the IV estimate when the instrument is weak.
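
* A hedged Monte Carlo sketch (not in the original post)

* comparing the two estimators under the weak-instrument

* design above. The program name weakiv, the 200

* replications, and the seed are illustrative assumptions.

capture program drop weakiv

program define weakiv, rclass

    clear

    set obs 1000

    gen z = rnormal()

    gen x1 = rnormal()

    gen x2 = rnormal()

    gen x3 = rnormal()

    gen V = rnormal()

    gen xk = 2*rnormal() + V + z*.1

    gen u = rnormal()*2 - 2*V

    gen y = 1*x1 + 2*x2 + 4*x3 + xk + u

    quietly reg y x1 x2 x3 xk

    return scalar b_ols = _b[xk]

    quietly ivregress 2sls y x1 x2 x3 (xk = z)

    return scalar b_iv = _b[xk]

end

simulate b_ols=r(b_ols) b_iv=r(b_iv), reps(200) seed(102): weakiv

* Compare the center and spread of each estimator around

* the true value of 1.

summarize b_ols b_iv, detail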
