Tuesday, May 15, 2012

The Weak Instrument Problem

* Instrumental variables are extremely useful in
* economics because they provide a way around a
* common problem: an explanatory variable of
* interest may also be correlated with
* unobservable factors that affect
* the outcome.

* A typical example is years of education and wages:
* wage = alpha + beta*years_of_education + u

* The problem is that years of education might
* be related to motivation, which is unobserved
* and therefore ends up in u.  So
* corr(years_of_ed, u) != 0 and the
* normal OLS estimator will be biased.
* To see why, start from the model:

* Y = XB + u
* X'Y = X'XB + X'u
* (X'X)^-1 X'Y = B + (X'X)^-1 X'u

* If we assume that X and u are uncorrelated and
* the Xs are fixed then B_hat is an unbiased
* estimate of B:

* E((X'X)^-1 X'Y) = B + E((X'X)^-1 X'u)
* E((X'X)^-1 X'Y) = B
* B_hat = (X'X)^-1 X'Y

* But if corr(X,u)!=0 then:
* B_hat = (X'X)^-1 X'Y
* B_hat = (X'X)^-1 X'(XB + u)
* B_hat = (X'X)^-1 X'XB + (X'X)^-1 X'u
* B_hat = B + (X'X)^-1 X'u
* E(B_hat) = B + E((X'X)^-1 X'u)

* to show bias: treat (X'X)^-1 as fixed
* (a heuristic, since X is random here)
* E(B_hat) = B + (X'X)^-1 E(X'u)
* E(X'u)!=0 if corr(X,u)!=0
* E(B_hat) != B

* to show inconsistency (n = number of obs):
* plim(B_hat) = B + plim((X'X/n)^-1) plim(X'u/n)
* plim(X'u/n)!=0 if corr(X,u)!=0
* plim(B_hat) != B

* So in order to get around this economists
* often use the instrumental variables estimator.

* Imagine that there is some instrument Z
* which is uncorrelated with u and
* is correlated with X. Z must also be
* uncorrelated with B; in the standard case
* B is a constant, so this holds trivially.

* Y = XB + u
* Z'Y = Z'XB + Z'u
* (Z'X)^-1 Z'Y = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u
* (Z'X)^-1 Z'Y = B + (Z'X)^-1 Z'u

* Since we have assumed corr(Z,u) = 0,
* E(Z'u) = 0 at the population level.

* Thus we get the IV estimator:
* (Z'X)^-1 Z'Y = B_hat
* B_hat = (Z'X)^-1 Z'Y

* To show consistency:
* B_hat = (Z'X)^-1 Z'Y
* B_hat = (Z'X)^-1 Z'(XB + u)
* B_hat = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u
* B_hat =  B + (Z'X)^-1 Z'u

* plim(B_hat) =  B + plim((Z'X)^-1 Z'u)
* plim(B_hat) =  B + plim((Z'X/n)^-1) plim(Z'u/n)
* plim(B_hat) =  B
* (n = number of obs; plim(Z'u/n) = 0 by assumption)

* Note: the IV estimator is not unbiased but
* it is consistent!

* However, if corr(Z,u) != 0, even if it is small,
* and the instrument is "weak", that is,
* |Z'X| is small, then we have a weak instrument
* problem.  This is because (Z'X)^-1 is 1 over
* a small number, which means that
* even a slight correlation between Z and u can cause
* a large bias in the IV estimator.
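
* A worked version of the argument above for the
* single-instrument case with a constant. This is
* only a sketch: the numbers are hypothetical and
* chosen purely for illustration. Dividing Z'u and
* Z'X by n gives
*   plim(B_hat) - B = cov(Z,u) / cov(Z,X)
* so the same small cov(Z,u) does far more damage
* when cov(Z,X) is small:
scalar ex_cov_zu = -0.02
scalar ex_bias_weak   = ex_cov_zu / 0.05
scalar ex_bias_strong = ex_cov_zu / 2
display "bias with cov(Z,X) = 0.05: " ex_bias_weak
display "bias with cov(Z,X) = 2:    " ex_bias_strong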

* Let's see how that works!

clear
* Arbitrary seed so that the results can be reproduced
set seed 12345
set obs 1000

* First generate the instrument
gen Z1 = rnormal()

* For now, Z is not correlated with u.
gen u1 = rnormal()

* Z is a weak predictor of X (X is mostly noise).
* Remember that u1 has to be
* correlated with X in order to justify
* use of the IV estimator:
gen X1 = rnormal()*10 + Z1 + u1*2

reg X1 Z1

gen Y1 = 3*X1 + 5*u1

ivregress 2sls Y1 (X1=Z1)
* We can see that Z is a pretty effective
* instrument for estimating B (the true value
* is 3) even if
* the correlation between Z and X is small.
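
* For comparison (this step is an addition, not part
* of the original run), OLS on the same data. With
* this design cov(X1,u1) = 2 and var(X1) = 105, so
* the OLS slope converges to 3 + 5*2/105 rather
* than to 3. The bias is upward, though modest here.
reg Y1 X1
display "population OLS limit: " 3 + 5*2/105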

******************************************
* Now let's see what happens when there
* is a small correlation between Z and u

* Imagine there is some additional
* explanatory variable V which is unobserved
* but partially explains the instrument
* as well.

gen V = rnormal()

* The instrument now depends on V
gen Z2 = rnormal() - V

* The error also depends on V, so Z is
* somewhat correlated with u.
gen u2 = 10*rnormal() + V/4

* Z is again weak, and remember that u2 has to be
* correlated with X in order to justify
* use of the IV estimator:
gen X2 = rnormal()*10 + Z2/4 + .1*u2

reg X2 Z2

gen Y2 = 3*X2 + 5*u2

ivregress 2sls Y2 (X2=Z2)
* In this case we will typically get a better
* estimate of B (true value 3) by using the
* known biased OLS estimator
* than by using the instrumental variables
* estimator.

reg Y2 X2

* This happens even though the correlation between the
* instrument and the error is small:
pwcorr Z2 u2, sig
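
* A quick check of where the IV estimate comes from.
* This is a sketch that relies on u2 being observable,
* which is only true in a simulation. With one
* instrument and a constant, the IV slope is
* cov(Z,Y)/cov(Z,X), and since Y2 = 3*X2 + 5*u2 this is
*   3 + 5*cov(Z2,u2)/cov(Z2,X2)
* in the sample, up to rounding. It should match the
* coefficient on X2 from the IV regression above.
quietly correlate Z2 u2, covariance
scalar s_cov_zu = r(cov_12)
quietly correlate Z2 X2, covariance
scalar s_cov_zx = r(cov_12)
display "implied IV slope: " 3 + 5*s_cov_zu/s_cov_zx

* Population counterparts for this design:
* cov(Z2,u2) = -.25 and cov(Z2,X2) = 2/4 + .1*(-.25)
display "population IV limit: " 3 + 5*(-.25)/(2/4 + .1*(-.25))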

******************************************
* This is primarily a function of the weakness
* of Z at explaining X.  See what happens
* if Z has more explanatory power

gen X3 = rnormal()*10 + 10*Z2 + .1*u2

reg X3 Z2
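
* One common diagnostic for instrument strength is the
* F statistic from each first-stage regression. This is
* a sketch, not a formal test of the IV estimates. With
* a single instrument the model F, e(F), is the F on the
* instrument. A widely used rule of thumb treats an F
* below about 10 as a sign of a weak instrument.
quietly reg X1 Z1
scalar F_1 = e(F)
quietly reg X2 Z2
scalar F_2 = e(F)
quietly reg X3 Z2
scalar F_3 = e(F)
display "first-stage F, X1 on Z1: " F_1
display "first-stage F, X2 on Z2: " F_2
display "first-stage F, X3 on Z2: " F_3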

gen Y3 = 3*X3 + 5*u2

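ivregress 2sls Y3 (X3=Z2)
* Now, even though Z is slightly correlated
* with the error u, the IV estimate should be
* close to the true value of 3.

* Compare with OLS on the same data:
reg Y3 X3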

* Even though the correlation between
* Z and u is the same as previously,
* the strength of the instrument
* in explaining X wins out and gives us an
* estimator with less bias than OLS.
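
* Everything above is based on a single draw. A rough
* Monte Carlo sketch of the same two designs follows.
* Assumptions: the program name weakiv_sim, the
* variable names inside it, the 200 replications and
* the seed are all arbitrary choices made here.
* Note that simulate REPLACES the data in memory.
* With one instrument for one regressor the IV
* estimator has no finite mean, so the comparison
* uses medians and interquartile ranges.
capture program drop weakiv_sim
program define weakiv_sim, rclass
    drop _all
    set obs 1000
    gen V = rnormal()
    gen Z = rnormal() - V
    gen u = 10*rnormal() + V/4
    * Weak first stage (as X2) and strong first stage (as X3)
    gen Xw = rnormal()*10 + Z/4 + .1*u
    gen Xs = rnormal()*10 + 10*Z + .1*u
    gen Yw = 3*Xw + 5*u
    gen Ys = 3*Xs + 5*u
    ivregress 2sls Yw (Xw = Z)
    return scalar b_weak = _b[Xw]
    ivregress 2sls Ys (Xs = Z)
    return scalar b_strong = _b[Xs]
end

simulate b_weak=r(b_weak) b_strong=r(b_strong), reps(200) nodots seed(12345): weakiv_sim
quietly summarize b_weak, detail
scalar mc_iqr_weak = r(p75) - r(p25)
display "weak:   median = " r(p50) "  IQR = " mc_iqr_weak
quietly summarize b_strong, detail
scalar mc_med_strong = r(p50)
scalar mc_iqr_strong = r(p75) - r(p25)
display "strong: median = " mc_med_strong "  IQR = " mc_iqr_strong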

1 comment:

  1. Is there any test in Stata for whether a weak instrument can still explain things? Something like the Chernozhukov-Hansen test? Thanks.
