* economics in general because they provide a

* mechanism to get around the issue that some

* important predictive variables of interest

* in economics might also be correlated with

* other unobservable factors that might affect

* the outcome.

* A typical example is years of education and wages:

* wage = alpha + beta*years_of_education + u

* The problem is that the number of years of

* education might be related to motivation which

* might also be related to u. So

* corr(years_of_ed, u) != 0 and the

* normal OLS estimators will be biased

* because:

* Y = XB + u

* X'Y = X'XB + X'u

* (X'X)^-1 X'Y = B + (X'X)^-1 X'u

* If we assume that X and u are uncorrelated and

* the Xs are fixed then an unbiased estimate

* of B is B_hat:

* E((X'X)^-1 X'Y) = B + E((X'X)^-1 X'u)

* E((X'X)^-1 X'Y) = B

* B_hat = (X'X)^-1 X'Y

* But if corr(X,u)!=0 then:

* B_hat = (X'X)^-1 X'Y

* B_hat = (X'X)^-1 X'(XB + u)

* B_hat = (X'X)^-1 X'XB + (X'X)^-1 X'u

* B_hat = B + (X'X)^-1 X'u

* E(B_hat) = B + E((X'X)^-1 X'u)

* to show bias: Assume fixed Xs

* E(B_hat) = B + (X'X)^-1 E(X'u)

* E(X'u)!=0 if corr(X,u)!=0

* E(B_hat) != B

* to show inconsistency:

* plim(B_hat) = B + plim((X'X/n)^-1) plim(X'u/n)

* plim(X'u/n)!=0 if corr(X,u)!=0

* plim(B_hat) != B

* So in order to get around this economists

* often use the instrumental variables estimator.

* Imagine that there is some instrument Z

* which is uncorrelated with u and

* is correlated with X. Z must also be

* uncorrelated with B in the standard case.

* Y = XB + u

* Z'Y = Z'XB + Z'u

* (Z'X)^-1 Z'Y = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u

* (Z'X)^-1 Z'Y = B + (Z'X)^-1 Z'u

* Since we have assumed corr(Z,u) = 0

* E(Z'u)=0 at the population level.

* Thus we get the IV estimator:

* (Z'X)^-1 Z'Y = B_hat

* B_hat = (Z'X)^-1 Z'Y

* To show consistency:

* B_hat = (Z'X)^-1 Z'Y

* B_hat = (Z'X)^-1 Z'(XB + u)

* B_hat = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u

* B_hat = B + (Z'X)^-1 Z'u

* plim(B_hat) = B + plim((Z'X)^-1 Z'u)

* plim(B_hat) = B + plim((Z'X/n)^-1) plim(Z'u/n)

* plim(B_hat) = B

* Note: the IV estimator is not unbiased but

* it is consistent!

* However, if corr(Z,u) != 0, even if it is small,

* and the instrument is "weak", that is,

* |Z'X| is small, then we have a weak instrument

* problem. This is because (Z'X)^-1 is 1 over

* a small number, which means that

* even a slight correlation between Z and u can cause

* a large bias in the IV estimator.

* Let's see how that works!

clear

set obs 1000

* First the standard IV estimator

gen Z1 = rnormal()

* However, Z is not correlated with u yet.

gen u1 = rnormal()

* So Z is weak and remember u1 has to be

* correlated with X in order to justify

* use of the IV estimator:

gen X1 = rnormal()*10 + Z1 + u1*2

reg X1 Z1
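
* A common way to gauge instrument strength is the

* first-stage F statistic on the instrument. The lines

* below are a minimal sketch using the variables

* generated above. The "F above roughly 10" benchmark

* is a widely quoted rule of thumb, not a formal test.

quietly regress X1 Z1

test Z1

display "First-stage F for Z1: " r(F)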

gen Y1 = 3*X1 + 5*u1

ivregress 2sls Y1 (X1 = Z1)
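
* With one regressor, one instrument and a constant, the

* IV slope (Z'X)^-1 Z'Y reduces to cov(Z,Y)/cov(Z,X).

* As a rough check, this sketch computes that ratio by

* hand. It should match the coefficient on X1 reported

* above. The scalar names (cov_zy, cov_zx, b_iv_hand)

* are illustrative only.

quietly correlate Z1 Y1, covariance

scalar cov_zy = r(cov_12)

quietly correlate Z1 X1, covariance

scalar cov_zx = r(cov_12)

scalar b_iv_hand = cov_zy/cov_zx

display "IV slope by hand: " b_iv_hand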

* We can see that Z is a pretty effective

* instrument at estimating B even if

* the correlation between Z and X is small.

******************************************

* Now let's see what happens when there

* is a small correlation between Z and u

* Imagine there is some additional

* explanatory variable V which is unobserved

* but partially explains the instrument

* as well.

gen V = rnormal()

* Now the instrument is partly driven by V

gen Z2 = rnormal() - V

* and so is the error, so Z is somewhat correlated with u.

gen u2 = 10*rnormal() + V/4

* So Z is weak and remember u2 has to be

* correlated with X in order to justify

* use of the IV estimator:

gen X2 = rnormal()*10 + Z2/4 + .1*u2

reg X2 Z2

gen Y2 = 3*X2 + 5*u2

ivregress 2sls Y2 (X2 = Z2)

* In this case we get a better estimate of

* B by using the known biased OLS estimator

* than by using the instrumental variables

* estimator.

reg Y2 X2

* Even though the correlation between the

* instrument and the error is small

pwcorr Z2 u2, sig
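
* The large-sample values behind these results can be

* worked out from the generating equations above, assuming

* independent standard normal draws. This is only a sketch,

* and the scalar names (pop_*, plim_*) are illustrative.

* With a constant and a single regressor:

* plim(B_IV)  = 3 + 5*cov(Z2,u2)/cov(Z2,X2)

* plim(B_OLS) = 3 + 5*cov(X2,u2)/var(X2)

scalar pop_var_z  = 2

scalar pop_var_u  = 100 + 1/16

scalar pop_cov_zu = -1/4

scalar pop_cov_zx = pop_var_z/4 + .1*pop_cov_zu

scalar pop_cov_xu = pop_cov_zu/4 + .1*pop_var_u

scalar pop_var_x  = 100 + pop_var_z/16 + .01*pop_var_u + 2*(1/4)*.1*pop_cov_zu

scalar pop_corr_zu = pop_cov_zu/sqrt(pop_var_z*pop_var_u)

scalar plim_iv  = 3 + 5*pop_cov_zu/pop_cov_zx

scalar plim_ols = 3 + 5*pop_cov_xu/pop_var_x

display "corr(Z2,u2)    = " pop_corr_zu

display "plim IV slope  = " plim_iv

display "plim OLS slope = " plim_ols

* Under these assumptions the IV limit is about 0.37 and

* the OLS limit is about 3.49, against a true B of 3.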

******************************************

* This is primarily a function of the weakness

* of Z at explaining X. See what happens

* if Z has more explanatory power

gen X3 = rnormal()*10 + 10*Z2 + .1*u2

reg X3 Z2

gen Y3 = 3*X3 + 5*u2

ivregress 2sls Y3 (X3 = Z2)

* Now, even though Z is still slightly correlated

* with the error u, compare with OLS on the same data:

reg Y3 X3

* Even though the correlation between

* Z and u is the same as previously,

* the strength of the instrument

* in explaining X wins out and gives us a

* better estimator than OLS.
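
* A single draw can be misleading, so here is a small

* Monte Carlo sketch of the weak-instrument case.

* The program name (weakiv_sim), the seed and the number

* of replications are arbitrary choices for illustration.

* Note that simulate replaces the data in memory.

capture program drop weakiv_sim

program define weakiv_sim, rclass

    drop _all

    set obs 1000

    gen V = rnormal()

    gen Z = rnormal() - V

    gen u = 10*rnormal() + V/4

    gen X = rnormal()*10 + Z/4 + .1*u

    gen Y = 3*X + 5*u

    regress Y X

    return scalar b_ols = _b[X]

    ivregress 2sls Y (X = Z)

    return scalar b_iv = _b[X]

end

set seed 12345

simulate b_ols = r(b_ols) b_iv = r(b_iv), reps(200) nodots: weakiv_sim

* The just-identified IV estimator has very heavy tails,

* so medians are a safer comparison than means here.

summarize b_ols b_iv, detail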

Is there any test in Stata for whether a weak instrument can still explain things? Something like the Chernozhukov-Hansen test? Thanks.
