* economics in general because they provide a
* mechanism to get around the issue that some
* important predictive variables of interest
* in economics might also be correlated with
* other unobservable factors that might affect
* the outcome.
* A typical example is years of education and wages:
* wage = alpha + beta*years_of_education + u
* The problem is that the number of years of
* education might be related to motivation which
* might also be related to u. So
* corr(years_of_ed, u) != 0 and the
* normal OLS estimators will be biased
* because:
* Y = XB + u
* X'Y = X'XB + X'u
* (X'X)^-1 X'Y = B + (X'X)^-1 X'u
* If we assume that X and u are uncorrelated and
* the Xs are fixed, then an unbiased estimate
* of B is B_hat:
* E((X'X)^-1 X'Y) = B + E((X'X)^-1 X'u)
* E((X'X)^-1 X'Y) = B
* B_hat = (X'X)^-1 X'Y
* But if corr(X,u)!=0 then:
* B_hat = (X'X)^-1 X'Y
* B_hat = (X'X)^-1 X'(XB + u)
* B_hat = (X'X)^-1 X'XB + (X'X)^-1 X'u
* B_hat = B + (X'X)^-1 X'u
* E(B_hat) = B + E((X'X)^-1 X'u)
* to show bias: Assume fixed Xs
* E(B_hat) = B + (X'X)^-1 E(X'u)
* E(X'u)!=0 if corr(X,u)!=0
* so E(B_hat) != B
* to show inconsistency:
* plim(B_hat) = B + plim((X'X/n)^-1) plim(X'u/n)
* plim(X'u/n)!=0 if corr(X,u)!=0
* so plim(B_hat) != B
* So in order to get around this economists
* often use the instrumental variables estimator.
* Imagine that there is some instrument Z
* which is uncorrelated with u and
* is correlated with X. Z must also be
* uncorrelated with B in the standard case.
* Y = XB + u
* Z'Y = Z'XB + Z'u
* (Z'X)^-1 Z'Y = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u
* (Z'X)^-1 Z'Y = B + (Z'X)^-1 Z'u
* Since we have assumed corr(Z,u) = 0,
* E(Z'u)=0 at the population level.
* Thus we get the IV estimator:
* (Z'X)^-1 Z'Y = B_hat
* B_hat = (Z'X)^-1 Z'Y
* To show consistency:
* B_hat = (Z'X)^-1 Z'Y
* B_hat = (Z'X)^-1 Z'(XB + u)
* B_hat = (Z'X)^-1 Z'XB + (Z'X)^-1 Z'u
* B_hat = B + (Z'X)^-1 Z'u
* plim(B_hat) = B + plim((Z'X)^-1 Z'u)
* plim(B_hat) = B + plim((Z'X/n)^-1) plim(Z'u/n)
* plim(B_hat) = B since plim(Z'u/n) = 0
* Note: the IV estimator is not unbiased but
* it is consistent!
* However, if corr(Z,u) != 0, even if it is small,
* and the instrument is "weak", that is
* |Z'X| is small, then we have a weak instrument
* problem. This is because (Z'X)^-1 is 1 over
* a small number, which means that
* even a slight correlation between Z and u can cause
* a large bias in the IV estimator.
* Let's see how that works!
clear
set obs 1000
* First the standard IV estimator
gen Z1 = rnormal()
* However, Z is not correlated with u yet.
gen u1 = rnormal()
* Z enters X with a small coefficient relative to
* the noise, so Z is weak. Remember that u1 has to be
* correlated with X in order to justify
* use of the IV estimator:
gen X1 = rnormal()*10 + Z1 + u1*2
reg X1 Z1
gen Y1 = 3*X1 + 5*u1
ivreg Y1 (X1=Z1)
* We can see that Z is a pretty effective
* instrument at estimating B even if
* the correlation between Z and X is small.
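* A quick way to gauge instrument strength is the
* first-stage F statistic. The lines below are a
* sketch, assuming a Stata version that provides
* ivregress (the newer replacement for ivreg);
* estat firststage runs after ivregress, not ivreg.
* A common rule of thumb treats a first-stage F
* below about 10 as a sign of a weak instrument.
ivregress 2sls Y1 (X1 = Z1)
* Reports the first-stage F statistic and partial R-squared:
estat firststage
* OLS on the same data, for comparison with the IV estimate:
reg Y1 X1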
******************************************
* Now let's see what happens when there
* is a small correlation between Z and u
* Imagine there is some additional
* explanatory variable V which is unobserved
* but partially explains the instrument
* as well.
gen V = rnormal()
* The instrument now depends partly on V
gen Z2 = rnormal() - V
* u2 also depends on V, so Z is somewhat correlated with u.
gen u2 = 10*rnormal() + V/4
* Z is again weak, and remember that u2 has to be
* correlated with X in order to justify
* use of the IV estimator:
gen X2 = rnormal()*10 + Z2/4 + .1*u2
reg X2 Z2
gen Y2 = 3*X2 + 5*u2
ivreg Y2 (X2=Z2)
* In this case we get a better estimate of
* B by using the known biased OLS estimator
* than by using the instrumental variables
* estimator.
reg Y2 X2
* This happens even though the correlation between the
* instrument and the error is small:
pwcorr Z2 u2, sig
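* Because this is a simulation, u2 is observed, so the
* error of the IV estimate can be computed directly.
* With one instrument and one regressor, the IV slope is
* cov(Z,Y)/cov(Z,X) = 3 + 5*cov(Z,u)/cov(Z,X) here.
* A sketch (the scalar names cov_Zu and cov_ZX are
* illustrative, not part of the original code):
quietly correlate Z2 u2, covariance
scalar cov_Zu = r(cov_12)
quietly correlate Z2 X2, covariance
scalar cov_ZX = r(cov_12)
* Deviation of the IV estimate from the true B = 3:
display "IV estimate minus true B: " 5*cov_Zu/cov_ZX
* The numerator is small, but so is the denominator,
* which is what inflates the error.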
******************************************
* This is primarily a function of the weakness
* of Z at explaining X. See what happens
* if Z has more explanatory power
gen X3 = rnormal()*10 + 10*Z2 + .1*u2
reg X3 Z2
gen Y3 = 3*X3 + 5*u2
ivreg Y3 (X3=Z2)
* Now compare with OLS on the same data:
reg Y3 X3
* Even though the correlation between
* Z and u is the same as previously,
* the strength of the instrument
* in explaining X wins out and gives us a
* better estimator than OLS.
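******************************************
* The comparisons above rest on a single draw. A rough
* Monte Carlo sketch of the weak-instrument case follows.
* The program name weakiv_sim, the 200 replications and
* the seed are assumptions chosen for illustration.
* Note that simulate replaces the data in memory.
capture program drop weakiv_sim
program define weakiv_sim, rclass
    drop _all
    set obs 1000
    gen V = rnormal()
    gen Z = rnormal() - V
    gen u = 10*rnormal() + V/4
    gen X = rnormal()*10 + Z/4 + .1*u
    gen Y = 3*X + 5*u
    * IV estimate of B
    quietly ivregress 2sls Y (X = Z)
    return scalar b_iv = _b[X]
    * OLS estimate of B
    quietly regress Y X
    return scalar b_ols = _b[X]
end
simulate b_iv=r(b_iv) b_ols=r(b_ols), reps(200) seed(12345) nodots: weakiv_sim
* Compare the medians with the true B = 3. With a weak
* instrument the IV estimates can be very dispersed,
* so the median is more informative than the mean.
summarize b_iv b_ols, detail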
Is there any test for whether the weak instrument can still explain things in Stata? Something like the Chernozhukov-Hansen test? Thanks.