Thursday, May 17, 2012
Unobserved fixed effects model
* Often we are concerned that there are unobserved factors
* which are correlated with our explanatory variables x
* and which also end up in our error term u.
* For example, we might be concerned that intelligence is
* correlated with years of schooling as well as with future
* expected earnings. Fortunately, intelligence is usually
* thought of as a time constant factor.
* Therefore, if we remove time constant factors we might
* be able to estimate the returns to education.
* (This assumes that the return to a year of education is
* constant. If it is a function of intelligence then
* we will need to be more clever about this.)
* Stata code
* Imagine we have 200 individuals that we track
clear
set obs 200
set seed 101
gen c = rnormal()
label var c "Time Constant Heterogeneity (individual specific)"
gen id = _n
label var id "Individual specific ID"
* Create 5 observations for each individual
expand 5
bysort id: gen year=_n
label var year "Year of observation"
gen x = rnormal()+c
label var x "explanatory variable X (with time constant and time varying components)"
gen u = 3*rnormal()+3*c
label var u "Error term (correlated with unobservables c)"
gen y = x + u
label var y "Outcome variable"
reg y x
* We can see that OLS is biased: the true coefficient on x is 1,
* but the estimate is well above that.
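* A rough check of where the bias comes from. This is only a sketch
* and relies on how this particular simulation is built: x and u
* share the component c, so cov(x,u) = 3*var(c) = 3 and var(x) = 2,
* which puts the OLS slope near 1 + 3/2 = 2.5 in large samples.
* In the sample, the OLS slope equals 1 + cov(x,u)/var(x) up to
* rounding. The scalar name b_implied is made up for this illustration.
quietly correlate x u, covariance
scalar b_implied = 1 + r(cov_12)/r(Var_1)
display "Implied OLS slope: " b_implied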
xtset id year
* Tells Stata to use id as the panel identifier and year as the time variable
xtreg y x, fe
* However, the fixed effects estimator is unbiased because it
* removes the time constant component c, which is the source
* of the correlation between x and the error u.
* Note: two equivalent ways to get the same coefficient on x are:
reg y x i.id
areg y x, absorb(id)
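* What the fixed effects commands do can be sketched by hand:
* subtract each individual's mean from y and x, then run OLS on the
* demeaned data. The names y_bar, x_bar, y_within and x_within are
* made up for this illustration. The coefficient should match
* xtreg, fe. The standard errors will not, because plain reg does
* not count the 200 estimated means as used degrees of freedom.
bysort id: egen y_bar = mean(y)
bysort id: egen x_bar = mean(x)
gen y_within = y - y_bar
gen x_within = x - x_bar
reg y_within x_within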
* An alternative approach is the Chamberlain-Mundlak device.
* If we fear that the time constant part of x might be correlated
* with u then we can control for that by including the
* individual mean of x in the regression:
bysort id: egen x_mean = mean(x)
reg y x x_mean
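* A quick comparison, offered as a sketch: the coefficient on x in
* the pooled regression that includes x_mean should equal the fixed
* effects coefficient, since partialling out x_mean leaves only the
* within-individual variation in x. The scalar names b_fe and b_cm
* are made up for this illustration.
quietly xtreg y x, fe
scalar b_fe = _b[x]
quietly reg y x x_mean
scalar b_cm = _b[x]
display "Fixed effects: " b_fe "   With x_mean: " b_cm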
* Another option is first differencing. When there are only two
* time periods it gives the same estimate as fixed effects, but
* with more than two periods the estimates generally differ.
* It is still effective at removing time constant effects.
* Make sure the data are in panel order before using the lag operator
sort id year
gen y_diff = y-l.y
gen x_diff = x-l.x
reg y_diff x_diff
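* To see the two period case, a sketch that uses only years 1 and 2:
* first differencing without a constant should reproduce the fixed
* effects coefficient. (If the differenced regression keeps its
* constant, the match is instead with fixed effects plus a year
* dummy.) The scalar names b_fd2 and b_fe2 are made up for this
* illustration.
quietly reg y_diff x_diff if year == 2, noconstant
scalar b_fd2 = _b[x_diff]
quietly xtreg y x if year <= 2, fe
scalar b_fe2 = _b[x]
display "First difference: " b_fd2 "   Fixed effects: " b_fe2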