Monday, January 28, 2013

HLM comparison with OLS - 2 levels, random coefficient on constant

do file

* xtmixed is capable of estimating a variance for the random effects on multiple levels.

* Random effects are assumed to have mean 0.

* Our initial model is y_ij = b0j + xij*B1 + uij

* With b0j = B0 + v0j

* We can represent our model as:

* y_ij = B0 + v0j + xij*B1 + uij = B0 + xij*B1 + v0j + uij

* Here v0j + uij is a composite unobserved error term: v0j is shared by everyone in school j and uij is specific to individual i.

* Let's see an example.

clear
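
* Optional: fix the random-number seed so that the simulated data, and every estimate below, can be reproduced exactly. The seed value is arbitrary and is not part of the original do file.
set seed 20130128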

set obs 60
* We will have 60 different schools (the level-2 units).

gen school=_n

gen v0j = rnormal()*2
* Generate some school effect with standard deviation = 2

expand 200
* There are 200 individuals in each school.

gen x1 = rnormal()
* Each individual has a continuous predictor (say academic performance).

gen u = rnormal()*5
* Each individual has an unobserved error with standard deviation = 5.

gen y = 1 + x1*2 + v0j + u
* Each individual's outcomes are a function of individual ability plus the school effect plus individual error.
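
* As a quick check on what the simulation implies (assuming the standard deviations of 2 and 5 used above): the composite error v0j + u has variance 2^2 + 5^2 = 29, and the share of that variance due to schools (the intraclass correlation) is 4/29, or roughly 0.14.
display "Implied intraclass correlation: " 2^2/(2^2 + 5^2)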

xtmixed y x1 || school:
* Now let's estimate both the effect of x1 on y and the standard deviation of the school-level random intercept.

* The estimated standard deviation of the school effect is close to 2, which is the true value, so xtmixed is working well.
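
* A sketch of how the school-level effects themselves can be recovered from the mixed model: predict with the reffects option stores the predicted random intercepts (BLUPs). The variable name v0j_hat is only illustrative. This must be run while the xtmixed results are still the active estimates.
predict v0j_hat, reffects

* Correlation between the predicted and the true (simulated) school effects.
corr v0j_hat v0j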

* In principle we can attempt to estimate the same thing using OLS with dummies.

qui tab school, gen(sch_id)
* Generate a set of dummy indicator variables for each school.

* School 1 is left out of the estimation as the reference school.
reg y x1 sch_id2-sch_id60

predict u_hat, resid

* We would like to calculate the standard deviations of the school estimates in order to compare with the xtmixed standard deviation estimates.

* Since we left school 1 out of the estimation, its effect is treated as 0.  All other school effect estimates are relative to school 1.
gen sch_coef_est = 0 if school==1

* The following code will save the school estimates to the coefficient variable that can then be used to find the standard deviation of the estimated school effects.
forv i=2/60 {
  qui replace sch_coef_est = _b[sch_id`i'] if school==`i'
}
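
* The same regression can be run without creating the dummies by hand, using factor-variable notation (available in Stata 11 and later). School 1 is again the base category, and the coefficient for school 2, for example, is stored as _b[2.school]. This is an alternative sketch, not part of the original do file.
reg y x1 i.school
display _b[2.school]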

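sum sch_coef_est u_hat

* We can see that the standard deviation of sch_coef_est is similar to the school-level standard deviation estimated by xtmixed, and that the standard deviation of the OLS residuals (u_hat) is similar to the residual standard deviation reported by xtmixed.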
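
* Because every school's coefficient is repeated on each of its 200 rows, the summary above is computed over all 12,000 observations. A minimal sketch of the same summary with one row per school (the variable name one_per_school is only illustrative):
egen one_per_school = tag(school)
sum sch_coef_est if one_per_school

* Each dummy coefficient also carries sampling noise with variance of roughly 5^2/200 = 0.125, so this standard deviation is expected to sit slightly above the true value of 2.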

* So both methods seem to be effective at estimating the variance of the school level effect.

corr sch_coef_est v0j

* We can also see that the OLS dummy variable method has produced individual school effects that are highly correlated with the true school effects.

* In summary, both methods seem to work well. If we did not care about the individual school estimates, we would probably favor the xtmixed (hierarchical linear model) estimator, because it estimates a single variance parameter instead of 59 school coefficients and so preserves more degrees of freedom.
