Sunday, July 8, 2012
* Stata's predict command is extremely useful for many purposes.
* In this post we will go through how it works and manually program, in long hand, some of the things it does.
* First let's start with OLS
* Imagine the underlying population model Y = g(x1, x2)
* Now imagine an estimator Y = f(x1, x2)
* Most estimation commands take Y and the xs and estimate some variant of f.
* In the linear case, Y = b0 + b1*x1 + b2*x2 + u
* Most estimation commands attempt to estimate b0, b1, and b2, which is great!
* But after estimating b0, b1, and b2 we may ask,
* "How does u look? Does it look normal, as the usual small-sample OLS inference assumes?"
* We may also ask, "How does the estimated y look?" This is often not particularly interesting in itself, since it is purely linear in the xs, but the predicted y, 'yhat', is often used in post-estimation techniques.
* Let's see how this works.
* First, let's simulate some data:
set seed 10
set obs 1000
gen x1 = rnormal()
gen x2 = rnormal()
gen u = rnormal()
gen y = 6*x1 + -4*x2 + 10*u
* Now let's estimate the OLS equation
reg y x1 x2
* If we want to get the fitted values we need only write the following
predict yhat1
* This is equivalent in the OLS case to:
predict yhat2, xb
* We can also manually generate these values by using the estimated coefficients:
gen yhat3 = .1430927 + 5.767773*x1 + -3.798869*x2
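* Typing coefficients by hand is error-prone and only as precise as the displayed output.
* A minimal sketch (not part of the original post): after regress, Stata keeps the
* estimates in _b[], so the same fitted values can be built without retyping any numbers.
* The variable names yhat4 and diff_yhat are made up for this illustration.
gen yhat4 = _b[_cons] + _b[x1]*x1 + _b[x2]*x2
gen diff_yhat = abs(yhat4 - yhat2)
* The differences should be zero, or at most at the level of float precision.
summarize diff_yhat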
* Likewise we may be interested in the error uhat
predict uhat1, residual
* We can do it manually:
gen uhat2 = y - yhat3
* There is the slightest difference between uhat1 and uhat2, but this is only the result of rounding error.
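* A hedged sketch of the same manual calculation using the stored coefficients _b[]
* instead of typed numbers (uhat3 and diff_uhat are illustrative names, not from the original post).
* Assuming the regression above is still the active estimation, the gap to predict's
* residuals should shrink to float precision.
gen uhat3 = y - (_b[_cons] + _b[x1]*x1 + _b[x2]*x2)
gen diff_uhat = abs(uhat3 - uhat1)
summarize diff_uhat
* Because the regression includes a constant, the OLS residuals average to zero
* and are uncorrelated with x1 and x2 in the sample (up to numerical precision).
summarize uhat1
correlate uhat1 x1 x2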
* Now that we have uhat, we can map it out to see if it looks like it is behaving well:
hist uhat1, kden
* Unsurprisingly (given that we drew u from a normal distribution), uhat1 (which is an estimate of u) looks normal as well.
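* A histogram is only an eyeball check. As a sketch of something more formal
* (these commands are not in the original post), Stata also offers a normal
* quantile plot and tests of normality that can be run on the residuals.
qnorm uhat1
sktest uhat1
swilk uhat1
* Large p-values mean the tests fail to reject normality of uhat1.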