## Tuesday, October 30, 2012

### Regression Analysis - OLS

* I often simulate data to verify that a method works the way I expect it to.

* First, set the number of observations and generate the data: x is uniform on (0,1), u is standard normal, and the true coefficient on x is 2.

clear
set obs 10000
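
* Optionally, fix the random-number seed so that the draws below are reproducible. The value 12345 is an arbitrary choice for illustration, not part of the original example.

set seed 12345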

gen x = runiform()
gen u = rnormal()

gen y = 2*x + 4*u

* To check that OLS actually recovers the true coefficient on x (2), we can simply run the regression.

reg y x
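
* As an optional check, sketched here under the assumption that the regression above was the last estimation command: display the estimated slope and test the hypothesis that it equals the true value of 2. With a correctly specified model, this test should usually fail to reject.

display "Estimated slope on x: " _b[x]
test x = 2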

* This is not a proof that OLS works, but it is a simple test that can reveal when it is not working.

* A simple example in which OLS does not work properly:

reg y x if y>0

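* OLS cannot be assumed to be unbiased when the sample is selected on the dependent variable. Here the observations with y<=0 are dropped, which is strictly speaking truncation (censoring would keep those observations with y recorded as 0, and it biases OLS as well). If we did not know this already, the previous regression would suggest that there is probably a problem.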
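
* A rough sketch of why this happens, using the data still in memory: among the observations kept by y>0, the error u is no longer uncorrelated with x. Observations with a small x need a larger u to satisfy y>0, so the correlation should be negative in the selected sample, which pulls the estimated slope toward zero.

corr x u
corr x u if y>0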

* Of course a single regression would not be enough.

* We would need a Monte Carlo simulation to demonstrate the bias (or, of course, a mathematical proof).

* The code above can be wrapped in a simple program:

capture program drop censoredOLS
program censoredOLS
    * Draw a fresh sample on every call
    clear
    set obs 10000
    gen x = runiform()
    gen u = rnormal()
    gen y = 2*x + 4*u
    * Estimate using only the observations with y>0
    reg y x if y>0
end

* It is useful to know that the estimated coefficient from the regression can be referenced with the expression _b[x].

simulate b_x = _b[x], reps(1000): censoredOLS
sum b_x
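
* As a hedged follow-up, a one-sample t-test can check whether the mean of the simulated estimates differs from the true value of 2. This assumes the simulation results (b_x) are the data in memory, which is what simulate leaves behind.

ttest b_x == 2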

* The simulation shows that the low estimate of the coefficient on x in the single regression was not a fluke: across replications the estimates are centered below the true value of 2.
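
* For comparison, a minimal sketch of the same simulation without the y>0 restriction. The program name fullOLS is hypothetical, chosen here only for illustration. If OLS is behaving as expected, the mean of b_x should now be close to 2.

capture program drop fullOLS
program fullOLS
    clear
    set obs 10000
    gen x = runiform()
    gen u = rnormal()
    gen y = 2*x + 4*u
    reg y x
end

simulate b_x = _b[x], reps(1000): fullOLS
sum b_x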