* This post picks up where the previous Value Added Methods (model) post left off:

*

**To execute this code, first run the previous simulation to create a data set that you can use.**

* Now that you have that data set, you want to recover estimates for the various contractors and the various production stages.

* Let us first set up the panel data:

order prod_id prod_stage

* Tell Stata that product id is the panel id and product stage is the time dimension.

xtset prod_id prod_stage
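
* As an optional check (a sketch, not part of the original post), xtdescribe summarizes the panel structure so you can confirm that each product is observed at the stages you expect.

xtdescribe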

* To begin with, we want to estimate the productivity of each contractor, each producing company, each contracting umbrella company, and each production stage.

* If you can estimate this then you know where to invest your resources or which contractors to hire.

* So you estimate the following equation:

reg value i.comp_id i.cont_id i.cont_company_id i.prod_stage

* Immediately you notice several problems.

* 1. Each contractor always belongs to the same company, so the contracting company id is perfectly multicollinear with the contractor dummies.

* 2. There are only 5 production stages, and they are mutually exclusive. The estimated value added of each production stage is therefore not an absolute value but a value relative to one omitted stage.

* 3. These estimates are not easily compared with the true values from the simulation.
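
* The first two problems can be checked directly. This is a minimal sketch that assumes the variable names used in the regression above.

* Problem 1: if every contractor belongs to exactly one company, this assert passes silently.

bysort cont_id (cont_company_id): assert cont_company_id[1] == cont_company_id[_N]

* Problem 2: list the production stages; the regression omits one of them as the base level.

tab prod_stage

* bysort changed the sort order, so put the data back in panel order before using any lag operators.

sort prod_id prod_stage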

* There are some tricks you can do in order to compare estimates.

* First we need to make a lot of dummy variables:

tab comp_id, gen(comp_id_)

tab cont_id, gen(cont_id_)

tab cont_company_id, gen(cont_company_id_)

tab prod_stage, gen(prod_stage_)

* Now do the above regression but with the dummy variables:

reg value comp_id_* cont_id_* cont_company_id_* prod_stage_*

* Yes, this is not very pretty. However, we can more easily manipulate these coefficients.

gen comp_fe_hat = .

gen cont_fe_hat = .

gen cont_company_fe_hat = .

gen stage_fe_hat = .

forv i=1/101 {

cap replace comp_fe_hat = _b[comp_id_`i'] if comp_id==`i'

cap replace cont_fe_hat = _b[cont_id_`i'] if cont_id==`i'

cap replace cont_company_fe_hat = _b[cont_company_id_`i'] if cont_company_id==`i'

cap replace stage_fe_hat = _b[prod_stage_`i'] if prod_stage==`i'

}

sum *hat
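
* Because cap suppresses any error inside the loop, a failed lookup would go unnoticed. As a rough check (a sketch, not part of the original analysis), count the observations that never received an estimate:

count if missing(comp_fe_hat)

count if missing(cont_fe_hat)

count if missing(cont_company_fe_hat)

count if missing(stage_fe_hat)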

* Now that we have stored estimates of the effects of the various inputs, let's see how well our estimates perform relative to the true values.

cor comp_fe*

cor cont_fe*

cor cont_company_fe*

cor stage_fe*

* Despite using a simple regression that is consistent with our knowledge of how the data is generated, our estimates are generally pretty bad.

* Let us try a series of less complex regressions.

reg value comp_id_*

forv i=1/101 {

cap replace comp_fe_hat = _b[comp_id_`i'] if comp_id==`i'

}

cor comp_fe*

* That is looking pretty good.

reg value cont_id_*

forv i=1/101 {

cap replace cont_fe_hat = _b[cont_id_`i'] if cont_id==`i'

}

cor cont_fe*

* We can see that this estimator is still performing quite poorly.

* One way to try to get a better estimate might be to include the lag of value in the regression.

gen value_l1 = l1.value
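
* Note that the lag is missing for the first observed stage of every product, so the regression below drops those observations. A quick count shows how many are affected:

count if missing(value_l1)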

reg value value_l1 cont_id_*

forv i=1/101 {

cap replace cont_fe_hat = _b[cont_id_`i'] if cont_id==`i'

}

cor cont_fe*

* This does not help much.

scatter cont_fe*
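
* A fitted line can make the (lack of) relationship easier to see. This sketch assumes cont_fe is the true contractor effect created by the earlier simulation, which is what the correlations above also rely on.

twoway (scatter cont_fe_hat cont_fe) (lfit cont_fe_hat cont_fe)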

* We can see that the estimates of the effectiveness of contractors do not line up with reality much at all.

* Why is that?

* I would like to say that I know why, but the truth is I don't.

* We know that there is a fixed product effect. Perhaps that is throwing off our estimates?

xtreg value value_l1 cont_id_* prod_stage_*, fe

forv i=1/101 {

cap replace cont_fe_hat = _b[cont_id_`i'] if cont_id==`i'

}

cor cont_fe*

* Nope. Well, I am stumped at this point, but fortunately wiser minds than mine have worked on this problem before me.

* The following paper addresses in detail many of the questions raised by this simulation and others not addressed in this simulation (though not all of them).

* http://vam.educ.msu.edu/wp-content/uploads/2010/11/20120517_Can-Value-Added-Measures-of-Teacher-Performance-be-Trusted-WP2.doc
