Thursday, May 31, 2012

Value-added modelling - Stata simulation - iPad example

* Value-added modelling is a common approach for inferring the "quality" or "value" of inputs.

* In education, these methods have become quite popular politically and academically.

* But in a sense, using these methods in education is an abstraction.

* To grasp how value-added methods were first developed, let us start with a production example.

* Imagine that you are manufacturing tablet computers (such as the iPad).

* At each step of the production process a different "value" is added to the product as a result of the particular inputs used.

* To conceptualize this, think of the product at each stage as being worth a certain value and being bought by the next person in the manufacturing chain.

* However, you do not want to use the market to assemble the tablet computers.

* You want to assemble them in house.

* Therefore you want to figure out a way of inferring how much value each input has.

* To do this, imagine that you run a series of non-market valuations after each stage of the production process to infer a current price.

* Then you take the change in that inferred price as a measure of the value of those inputs.

set seed 11

* Let's do this:

clear
set obs 10

* 10 different companies

gen comp_id = mod(_n-1,10)+1
  label var comp_id "Company ID"

* Each of the companies has a different assembly line structure.
* Let's imagine that at each stage that company's structure adds a constant amount of value to all products.

gen comp_fe = runiform()
  label var comp_fe "The fixed effect (specific to that company) added to each product each stage"

* Now let's imagine that each company produces 10 different products (over the sample time)
expand 10

sort comp_id

* Generate a list of product ids
gen prod_id = _n
  label var prod_id "Product ID"

* Each product line has some inherent design component that makes it more or less valued at each progressive stage than other products.
gen prod_fe = rnormal()/2 + 1/4
  label var prod_fe "Product Fixed Effect"

* Now imagine that there are 5 stages of production for each product
expand 5
bysort prod_id: gen prod_stage = _n

* The five stages follow the initial product idea:
* Stage 1: gather raw materials
* Stage 2: manufacture components
* Stage 3: assemble components (probably in China)
* Stage 4: ship product
* Stage 5: sell product at the retail locations



label var prod_stage "Production stage"
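
* As a quick check on the structure built so far, the sketch below confirms that each product appears once per stage.

* isid exits with an error if prod_id and prod_stage do not uniquely identify the observations.
isid prod_id prod_stage

* With 10 companies, 10 products each, and 5 stages, this should report 500 observations.
count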

* Now imagine also that there are 100 contractors in the tablet computer production market.

* These contractors are randomly assigned contracts for which product to work on.

* What we really want to know is how good these contractors are at "adding value" to the product.

* This is the trickiest part of the code so far.

* First we need to generate the contractors.

* Keep the data that we have generated so far:
preserve

* There are several ways of doing this.

* I will use the many:1 merge command to accomplish this task.
clear

* Imagine that our 100 contractors are subsidiaries of 5 different umbrella companies.

set obs 5

gen cont_company_id=_n
  label var cont_company_id "Contracting company ID"

* Each of these companies has a different work ethic
gen cont_company_fe = rnormal()*.25
  label var cont_company_fe "Contracting company effectiveness"

* Each of the contracting companies has 20 contractors they manage.

expand 20

gen cont_id = _n
  label var cont_id "Contractor ID"

gen cont_fe = rnormal() + 1
  label var cont_fe "Contractor effectiveness"
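
* A quick tabulation (a sketch, not part of the original simulation) shows how the contractors are spread across the umbrella companies.

* Each of the 5 contracting companies should appear 20 times.
tabulate cont_company_id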

* Now we will save the contractor information to a data file in the working directory
save "contractor.dta", replace

restore

* Let us first assign contractor IDs randomly:

gen cont_id=int(runiform()*100+1)

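* Now let's merge in the contractor data.

* keep(master match) drops any contractor who happened to receive no contract, so no empty rows are added.
merge m:1 cont_id using "contractor.dta", keep(master match)
drop _merge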
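
* Because contracts are assigned at random, some contractors may receive many contracts and others few or none.

* The sketch below counts how many distinct contractors received at least one contract; tag_cont is a helper variable introduced only for this check.
egen tag_cont = tag(cont_id)
count if tag_cont == 1 & !missing(prod_id)
drop tag_cont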

* Let us suppose that whenever a product is developed, some unobserved component adds or subtracts random value at each stage, independent of all other inputs.

gen rand_effect = rnormal()
  label var rand_effect "Random idiosyncratic production effect unique to each product at each stage"
* In other words, this is the error component.

* Finally, imagine that at each stage there is an "average" amount of value added at that stage.

* We will use another merge command to do this:

preserve

clear
set obs 5

gen prod_stage=_n

gen stage_fe = runiform()*2
  label var stage_fe "Production stage fixed effect"

save "Stage.dta", replace

restore

merge m:1 prod_stage using "Stage.dta"
drop _merge
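
* To see the average value added that was drawn for each stage, the stage effects can be tabulated (an illustrative check only).

* Within each stage the mean equals the single draw for that stage and the standard deviation is zero.
tabulate prod_stage, summarize(stage_fe)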

* Let us now add a production stage zero representing the initial "value" of the product idea.

* This is a little tricky to do.

* I will duplicate the observations for production stage 1 and flag the new copies with expand_indicator.
expand 2 if prod_stage == 1, generate(expand_indicator)

* The flagged copies become stage 0.
replace prod_stage = 0 if expand_indicator==1
drop expand_indicator

* Now I want to make sure that stage 0 production is not done by any contractors so there is only the company effect and product effect.

foreach v in cont_id cont_company_id cont_company_fe cont_fe rand_effect stage_fe {
  replace `v' = 0 if prod_stage == 0
}
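
* A quick look at the result (a sketch for checking, not part of the simulation itself):

* There should now be six stages, 0 through 5, with 100 products in each.
tabulate prod_stage

* The contractor effect at stage 0 should be exactly zero.
summarize cont_fe if prod_stage == 0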

*****
* Now let us start to generate the values of the products at each stage.

* This is a cumulative model, i.e. value added.

* So in effect y[t] = lambda*y[t-1] + XB + v

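* To begin with we will generate the initial value of y.

* The condition must name prod_stage in full: a bare "stage" would be read by Stata as an abbreviation of stage_fe.
gen value=abs(rnormal()) + prod_fe + comp_fe if prod_stage==0

* Before applying the recursion, make sure the data are sorted by product and stage.
sort prod_id prod_stage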

* Lambda is the share of a product's value that is retained from the previous stage.

* If lambda is low, little of the previous stage's value carries over once the product has been processed further.

* If lambda is high, most of the previous value is retained, plus whatever value the later stages add.
gen lambda=.95
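
* As a rough illustration of what this retention rate implies, the share of the stage-0 value still present at stage 5 is lambda to the fifth power.

* With lambda = .95 this is about 0.77.
display lambda[1]^5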

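* Now let us generate the cumulative value added data.

* Stata's replace works through the observations in order, so value[_n-1] already holds the updated value of the previous stage.
replace value=lambda*value[_n-1] + prod_fe + comp_fe + cont_fe + cont_company_fe + stage_fe + rand_effect if prod_stage>0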
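
* As a check on the recursion, the sketch below rebuilds each value from the previous stage and the inputs; check_diff is a helper variable introduced only here.
bysort prod_id (prod_stage): gen check_diff = value - (lambda*value[_n-1] + prod_fe + comp_fe + cont_fe + cont_company_fe + stage_fe + rand_effect) if prod_stage > 0 & !missing(prod_stage)

* The differences should be zero up to rounding, since value is stored as a float.
summarize check_diff
drop check_diff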

**** Simulation END

bysort prod_stage: sum value
* We can see that the average value of the product rises at each production stage.
* However, we can also see that the variance of the values increases as the value increases.
* This is because the variances of the various components accumulate across stages, given the high retention (lambda) of value from each previous stage.
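
* As a first descriptive look at the idea from the introduction, the sketch below takes the stage-to-stage change in value and compares it with the true contractor effect.

* This is only a raw correlation, not one of the value-added estimators; value_change is a name introduced here for illustration.
bysort prod_id (prod_stage): gen value_change = value - value[_n-1] if prod_stage > 0 & !missing(prod_stage)
correlate value_change cont_fe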

* Now that we have data generated through a value-added simulation we can start testing different value-added estimators.

* That will be for a later post!
