Tuesday, July 3, 2012

Post estimation counterfactual biprobit draws

* This Stata program offers the ability to generate counterfactual draws post estimation of a biprobit.   These draws are unique because they reflect not only the actual data used in the original estimation technique but also modifications to that data that the researcher may want to simulation.  


* For this purposes I use probabilities to generate counterfactual data to be used in the simulation.  In the later step when the xb1 and xb2 are generated it is possible for the user to change these values by say adding 1 to one of the independent variables and seeing how the bivariate outcomes change as a result.  This result can be expanded beyond just finding the marginal effect since the user may be also interested in knowing how the joint likelihood changes which cannot be found easily by looking just at the marginal coefficients on Y1 and Y2.

* Some of the features of this code are available in the mvprobit command.  Which inspired this code.
* http://ideas.repec.org/c/boc/bocode/s432601.html written by Lorenzo Cappellari & Stephen P. Jenkins


* This code is written by Francis Smart

* Just to set up let's load a default data set.
webuse school, clear

local binary1 = "private"
local binary2 = "vote"

local indep_var = "logptax loginc years"

* We want to be able to draw probabilities that will allow us to get close to the true probabilities.
gen ptarget00 = 0
gen ptarget10 = 0
gen ptarget01 = 0
gen ptarget11 = 0

replace ptarget00 = 1 if `binary1' == 0 & `binary2' == 0
replace ptarget01 = 1 if `binary1' == 0 & `binary2' == 1
replace ptarget10 = 1 if `binary1' == 1 & `binary2' == 0
replace ptarget11 = 1 if `binary1' == 1 & `binary2' == 1

graph pie, over(`binary1' `binary2') title(`binary1' and `binary2' in Data) name(pie1, replace)

sum ptarget*

* First the bivariate probit.  That is we are trying to predict two binary outcomes simultaneously.
biprobit `binary1' `binary2' `indep_var'

* This actual data is completely arbitrary and the relationships are not reasonable.  It is just a default webuse data set.

* Now let's inagine that we would like to estimate a bivariate probit then simulate random draws that that correspond with the bivariate probit.

* First let us predict the xb values
predict xb1, xb equation(#1)
predict xb2, xb equation(#2)

* Change the xb1 and xb2 values in order to see how the joint distribution of draws will change.

* This is the prefix that is going to be in front of all of the generated probabilities
local pref = "phat_"

* This is the estimated correlation between the positive draws.
local rho = e(rho)

local xb1 = "xb1"
local xb2 = "xb2"

* Create a correlation matrix between dependent var1 and dependent var2
tempname A C
mat `A' = I(2)
mat `A'[1,2] = (`rho')
mat `A'[2,1] = (`rho')

* Decompose A into two matrices
mat `C' = cholesky(`A')

* Create a scalar (local) extract from the cholesky decomposition
forval j = 1/2 {
tempname c2`j'
scalar `c2`j'' = `C'[2,`j']
}
* These values are used to adjust the second generated probability

* Define the beginning of the equation loop
***************************************
tempvar d11 d01 sp1x sp0x arg111 arg001 z

qui gen double `arg111' = `xb1'
qui gen double `arg001' = `xb1'

qui gen `z' = uniform()

qui gen double `d11' =  invnorm(`z'*normprob( `arg111'))
qui gen double `d01' = -invnorm(`z'*normprob(-`arg001'))

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double `sp1x' = normprob( `arg111')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double `sp0x' = normprob(-`arg001')

***************************************
tempvar spx1 spx0 arg112 arg002

qui gen double `arg112' = `xb2'-`d11'*`c21'
qui gen double `arg002' = `xb2'-`d01'*`c21'

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double `spx1' = normprob((`arg112')/`c22')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double `spx0' = normprob((-`arg002')/`c22')

***************************************
* Start cross equation calculation

tempvar sp01 sp10 arg102 arg012

qui gen double `arg102' = `xb2'+`d11'*`c21'
qui gen double `arg012' = `xb2'+`d01'*`c21'

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double `sp01' = normprob((`arg102')/`c22')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double `sp10' = normprob((-`arg012')/`c22')

***************************************
* End equation loop

gen `pref'00 =  `sp0x'*`spx0'
gen `pref'01 =  `sp1x'*`sp10'
gen `pref'10 =  `sp0x'*`sp01'
gen `pref'11 =  `sp1x'*`spx1'

gen `pref'_sum = `pref'11 + `pref'00 + `pref'10 + `pref'01

foreach i in "00" "10" "01" "11" {
* Let's make sure all of the generated probabilities add up to one.
  replace `pref'`i' = `pref'`i'/`pref'_sum

  * This is just a temporary place holder for the draws
  gen `pref'draw`i'=0
}

drop `pref'_sum

* Generate ranges for which each outcome will occure
gen outcome00 = `pref'00
gen outcome01 = `pref'01+`pref'00
gen outcome10 = `pref'10+`pref'01+`pref'00
gen outcome11 = `pref'11+`pref'10+`pref'01+`pref'00

* Make the draw
gen runit = runiform()

replace `pref'draw00=1 if runit < outcome00
replace `pref'draw10=1 if outcome00 < runit & runit < outcome01
replace `pref'draw01=1 if outcome01 < runit & runit < outcome10
replace `pref'draw11=1 if outcome10 < runit & runit < outcome11

sum *draw??, detail

* Make the draws of the dependent vars
gen `binary1'_draw=0
replace `binary1'_draw=1 if `pref'draw10==1 | `pref'draw11==1

gen `binary2'_draw=0
replace `binary2'_draw=1 if `pref'draw01==1 | `pref'draw11==1

graph pie, over(`binary1'_draw `binary2'_draw) ///
  title(`binary1' and `binary2' in Bivariate Draws) ///
  name(pie2, replace)

graph combine pie1 pie2
* The random bivariate draws are not a perfect replica of the original data



* Compare this to two seperate probit estimations
probit `binary1' `indep_var'
predict `binary1'_p
gen  `binary1'_prob_draw = rbinomial(1, `binary1'_p)

probit `binary2' `indep_var'
predict `binary2'_p
gen  `binary2'_prob_draw = rbinomial(1, `binary2'_p)

graph pie, over(`binary1'_prob_draw `binary2'_prob_draw) ///
  title(`binary1' and `binary2' in Two Probits) ///
  name(pie3, replace)

graph combine pie1 pie2 pie3
* However, they are much better than running two seperate probits and drawing from them.

No comments:

Post a Comment