## Tuesday, July 3, 2012

### Post estimation counterfactual biprobit draws

* This Stata program offers the ability to generate counterfactual draws post estimation of a biprobit.   These draws are unique because they reflect not only the actual data used in the original estimation technique but also modifications to that data that the researcher may want to simulation.

* For this purposes I use probabilities to generate counterfactual data to be used in the simulation.  In the later step when the xb1 and xb2 are generated it is possible for the user to change these values by say adding 1 to one of the independent variables and seeing how the bivariate outcomes change as a result.  This result can be expanded beyond just finding the marginal effect since the user may be also interested in knowing how the joint likelihood changes which cannot be found easily by looking just at the marginal coefficients on Y1 and Y2.

* Some of the features of this code are available in the mvprobit command.  Which inspired this code.
* http://ideas.repec.org/c/boc/bocode/s432601.html written by Lorenzo Cappellari & Stephen P. Jenkins

* This code is written by Francis Smart

* Just to set up let's load a default data set.
webuse school, clear

local binary1 = "private"
local binary2 = "vote"

local indep_var = "logptax loginc years"

* We want to be able to draw probabilities that will allow us to get close to the true probabilities.
gen ptarget00 = 0
gen ptarget10 = 0
gen ptarget01 = 0
gen ptarget11 = 0

replace ptarget00 = 1 if binary1' == 0 & binary2' == 0
replace ptarget01 = 1 if binary1' == 0 & binary2' == 1
replace ptarget10 = 1 if binary1' == 1 & binary2' == 0
replace ptarget11 = 1 if binary1' == 1 & binary2' == 1

graph pie, over(binary1' binary2') title(binary1' and binary2' in Data) name(pie1, replace)

sum ptarget*

* First the bivariate probit.  That is we are trying to predict two binary outcomes simultaneously.
biprobit binary1' binary2' indep_var'

* This actual data is completely arbitrary and the relationships are not reasonable.  It is just a default webuse data set.

* Now let's inagine that we would like to estimate a bivariate probit then simulate random draws that that correspond with the bivariate probit.

* First let us predict the xb values
predict xb1, xb equation(#1)
predict xb2, xb equation(#2)

* Change the xb1 and xb2 values in order to see how the joint distribution of draws will change.

* This is the prefix that is going to be in front of all of the generated probabilities
local pref = "phat_"

* This is the estimated correlation between the positive draws.
local rho = e(rho)

local xb1 = "xb1"
local xb2 = "xb2"

* Create a correlation matrix between dependent var1 and dependent var2
tempname A C
mat A' = I(2)
mat A'[1,2] = (rho')
mat A'[2,1] = (rho')

* Decompose A into two matrices
mat C' = cholesky(A')

* Create a scalar (local) extract from the cholesky decomposition
forval j = 1/2 {
tempname c2j'
scalar c2j'' = C'[2,j']
}
* These values are used to adjust the second generated probability

* Define the beginning of the equation loop
***************************************
tempvar d11 d01 sp1x sp0x arg111 arg001 z

qui gen double arg111' = xb1'
qui gen double arg001' = xb1'

qui gen z' = uniform()

qui gen double d11' =  invnorm(z'*normprob( arg111'))
qui gen double d01' = -invnorm(z'*normprob(-arg001'))

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double sp1x' = normprob( arg111')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double sp0x' = normprob(-arg001')

***************************************
tempvar spx1 spx0 arg112 arg002

qui gen double arg112' = xb2'-d11'*c21'
qui gen double arg002' = xb2'-d01'*c21'

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double spx1' = normprob((arg112')/c22')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double spx0' = normprob((-arg002')/c22')

***************************************
* Start cross equation calculation

tempvar sp01 sp10 arg102 arg012

qui gen double arg102' = xb2'+d11'*c21'
qui gen double arg012' = xb2'+d01'*c21'

* Generate the probability of seeing a draw of 1 on the first equation
qui gen double sp01' = normprob((arg102')/c22')

* Generate the probability of seeing a draw of 0 on the first equation
qui gen double sp10' = normprob((-arg012')/c22')

***************************************
* End equation loop

gen pref'00 =  sp0x'*spx0'
gen pref'01 =  sp1x'*sp10'
gen pref'10 =  sp0x'*sp01'
gen pref'11 =  sp1x'*spx1'

gen pref'_sum = pref'11 + pref'00 + pref'10 + pref'01

foreach i in "00" "10" "01" "11" {
* Let's make sure all of the generated probabilities add up to one.
replace pref'i' = pref'i'/pref'_sum

* This is just a temporary place holder for the draws
gen pref'drawi'=0
}

drop pref'_sum

* Generate ranges for which each outcome will occure
gen outcome00 = pref'00
gen outcome01 = pref'01+pref'00
gen outcome10 = pref'10+pref'01+pref'00
gen outcome11 = pref'11+pref'10+pref'01+pref'00

* Make the draw
gen runit = runiform()

replace pref'draw00=1 if runit < outcome00
replace pref'draw10=1 if outcome00 < runit & runit < outcome01
replace pref'draw01=1 if outcome01 < runit & runit < outcome10
replace pref'draw11=1 if outcome10 < runit & runit < outcome11

sum *draw??, detail

* Make the draws of the dependent vars
gen binary1'_draw=0
replace binary1'_draw=1 if pref'draw10==1 | pref'draw11==1

gen binary2'_draw=0
replace binary2'_draw=1 if pref'draw01==1 | pref'draw11==1

graph pie, over(binary1'_draw binary2'_draw) ///
title(binary1' and binary2' in Bivariate Draws) ///
name(pie2, replace)

graph combine pie1 pie2
* The random bivariate draws are not a perfect replica of the original data

* Compare this to two seperate probit estimations
probit binary1' indep_var'
predict binary1'_p
gen  binary1'_prob_draw = rbinomial(1, binary1'_p)

probit binary2' indep_var'
predict binary2'_p
gen  binary2'_prob_draw = rbinomial(1, binary2'_p)

graph pie, over(binary1'_prob_draw binary2'_prob_draw) ///
title(binary1' and `binary2' in Two Probits) ///
name(pie3, replace)

graph combine pie1 pie2 pie3
* However, they are much better than running two seperate probits and drawing from them.