Tuesday, July 31, 2012
3 ways to write a probit command
* Probit is a commonly used command to model binary outcome variables.
* The maximum likelihood sytax in Stata is specific
* I will first define three different equivalent ml programs. Then we will solve them to see how they perform.
cap program drop myprobit1
program define myprobit1
* Tell Stata that this code is written in version 11
version 11
* Define the arguments input in the maximum likelihood function
* Stata automatically generates the first argument as a temporary variable name
* that is used to store the natural log of the likelihood value of the likelihood function.
args ln_likehood xb
* The probit tries to directly maximize the probability of seeing the outcome (either y==1 or y==0)
* If y==1 then we maximize the probability of seeing a positive outcome
qui replace `ln_likehood' = ln( normal(`xb')) if $ML_y==1
* If y==0 then we want to maximize the probability of seeing a negative outcome
qui replace `ln_likehood' = ln(1-normal(`xb')) if $ML_y==0
end
cap program drop myprobit2
program define myprobit2
version 11
args ln_likehood xb
* Because of the symetry of the normal CDF the following set of expressions is equivalent.
qui replace `ln_likehood' = ln(normal( `xb' )) if $ML_y==1
qui replace `ln_likehood' = ln(normal(-`xb' )) if $ML_y==0
end
cap program drop myprobit3
program define myprobit3
version 11
args ln_likehood xb
* This is a somewhat opeque way of writing the same thing as above.
* I do not prefer it to the first two forms except that it is likely to run slightly faster
* since it is a single command.
qui replace `ln_likehood' = ln(normal((-1)^(1-$ML_y)*`xb'))
end
clear
set obs 1000
gen x1 = rnormal()*.5
gen x2 = rnormal()*.5
gen u = rnormal()*(.5^.5)
gen p=normal(x1+x2+u)
gen y=rbinomial(1,p)
probit y x1 x2
* This is the benchmark
ml model lf myprobit1 (y=x1 x2)
ml maximize
* Basically works the same except there is no pseudo r2 reported.
timer on 1
ml model lf myprobit2 (y=x1 x2)
ml maximize
timer off 1
timer on 2
ml model lf myprobit3 (y=x1 x2)
ml maximize
timer off 2
timer list
di "myprobit2 is " r(t1)/r(t2) " times slower than myprobit3"
* The following is a brief exploration attempting to run two probits at once.
* It shows how ml can optimize two equations jointly
cap program drop mytwoprobit1
program define mytwoprobit1
version 11
args ln_likehood xb1 xb2
qui replace `ln_likehood' = ln(normal((-1)^(1-$ML_y1)*`xb1')) ///
+ ln(normal((-1)^(1-$ML_y2)*`xb2'))
end
* This is equivalent to the previous except that it uses the binormal distribution rather than two seperate normals. I tried specifying the correlation as a third equation but that did not work. I will need to play around a bit more to figure out how to program a biprobit.
cap program drop mytwoprobit2
program define mytwoprobit2
version 11
args ln_likehood xb1 xb2
qui replace `ln_likehood' = ln(binormal((-1)^(1-$ML_y1)*`xb1', ///
(-1)^(1-$ML_y2)*`xb2', ///
0))
end
clear
set obs 1000
gen x1 = rnormal()*.5
gen x2 = rnormal()*.5
gen p1=normal(x1-x2)
gen p2=normal(-x1+x2)
gen y1=rbinomial(1,p1)
gen y2=rbinomial(1,p2)
ml model lf mytwoprobit1 (y1:y1=x1 x2) (y2:y2=x1 x2)
ml maximize
ml model lf mytwoprobit2 (y1:y1=x1 x2) (y2:y2=x1 x2)
ml maximize
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment