Thursday, June 28, 2012
Cragg's Double hurdle model used to explain censoring
* Cragg's 1971 lognormal hurdle (LH) model
* (See Wooldridge 2010 page 694)
* With a double hurdle model we want to think that there are two components contributing to a process.
* First, there is the decision to do something. Say go to market and sell produce. Second, there is the decision of how much produce to sell.
* This model is distinctly different from, say, a truncated normal regression, because the truncated normal regression assumes linearity in the y variable and only allows for the y variable to be kept above zero because the data is truncated. I.e. people cannot sell negative quantities of produce (buy) in the data set.
* What we instead want to think is that both decisions are independent (conditional on observables) and distinct decisions. This might seem unreasonable at first. However, in general if you are a grower you are probably going to decide whether to sell on the market before deciding how much to sell. That is, if you plant only cabbage then you are probably not going to settle for eating cabbage no matter how your crop turns out.
* Likewise, if you are going to sell, the decision of how much to sell may be uncorrelated with the original decision to sell if, say, you plan to sell on the market at the time of planting and decide how much to sell based on how well your crop does.
* It is also easy to think of examples where this assumption might fail. For instance, if a bumper crop one year gives you more than you can eat, you go to the market to sell even though you did not plan on it. The amount you sell on the market will then probably also be a function of the bumper crop.
* But for now let us imagine the two decisions are independent given observables.
* Sell or not:
* s=1[xg + v >0]
* How much to sell:
* w=exp(xB + u)
* The conditional independence assumption can be written D(w|x,s) = D(w|x): given the observables x, the amount decision is independent of the participation decision.
* The APE of a covariate on expected sales then combines its effect on the probability of selling with its effect on the amount sold, which is what we will compute below.
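Putting the two hurdles together, the observed outcome and its conditional moments can be sketched as follows (following Wooldridge 2010, sec. 17.6, under the assumption that v ~ Normal(0,1) and u ~ Normal(0, sigma^2) are independent of each other and of x):

```latex
y = s \cdot w = 1[x\gamma + v > 0]\,\exp(x\beta + u) \\
P(s = 1 \mid x) = \Phi(x\gamma) \\
E(y \mid y > 0, x) = \exp(x\beta + \sigma^2/2) \\
E(y \mid x) = \Phi(x\gamma)\,\exp(x\beta + \sigma^2/2)
```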
clear
set obs 1000
set seed 101
gen x1 = rnormal()
label var x1 "Amount of fertilizer used"
gen x2 = rnormal()
label var x2 "Index indicating availability of short term credit"
gen x3 = rnormal()
label var x3 "Distance from city center"
gen u = rnormal()
* Error term
* I want the variance to equal 1 in total
* Since the xs and u are independent and normally distributed with a variance of 1:
* var(a*x1+b*x2+c*x3+d*u) = a^2*(1) + b^2*(1) + c^2*(1) + d^2*(1)
* = a^2 + b^2 + c^2 + d^2 = 1
* Let's just make the variance of all 4 variables equal to v
* = v^2 + v^2 + v^2 + v^2 = 1
* = 4*v^2 = 1 -> v^2 = 1/4
* -> v = +/- 1/2
* Generate the probability of selling using the normal CDF.
gen s_inv = -1/2 + .5*x1 + .5*x2 - .5*x3 + .5*u
* By construction s_inv has a total variance close to 1
gen s_prob = normal(s_inv)
sum s_prob, detail
gen s = rbinomial(1,s_prob)
label var s "Decision to sell on the market"
* This draws a response s for every individual
* It is the decision to sell on the market or not.
gen v = rnormal()
* Error term
gen w = 5 + 2*x1 + 3*x2 + 4*x3 + v*10
* Quantity of produce that this farmer would
* have sold if he went to the market.
* We want to make sure w does not go negative.
* We can ensure this by making sure the minimum of w is 0.
replace w = max(w,0)
* There should be very few censored draws.
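As a quick sanity check on that claim, we can count the censored draws directly:

```stata
* How many of the latent quantities were censored at zero?
count if w == 0
```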
gen y = s*w
label var y "How much produce is sold"
* Quantity of produce actually sold.
* Now let's pretend that we cannot actually
* observe in real data the variables used to
* build the simulation (u, v, s_inv, s_prob, w, s).
* Now the problem is that what we observe is:
* y = s*w = 1[xg + v > 0]*w
* The unconditional partial effect of x on sales y is:
* dy/dx = s'w + s*w'
* The effect has two additive components.
* The first is the marginal effect of x on the probability of selling, scaled by the current quantity of sales; the second is the marginal effect of x on the quantity of sales, scaled by the probability of selling.
* dy/dx is a unique value for each person, so in order to test how well our estimator is working we will calculate it per person and then take the average.
* This average is the analogue of the average partial effect that we will attempt to estimate.
* We know the values of s' from the way s_prob was calculated:
* gen s_prob = normal(-1/2 + .5*x1 + .5*x2 - .5*x3 + .5*u)
* By the chain rule. CDF=normal(), PDF=normalden()
* ds/dx1 = .5*normalden(s_inv)
* ds/dx2 = .5*normalden(s_inv)
* ds/dx3 = -.5*normalden(s_inv)
* gen w = 5 + 2*x1 + 3*x2 + 4*x3 + v*10
* dw/dx1 = 2
* dw/dx2 = 3
* dw/dx3 = 4
* dy/dx = s'w + s*w'
gen dydx1 = .5*normalden(s_inv)*w + s*2
gen dydx2 = .5*normalden(s_inv)*w + s*3
gen dydx3 = -.5*normalden(s_inv)*w + s*4
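As described above, we average the per-person effects to get the true average partial effects, which serve as the benchmark the estimators below should recover:

```stata
* True average partial effects (benchmark for the estimators)
sum dydx1 dydx2 dydx3
```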
* We can see that the unconditional effect of x1 and x2 is greater than that of x3.
* This is because for x3 the probability of selling moves in the opposite direction from the sale quantity.
* Begin estimation
probit s x?
* Recovers the coefficients pretty well.
reg w x? if s==1
* However, this does not work so well.
* The only problem is that we do not observe s.
gen s_hat = 0
replace s_hat = 1 if y>0
probit s_hat x?
reg y x? if s_hat==1
* Now we can see that both estimates are biased
* Many people may assume that the tobit would be the correct model because sales cannot be negative.
tobit y x?, ll(0)
* The tobit left-censoring model clearly fails pretty spectacularly at recovering the true marginal effects. This is because the tobit does not take into account the two-stage nature of the quantity-to-sell decision.
* The bias from using tobit is particularly pronounced for x3: even though the effect of x3 on quantity sold is the largest, because it decreases the likelihood of selling at all the estimated coefficient actually turns out to be negative.
* You will need to use the user-written command by William Burke:
* You should be able to install it using the findit craggit command
craggit s_hat x?, second(y x?)
* This looks pretty good.