Wednesday, August 22, 2012

Sharp Regression Discontinuity Example and Limitations

* Regression discontinuity is a method of analysis dating back to work by Thistlethwaite and Campbell (1960), but recently popularized by a number of important papers such as Hahn, Todd, and Van der Klaauw (2001) <http://ideas.repec.org/a/ecm/emetrp/v69y2001i1p201-09.html>.

* The method is argued to require weaker assumptions than natural experiments.

* The method is deceptively simple.

* Imagine a rule that assigns treatment at a cutoff z = c, where z is a characteristic that varies across the population.

* z could be correlated with the outcome variable y of interest.  However, if one were to look only at individuals whose values of z are sufficiently close to c, then the only remaining systematic difference between those just above and those just below c would be whether or not they received the treatment T.

* Let's see how this works.

clear
set obs 20000
set seed 101

* Imagine that a school has 20000 incoming students.

* They have SAT scores drawn from a uniform distribution (not a realistic assumption).
gen SAT = 600 + int(181*uniform())*10

* As an administrator you would like to give out merit-based scholarships to encourage students to do well at your school.

* Scholarships go to students who score above a cutoff, which we will set at 2130.

recode SAT (0/2130=0) (2130/2400=1), gen(scholarship)
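
* The two recode rules above overlap at 2130.  As far as I understand recode, the first matching rule wins, so a score of exactly 2130 should be coded 0.
* A quick check under that assumption: the assert stops the do-file if the coding is not "strictly above 2130", and the table shows the scores around the cutoff.
assert scholarship == (SAT > 2130)
tabulate SAT scholarship if inrange(SAT, 2110, 2150)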

* There is also some measurable level of mentoring that affects performance which is independent of SAT scores and scholarship.
gen mentoring = rbinomial(1,.5)

* Let's also imagine that students with top SAT scores are more likely to do well without the scholarship.
gen performance = 25 + 2*(SAT/1500)^3 + 1*scholarship + 1*mentoring + rnormal()*5

* Scatter of performance against SAT, with a separate linear fit on each side of the cutoff.
twoway (scatter performance SAT, msize(tiny) msymbol(circle))  ///
       (lfit performance SAT if SAT<2130)                       ///
       (lfit performance SAT if SAT>2130),                      ///
       legend(label(2 "No Scholarship") label(3 "Scholarship"))
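
* With 20000 noisy points the jump at the cutoff is hard to see in a raw scatter.
* A rough sketch of an alternative view: since SAT only takes multiples of 10, plot the mean performance at each score.
* mean_perf and first_obs are names made up for this illustration, and the vertical line at 2135 is just a visual marker placed between 2130 and 2140.
egen mean_perf = mean(performance), by(SAT)
egen first_obs = tag(SAT)
twoway (scatter mean_perf SAT if first_obs & SAT<=2130, msize(small))  ///
       (scatter mean_perf SAT if first_obs & SAT>2130, msize(small)),  ///
       xline(2135) legend(label(1 "No Scholarship") label(2 "Scholarship"))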

* Performance is some index that your team has developed that combines grades, time to completion, post graduation job success, entry to graduate schools, as well as alumni contributions.

* You suspect that the relationship between SAT and performance is nonlinear, but as a baseline you first fit a linear model to the full sample.
* Specification 0:
reg performance SAT scholarship mentoring

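* However, you are suspicious of the estimated relationship between students receiving the scholarship and future success: if the linear SAT term misses the curvature, the scholarship dummy can pick some of it up.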

* Thus RD! You look instead at those students who almost got the scholarship and those who just barely qualified for the scholarship.

* Specification 1:
reg performance SAT scholarship mentoring if SAT > 1930 & SAT < 2330

* We can see that our estimates are closer.  However, they are not perfect, and it does not help to restrict our data further.
* Specification 2:
reg performance SAT scholarship mentoring if SAT > 2070 & SAT < 2180

* Specification 3:
reg performance SAT scholarship mentoring if SAT > 2100 & SAT < 2160
* This is because we rapidly lose observations as the window around the cutoff narrows.

* Specification 4:
reg performance SAT scholarship mentoring if SAT > 2115 & SAT < 2145

* It is easy to see the problem with the RD approach in this example.  RD is highly sensitive to sample selection.

* If our window is too narrow then we do not have enough data to estimate the effect precisely, and the confidence interval becomes too wide for useful analysis.

* In specification 0 we can see that the confidence interval for scholarship does not contain the true parameter value of 1, while in all other specifications it does.

* However, from specifications 2-4 we might be forced to conclude that, after applying RD "precision", the scholarship had no effect significantly different from zero.

* We can see that the estimate of the mentoring effect suffers just as badly from restricting the sample as the estimate of the scholarship effect does.

* Ultimately, using RD to restrict the sample is going to have an equally deleterious effect on the other coefficients of interest.
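
* The trade-off described above can be tabulated in one pass.  This is only a sketch: it uses symmetric windows around 2130, so half-widths 200, 30 and 15 reproduce specifications 1, 3 and 4, while 60 is a symmetric stand-in for specification 2 and will not match it exactly.
* For each window it prints the sample size, the scholarship coefficient, and a 95% confidence interval to compare against the true value of 1.
foreach h of numlist 200 60 30 15 {
    quietly regress performance SAT scholarship mentoring if abs(SAT - 2130) < `h'
    * half-length of the 95% interval from the t distribution
    local half = invttail(e(df_r), 0.025)*_se[scholarship]
    display "half-width " %4.0f `h' "  N = " %6.0f e(N)       ///
        "  b = " %6.3f _b[scholarship]                        ///
        "  95% CI = [" %6.3f (_b[scholarship] - `half')       ///
        ", " %6.3f (_b[scholarship] + `half') "]"
}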

* This might sound overall negative.  However, when you have large enough sample sizes things start improving.

* Let's imagine that the administrator is able to use multiple years of students to estimate the effect of the program.

* Increase the sample to 2 or 3 times its current size and include a year fixed effect, and the RD estimator is still going to outperform the biased full-sample estimator, whose bias does not shrink as more data are added.
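
* A minimal sketch of that multi-year exercise, under assumptions that are not in the original setup: three cohorts of 20000 students each, and made-up year shifts of 0, 0.5 and 1 index points.
* Note that this replaces the data in memory.  The seed is arbitrary, and the window is the one from specification 3.
clear
set obs 60000
set seed 101
gen year = 1 + mod(_n, 3)
gen SAT = 600 + int(181*runiform())*10
gen scholarship = SAT > 2130
gen mentoring = rbinomial(1,.5)
gen performance = 25 + 2*(SAT/1500)^3 + 1*scholarship + 1*mentoring + 0.5*(year-1) + rnormal()*5

* Full-sample linear model with year fixed effects (still misspecified in SAT)
reg performance SAT scholarship mentoring i.year

* RD window with year fixed effects
reg performance SAT scholarship mentoring i.year if SAT > 2100 & SAT < 2160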