* LAD is invariant to non-decreasing transformations.
* That is, med(f(x))=f(med(x)) so long as f' >= 0
set seed 110
set obs 10000
gen x = rnormal()*8+6
* Because x is symmetric around 6 we know the median is 6
sum x, detail
gen fx = sign(x)*x^2+500
* fx is a non-decreasing function of x; we can see this by plotting fx against x
line fx x, sort
* Likewise the median of fx is now easy to find:
* med(f(x))=f(med(x))=f(6) = sign(6)*6^2+500 = 536
* We can confirm this:
sum fx, detail
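* A minimal sketch of the same check done numerically rather than by reading the tables.
* The scalar names med_x, f_med_x and med_fx are made up for this illustration.
* With an even number of observations the sample median averages the two middle
* values, so the two numbers should agree closely but need not match exactly.
quietly summarize x, detail
scalar med_x = r(p50)
scalar f_med_x = sign(med_x)*med_x^2 + 500
quietly summarize fx, detail
scalar med_fx = r(p50)
display "f(med(x)) = " scalar(f_med_x) "   med(f(x)) = " scalar(med_fx)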
* Also: med(g(x))=g(med(x)) so long as g' <= 0, by the symmetry of the rank function around 50%
* med(g(x))=g(med(x))=g(6) = (-1)*(sign(6)*6^2+500) = -536
gen gx = (-1)*(sign(x)*x^2+500)
sum gx, detail
* Notice that the medians are mirrors of each other (equal in magnitude, opposite in sign) even though g' < 0. However, the other quantiles have reversed order: the .25 quantile of gx is minus the .75 quantile of fx.
two (hist fx, color(blue)) (hist gx, color(red)), legend(label(1 "fx") label(2 "gx")) title(Mirror quantiles)
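* A rough numerical version of the mirror property, assuming gx is exactly -fx as generated above.
* The scalar names below are hypothetical and used only for this check.
quietly summarize fx, detail
scalar neg_f_p25 = -r(p25)
scalar neg_f_p75 = -r(p75)
quietly summarize gx, detail
scalar g_p25 = r(p25)
scalar g_p75 = r(p75)
display "p25(gx) = " scalar(g_p25) "   -p75(fx) = " scalar(neg_f_p75)
display "p75(gx) = " scalar(g_p75) "   -p25(fx) = " scalar(neg_f_p25)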
* But what is more interesting to us is how well LAD does at estimating the conditional median.
* First let us specify:
gen u = rnormal()*20
gen y = x*10 + u*10
* The conditional median of y given x is 10*x, so the coefficient on x is clearly 10
qreg y x
* And qreg is pretty good at identifying the conditional coefficient as 10.
* Also, because E(u|x)=med(u|x)=0, OLS also identifies the median.
* Thus the following also provides a good estimate
reg y x
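* A small side-by-side comparison of the two slope estimates, as a sketch.
* b_lad and b_ols are hypothetical scalar names; both should land near 10,
* though how near depends on the draw.
quietly qreg y x
scalar b_lad = _b[x]
quietly regress y x
scalar b_ols = _b[x]
display "LAD slope = " scalar(b_lad) "   OLS slope = " scalar(b_ols)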
* Now let us transform y so that it has larger tails using f(y)=fy:
gen fy = sign(y)*y^2+500
* Let's see how well LAD (least absolute deviations) works
qreg fy x
* But what does this mean?
* How well is the quantile regression working?
* Remember fy = sign(y)*y^2+500 and y = x*10 + u*10
* If y>0: fy(x) = y(x)^2+500 = (x*10 + u*10)^2 + 500
* And the conditional effect of x on fy is
* fy'(x) = 2*y*10 = 20*(x*10 + u*10)
* med(fy'(x)|x) = fy'(med(y|x)) =
* 20*(x*10 + med(u|x)*10) =
* 20*(x*10), which at the median of x is
* 20*(6*10) = 1200
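* A hedged numerical counterpart to the arithmetic above: evaluate 200*x at the
* sample median of x and print it next to the qreg slope on fy.
* me_at_med is a made-up scalar name. A linear qreg fit to a curved conditional
* median is only an approximation, so the two numbers need not coincide.
quietly summarize x, detail
scalar me_at_med = 200*r(p50)
quietly qreg fy x
display "200*med(x) = " scalar(me_at_med) "   qreg slope on fy = " _b[x]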
reg fy x
graph twoway (lfitci fy x) ///
(scatter fy x)
* This regression does not work very well even though it has a higher r2.
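* For contrast, a sketch of what the invariance property does guarantee:
* transforming the fitted conditional median of y should give a curve that
* still splits the transformed outcome roughly in half.
* yhat_med, t_y, t_hat and below_med are hypothetical names; t_y rebuilds the
* transform locally (with the +500 constant) so this check stands on its own.
quietly qreg y x
predict double yhat_med, xb
gen double t_y   = sign(y)*y^2 + 500
gen double t_hat = sign(yhat_med)*yhat_med^2 + 500
gen byte below_med = t_y < t_hat
* The mean of below_med is the share of observations under the transformed median curve.
summarize below_med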