## Sunday, May 6, 2012

### Estimating demand (AIDS) - Part 3

* Estimating demand (AIDS) - Part 3
* Stata simulation and estimation

* v=max{x,y,z}{x^a * y^b * z^c} s.t. (W=xPx + yPy + zPz)
* since ln is monotonic and increasing this is the same as:
* max{x,y,z}{a*ln(x)+b*ln(y)+c*ln(z)} s.t. (W=xPx + yPy + zPz)

* L=a*ln(x)+b*ln(y)+c*ln(z) + M*(xPx + yPy + zPz - W)
* Lx:=   a/x + M*Px=0
* Ly:=   b/y + M*Py=0
* Lz:=   c/z + M*Pz=0
* LM:=  W-(xPx + yPy + zPz)=0

* a/xPx             = -M
* b/yPy             = -M
* c/zPz             = -M
* a/xPx             = b/yPy
* xPx/a             = yPy/b
* xPx               = ayPy/b
* zPz               = cyPy/b
* W-(ayPy/b + yPy + cyPy/b) = 0
* W-(a/b + 1 + c/b)yPy = 0
* W-(a/b + b/b + c/b)yPy = 0
* W = ((a+b+c)/b)yPy
* y = W*b/(Py*(a+b+c))
* x = W*a/(Px*(a+b+c))
* z = W*c/(Pz*(a+b+c))

* equivalently
* y                 = W/(Py(a/b + 1 + c/b))
* and because we know the solutions are symmetric
* x                 = W/(Px(b/a + 1 + c/a))
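
* As a check on the algebra (a short sketch using only the solutions above):
* xPx + yPy + zPz = W*(a+b+c)/(a+b+c) = W, so the budget constraint binds,
* and each expenditure share is constant:
* xPx/W = a/(a+b+c),  yPy/W = b/(a+b+c),  zPz/W = c/(a+b+c)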

clear

* so Let's start the simulation
* Say we have a cross section with 10000 cities
set seed 101
set obs 10000

* first let us set the parameters a, b and c, these are not observable
gen a = .3
gen b = .4
gen c = .05

* each city faces a different price.  the added constant (3, 1 or 2) is so
* that no city gets a price too close to zero.
gen Px=runiform()*4.2+3
gen Py=runiform()*4.5+1
gen Pz=runiform()*4.2+2

* each city has a different level of wealth
gen W=runiform()*5+1

* Let's generate the errors in x, y and z.  We want the errors to be correlated,
* which can be gotten at by a little trick, but I also wanted the errors
* to be uniform.  Unfortunately adding together 2 uniform distributions
* will not make another uniform distribution
gen rv1=runiform()
gen rv2=runiform()
gen rv3=runiform()

* The correlation between the variables is
gl rho12 = .5

gen v1=(rv1)*5.3
gen v2=(rv1*$rho12 + rv2*(1-$rho12)^.5)*.3
gen v3=(rv1*$rho12 + rv3*(1-$rho12)^.5)*.3

corr v1 v2
* Not quite perfect but a good approximation
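
* A quick check of the correlation this construction implies (a sketch, not part
* of the original derivation).  With rv1 and rv2 independent and equally dispersed,
* corr(v1,v2) = rho/sqrt(rho^2 + (1-rho)), which is about .577 for rho=.5
di $rho12/sqrt($rho12^2 + (1-$rho12))
* Weighting rv2 by sqrt(1-rho^2) instead would give a correlation of rho in expectation.
* v2_alt is a hypothetical variable used only for this comparison; it reuses rv1 and
* rv2, so no new random numbers are drawn and the rest of the simulation is unchanged.
gen v2_alt=(rv1*$rho12 + rv2*(1-$rho12^2)^.5)*.3
corr v1 v2_alt
drop v2_alt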

* Now let's generate consumption level
gen y = W/Py *b/(a+b+c)   + v1
label var y "Food"
gen x = W/Px *a/(a+b+c)   + v2
label var x "Entertainment"
gen z = W/Pz *c/(a+b+c)   + v3
label var z "Unobserved Remittances"

* Adjust wealth to reflect true wealth
replace W = y*Py + x*Px + z*Pz

sum x y

************************************************************
* simulation END
************************************************************

* One of the difficulties (and strengths) in estimating from simulations is that
* you don't always know when an estimator is performing correctly.

* For instance if we want to estimate demand response to change in price
* dy/dPy then due to the non-linear nature of real demand functions
* the most straightforward regressions are going to give us some funky estimates.

* For example:
reg y W Py
local dydW=_b[W]
local dydPy=_b[Py]
predict y_hat
label var y_hat "OLS fitted values"

* Everything looks good but what does it mean?
* The true effect of dy/dW is
* dy/dW=1/Py *b/(a+b+c)
sum Py

gen b_abc=b/((a+b+c)*r(mean))
sum b_abc

di "So an estimate of `dydW' is looking pretty close to " r(mean)

* This is expected because y is linear in W.

* {remember y = W*b/(Py*(a+b+c)) = (Py^-1) * W * b/(a+b+c) }
* However, check out what happens when you take dy/dPy
* dy/dPy = (-1)*(Py^-2) * W * b/(a+b+c)

gen dydPy_true = (-1)*(Py^-2) * W * b/(a+b+c)
sum dydPy_true
di "The estimate of `dydPy' is looking pretty close to " r(mean) " as well"

* We can see that the OLS estimate is biased upwards because E(g(Py)) > g(E(Py))
* when g is convex (Jensen's inequality), and y is convex in Py.  However, the degree
* of the bias is not so large as to make us worry.
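
* A small illustration of Jensen's inequality with the simulated prices
* (a sketch; inv_Py and inv_of_mean are names introduced only for this check).
* Since 1/Py is convex in Py, the mean of 1/Py should exceed 1 over the mean of Py.
sum Py
local inv_of_mean = 1/r(mean)
gen inv_Py = 1/Py
sum inv_Py
di "1/mean(Py) = `inv_of_mean' while mean(1/Py) = " r(mean)
* With Py uniform on [1,5.5] these should be near 1/3.25 = .308 and ln(5.5)/4.5 = .379
drop inv_Py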

* An alternative approach which is rather appealing is taking the log of everything
* ln(y) = By1 ln(W) - By2 ln(Py) + ln(b/(a+b+c))
* ln(x) = Bx1 ln(W) - Bx2 ln(Px) + ln(a/(a+b+c))

gen lny=ln(y)
gen lnx=ln(x)
gen lnW=ln(W)
gen lnPy=ln(Py)
gen lnPx=ln(Px)

* This is really not so useful with the Cobb-Douglas.  However, it does give us a method of checking
* if we think our demand function is Cobb-Douglas:

reg lny lnW lnPy
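* We might attempt to use the constant to get a grasp of the consumer parameters:
* the constant estimates ln(b/(a+b+c)), so exp(_b[_cons]) estimates b/(a+b+c)
di exp(_b[_cons])
* compare with the true b/(a+b+c); note that b_abc is additionally divided by mean(Py)
di b[1]/(a[1]+b[1]+c[1])
sum b_abc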
predict lny_hat

gen y_ln_exp_hat = exp(lny_hat)
label var y_ln_exp_hat "OLS fitted values from log regression"

two (line y Py, sort) (line y_hat Py, sort color(blue) ) (line y_ln_exp_hat Py, sort color(red))

reg lnx lnW lnPx
local a_abc=_b[_cons]
predict lnx_hat
gen x_ln_exp_hat = exp(lnx_hat)

* If there were no bias in the estimates then we would expect By1 = By2 = Bx1 = Bx2 = 1
* But the ln function is concave, therefore our estimates will be downward biased by
* Jensen's inequality.  In this case all of the estimates seem to be biased by at least 50%

*******************************************************************************************
* So in practice what do people do?

* A common model is the AIDS (Almost Ideal Demand System) to estimate share of demand as
* a function of price and expenditure
* Let's see how it performs at estimating the expenditure function (assume we know z and Pz)

* First LAI
* generate share values
gen sy = Py*y/W
label var sy "Share of expenditures on y"
gen sx = Px*x/W
gen sz = Pz*z/W

sum s*

gen lnPz = ln(Pz)

* Then generate lnP, the log of the price index (P), as the share-weighted sum of log prices
gen lnP = sx*lnPx + sy*lnPy + sz*lnPz

* Now let's generate the Beta term
gen lnW_P = ln(W)-lnP

* So to estimate AIDS, which is share as a function of price
reg sx lnPx lnPy lnPz lnW_P
predict sx_hat

reg sy lnPx lnPy lnPz lnW_P
predict sy_hat
label var sy_hat "sy hat from AIDS"

reg sz lnPx lnPy lnPz lnW_P
predict sz_hat
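
* A quick adding-up check (a sketch; tot_share and tot_fit are names introduced here).
* The shares sum to one by construction, and because the three equations use the
* same regressors and a constant, the OLS fitted shares should sum to one as well
* (up to floating point rounding).
gen tot_share = sx + sy + sz
gen tot_fit = sx_hat + sy_hat + sz_hat
sum tot_share tot_fit
drop tot_share tot_fit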

two (line sy Py, sort) (line sy_hat Py, sort)

* It seems the AIDS model fits pretty well.