## Sunday, May 6, 2012

### Misspecified Cobb Douglas Demand Model

* Test our ability to estimate consumer preferences
* Stata simulation and estimation

* v=max{x,y}{x^a * y^b * z^c} s.t. (W=xPx + yPy + zPz)
* since ln is monotonic and increasing this is the same as:
* max{x,y}{a*ln(x)+b*ln(y)+c*ln(z)} s.t. (W=xPx + yPy + zPz)

* L=a*ln(x)+b*ln(y)+c*ln(y) M(W>=xPx + yPy + zPz)
* Lx:=   a/x + M*Px=0
* Ly:=   b/y + M*Py=0
* Lz:=   c/z + M*Pz=0
* LM:=  W-(xPx + yPy + zPz)=0

* a/xPx             = -M
* b/yPy             = -M
* c/zPz             = -M
* a/xPx             = b/yPy
* xPx/a             = yPy/b
* xPx               = ayPy/b
* zPz               = cyPy/b
* W-(ayPy/b + yPy + cyPy/b) = 0
* W-(a/b + 1 + c/b)yPy = 0
* W-(a/b + b/b + c/b)yPy = 0
* W = ((a+b+c)/b)yPy
* y = W*b/(Py*(a+b+c))
* x = W*a/(Px*(a+b+c))
* z = W*c/(Pz*(a+b+c))

* y                 = W/(Py(a/b + 1))
* because we know the solutions are symetic
* x                 = W/(Px(b/a + 1))

clear

* so Let's start the simulation
* Say we have a cross section with 1000 cities
set seed 101
set obs 1000

* first let us set the parameters a and b, these are not observable
gen a = .4
gen b = .5
gen c = .1

* each city faces a different price.  the .5 is so that no city
* gets a price too close to zero.
gen Px=runiform()+.75
gen Py=runiform()+.5
gen Pz=runiform()+.2

* each city has a different level of wealth
gen W=runiform()*5+1

* Lets generate the errors in X.  We want the errors to be correlated
* which can be gotten at by a little trick but I also wanted the errors
* to be inform.  Unfortunately adding together 2 uniform distributions
* will make another uniform distribution
gen rv1=runiform()
gen rv2=runiform()
gen rv3=runiform()

* The correlation between the variables is
gl rho12 = .5

gen v1=rv1
gen v2=rv1*\$rho12 + rv2*(1-\$rho12)^.5
gen v3=rv1*\$rho12 + rv3*(1-\$rho12)^.5

corr v1 v2
* Not quite perfect but a good approximation

* Now let's generate consumption level
gen y = W/Py *b/(a+b+c)   + v1*5.3
label var y "Food"
gen x = W/Px *a/(a+b+c)   + v2*5.3
label var x "Entertainment"
gen z = W/Pz *c/(a+b+c)   + v3*5.3
label var z "Unobserved Remitances"

gen What = y*Py + x*Px + z*Pz

reg W What
* check

sum x y

************************************************************
* simulation END
************************************************************
* Estimation begin
* now we want to recover the underlying a and b and c coefficients
* but there is one twist.  We cannot observe Z so we will assume
* seperability.

* So let's try this again:

nl (y = (W/Py)/({gamma_hat} + 1) + {alpha})
* save the cofficient
local gamma_hat = _b[/gamma_hat]

nl (x = (W/Px)/({gamma_inv_hat} + 1)  + {alpha})
local gamma_inv_hat = _b[/gamma_inv_hat]

local gamma_inv_hat_inv =1/`gamma_inv_hat'
di "Does the inverse of gamma_inv_hat_inv = gamma_hat? (ie`gamma_inv_hat_inv'=`gamma_hat'?)"

cap program drop gamma_inv_hat
program define gamma_inv_hat, rclass
nl (y = (W/Py)/({gamma_hat} + 1)+ {alpha})
return scalar gamma_hat = _b[/gamma_hat]

nl (x = (W/Px)/({gamma_inv_hat} + 1)+ {alpha})
return scalar gammma_inv_inv = 1/_b[/gamma_inv_hat]
end

gamma_inv_hat
return list

bs gamma_hat=r(gamma_hat) gammma_inv_inv=r(gammma_inv_inv), reps(50): gamma_inv_hat

local se_gamma_hat      = _se[gamma_hat]
local se_gammma_inv_inv = _se[gammma_inv_inv]

* if we assume that gamma_hat and gammma_inv_inv are now uncorrelated
* se(gamma_hat + gammma_inv_inv) = (se(gamma_hat)^2 + se(gammma_inv_inv)^2)^.5
* this is the most conservative assumption when you are trying to test if
* two estimators are different.  But because we are trying to test that they are the
* same this is the most generous assumption

local test_stat = (`gamma_inv_hat_inv'-`gamma_hat')/(`se_gamma_hat'^2 + `se_gammma_inv_inv'^2)^.5

di "so the total se() is se_gamma_hat + se_gammma_inv_inv =" ///
_newline "                `se_gamma_hat' + `se_gammma_inv_inv' =  `se_tot'" ///
_newline "and gamma_inv_hat - gamma_hat = " `=1/`gamma_inv_hat'' - `gamma_hat' ///
_newline "And we want to test the statistical magnitude of:" ///
_newline  "   (gamma_inv_hat_inv - gamma_hat)/(se_gamma_hat^2 + se_gammma_inv_inv^2)^.5" ///
_newline  "   =(`gamma_inv_hat_inv'-`gamma_hat')/(`se_gamma_hat'^2 + `se_gammma_inv_inv'^2)^.5" ///
_newline  "   = `test_stat'"

* I am not entirely sure how many degrees of freedom to use with the resulting test
* We are testing 1 restriction with, _N - 2 degrees of freedom.

local fstat=F(1,_N - 2,(`test_stat')^2)
di "If I am doing this right then the probability of seeing this results based on the f" ///
_newline "  distribution is equal to 1-F(1,_N - 2,(`test_stat')^2) =" 1-`fstat'

di "We reject the null very strongly.  Which we ought to."

* So, it is clear in the nighly structural modelling case that getting the structural model correct
* is important

* But even in this case experiment with what happens when c gets sufficiently small. In the case where
* c is sufficiently small, not including it does not bias the restults.

*******************************************************************************************
* Let's instead assume that there is a good z which we know about but cannot directly observe
* expenditure on, however we can infer expenditure.  Since in our model W=xPx + yPy + zPz.
* So remember

* 1) y = W/Py * b/(a+b+c)
* 2) x = W/Px * a/(a+b+c)
* 3) z = W/Pz * c/(a+b+c)
* 4) W = xPx + yPy + zPz

* let's try to estimate directly:
nl (y = W/Py * {b}/({a}+{b}+{c}))
nl (x = W/Px * {a}/({a}+{b}+{c}))
* That did not work!

* Can we use the information we already have to estimate c indirectly?

* zPz = W * c/(a+b+c)
* We know W and xPx and yPy:
* W - xPx - yPy = zPz = W * c/(a+b+c)
* (W - xPx - yPy) = c/(a+b+c)
* (W - xPx - yPy)(a+b+c) = c
* (W - xPx - yPy)(a+b) = c - (W - xPx - yPy)c
* (W - xPx - yPy)(a+b) = (1 - W + xPx + yPy)c
* c = (W - xPx - yPy)(a+b)/(1 - W + xPx + yPy)

* So let's substitute c into nl (y = W/Py * {b}/({a}+{b}+{c}))
nl (y = W/Py * {b}/({a}+{b}+(W - x*Px - y*Py)*({a}+{b})/(1 - W + x*Px + y*Py)))

* Unfortunately the NL command is just not capable of finding solutions to all problems
* due to its underlying maximization algorithms.