Sunday, April 29, 2012
Seemingly Unrelated Regression - with Correlated Errors
* Stata Simulation and Estimation
* This code simulates a systems OLS: Seemingly Unrelated Regression (SUR) with Panel data
* on three different dependent variables each with different explanatory variables.
* In this case we will image that we want to predict the domestic price of
* wheat, corn, soybeans from the world price of wheat, corn, soybeans in Moldova
* This simple analysis would not be as easy if you wanted to use a large country
* since the price of these products domestically could cause changes in world prices.
* Imagine a 3 equation SUR system with:
* y1 = x1B1 + u1
* y2 = x2B2 + u2
* y3 = x3B3 + u3
* The key in SUR is to assume that E(y1|x1,x2,x2)=E(y1|x1)=x1B1 or in other words that it
* does not create bias in estimating y1 if you were to leave out x2 and x3. In the example
* this argument explicitly states that you do not need to know the world price of corn to
* predict the domestic price of wheat so long as you are already using the world price of
* wheat in your alanysis.
* Clear the memory
clear
set seed 101
* Image that we have data for 20 years starting in 1985
set obs 20
gen year=1984+_n
* First let's image that over the years there are random price shocks that hit all three domestic
* products such as drought, heat waves, early frosts. These are captured by the error term v.
gen v=4*rnormal()
* Let's imagine also that the expected price of wheat is 4, corn 3, and soybeans 3.5
* First we generate the shocks to different prices that are not shared plus the shocks that
* are shared (v).
gen u_wheat=rnormal()+v
gen u_corn=rnormal()+v
gen u_soy=rnormal()+v
* Then we generate world prices for year 1985
gen wheat_world=abs(4 + 2*rnormal()) if year ==1985
gen corn_world =abs(3 + 2*rnormal()) if year ==1985
gen soy_world=abs(3.5 + 2*rnormal()) if year ==1985
* Let's assume that world prices are AR1. That is, last year's price is positively correlated
* with this year's price but not next year's price.
replace wheat_world=abs(4/2 + 2*rnormal() + wheat_world[_n-1]/2) if year !=1985
replace corn_world =abs(3/2 + 2*rnormal() + corn_world[_n-1]/2) if year !=1985
replace soy_world=abs(3.5/2 + 2*rnormal() + soy_world[_n-1]/2) if year !=1985
* World prices are also autoregressive to the first degree AR1. That is controlling for
* previous period prices perfectly predicts current prices.
* The abs function ensures that world prices never become negative.
* In addition, let's say there was a common trend that caused world prices to rise graduatelly
* near the center of our data before falling again.
* We want the year 1 and year 20 to have no trend. Let's let the trend be a second degree polynomial.
* so t[year]=alpha1*year + beta*year^2. we want t[1]=0, t[10]=5, t[20]=0
gen t = 0 if year == 2004 | year == 1985
replace t = 5 if year == 1995
gen year2=year^2
reg t year year2
predict trend
drop t
* Now lets add the trend back to world prices
replace wheat_world=wheat_world+trend
replace corn_world =corn_world+trend
replace soy_world=soy_world+trend
* Let's see what the world price look like
twoway (line wheat_world year) (line corn_world year) (line soy_world year)
* Now, let us specify how we want world prices to influence domestic prices.
* In an free global market world prices should be perfectly predictive of local prices.
* So let us say the coefficients on world prices in predicting domestic prices is 1.5.
* To put everything together:
gen wheat_moldova = .75*wheat_world + abs(u_wheat + 2)
gen corn_moldova = .75*corn_world + abs(u_corn + 2)
gen soy_moldova = .75*soy_world + abs(u_soy +2)
* Let's see what the world price look like
twoway (line wheat_world year) (line wheat_moldova year) ///
(line corn_world year) (line corn_moldova year) ///
(line soy_world year) (line soy_moldova year)
* We could do three different regressions:
reg wheat_moldova wheat_world
reg corn_moldova corn_world
reg soy_moldova soy_world
* Which gives us unbiased and consistent estimates. However, we could also do
* SUR which should allow for correlation between the errors (v). Thus
sureg (wheat_moldova wheat_world) (corn_moldova corn_world) (soy_moldova soy_world)
* Since we think (know) that the coefficients on the different world prices are the same
* we could specify a constraint to force them to be the same accross equations.
constraint 1 [wheat_moldova]wheat_world=[corn_moldova]corn_world
constraint 2 [wheat_moldova]wheat_world=[soy_moldova]soy_world
sureg (wheat_moldova wheat_world) (corn_moldova corn_world) (soy_moldova soy_world) , const(1 2 )
* To test the relative merits of SUR next to OLS we will use Monte Carlo simulation.
* Same code as previously but now a program needs be defined. With return statements.
cap program drop sur_test
program define sur_test, rclass
clear
* set seed 101 - must turn this off or else every repetition will be exactly the same
set obs 20
gen year=1984+_n
gen v=4*rnormal()
gen u_wheat=rnormal()+v
gen u_corn=rnormal()+v
gen u_soy=rnormal()+v
gen wheat_world=abs(4 + 2*rnormal()) if year ==1985
gen corn_world =abs(3 + 2*rnormal()) if year ==1985
gen soy_world=abs(3.5 + 2*rnormal()) if year ==1985
replace wheat_world=abs(4/2 + 2*rnormal() + wheat_world[_n-1]/2) if year !=1985
replace corn_world =abs(3/2 + 2*rnormal() + corn_world[_n-1]/2) if year !=1985
replace soy_world=abs(3.5/2 + 2*rnormal() + soy_world[_n-1]/2) if year !=1985
gen t = 0 if year == 2004 | year == 1985
replace t = 5 if year == 1995
gen year2=year^2
reg t year year2
predict trend
drop t
replace wheat_world=wheat_world+trend
replace corn_world =corn_world+trend
replace soy_world=soy_world+trend
gen wheat_moldova = .75*wheat_world + abs(u_wheat + 2)
gen corn_moldova = .75*corn_world + abs(u_corn + 2)
gen soy_moldova = .75*soy_world + abs(u_soy +2)
* Same code as before but without the comments up to this point
reg wheat_moldova wheat_world
* Return, returns a scalar
return scalar wheat_ols=_b[wheat_world]
reg corn_moldova corn_world
return scalar corn_ols=_b[corn_world]
reg soy_moldova soy_world
return scalar soy_ols=_b[soy_world]
sureg (wheat_moldova wheat_world) (corn_moldova corn_world) (soy_moldova soy_world)
* This retrieves the cofficients and sends them out as returns. There must be a better
* way of doing this.
matrix A= e(b)'
svmat A, names(sureg)
cap return scalar wheat_sureg=sureg[1]
cap return scalar corn_sureg=sureg[3]
cap return scalar soy_sureg=sureg[5]
* Since we think (know) that the coefficients on the different world prices are the same
* we could specify a constraint to force them to be the same accross equations.
constraint 1 [wheat_moldova]wheat_world=[corn_moldova]corn_world
constraint 2 [wheat_moldova]wheat_world=[soy_moldova]soy_world
sureg (wheat_moldova wheat_world) (corn_moldova corn_world) (soy_moldova soy_world) , const(1 2 )
matrix A2=e(b)'
svmat A2, names(sureg2)
cap return scalar wheat_sureg2=sureg2[1]
cap return scalar corn_sureg2=sureg2[3]
cap return scalar soy_sureg2=sureg2[5]
end
* End of the program
sur_test
simulate wheat_ols=r(wheat_ols) wheat_sureg=r(wheat_sureg) wheat_sureg2=r(wheat_sureg2) ///
corn_ols=r(corn_ols) corn_sureg=r(corn_sureg) corn_sureg2=r(corn_sureg2) ///
soy_ols=r(soy_ols) soy_sureg=r(soy_sureg) soy_sureg2=r(soy_sureg2), rep(100): sur_test
sum
* From the simulation we can see that SUR outperforms ols and SUR2 outperforms sureg.
* This is visible generally by how much closer the average estimate of the cofficients are
* on the true value which is .75 In addition, you can see that SUR and SUR2 are more
* efficient because they have smaller standard errors.
Subscribe to:
Post Comments (Atom)
Hi,
ReplyDeleteI don't think this code works with panel data. I think it is for cross section data. Can you verify that it works with panel data?
I believe the example is using time series data.
Delete