Tuesday, August 5, 2014
Stata: Generate a Spatial Moving Average
Often we are interested in generating a spatial moving average of a characteristic X. We may use this moving average to help control for heterogeneity in the population that is related to the spatial distribution of observations. To do that, we need a method of generating a spatial mean.
I code this manually because I do not have experience with spatial data in Stata and do not know what the built-in command is (assuming there is one). If you are just looking for the spatial mean, you may prefer the built-in command. However, this method is flexible and easy to modify if, for instance, you would like to go beyond the 2D Euclidean distance formula and use the 3D formula, or really the n-dimensional one. Likewise, the moving average might easily be replaced by a moving variance or any other statistic that can be generated via the egen command. Thus this exercise might be useful to examine even if redundant.
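For reference, here is one way to write down the statistic computed in this post. The notation is my own shorthand and an assumption, not something Stata defines: r is the search radius (the global meanrange in the code), d_ij is the 2D Euclidean distance between observations i and j, and N_i is the set of observations within that radius of i, which includes i itself.

```latex
d_{ij} = \sqrt{(\mathrm{latt}_i - \mathrm{latt}_j)^2 + (\mathrm{longg}_i - \mathrm{longg}_j)^2},
\qquad
N_i = \{\, j : d_{ij} < r \,\},
\qquad
\bar{X}_i = \frac{1}{|N_i|} \sum_{j \in N_i} X_j
```

Swapping in a different distance measure changes only the definition of d_ij, and swapping in a different statistic changes only the last expression.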
global Nobs = 1000
clear
set obs $Nobs
* Generate 2D coordinates
gen latt = runiform()*100
gen longg = runiform()*100
* Generate the variable of interest. The variable will
* have a random component and a spatially dependent
* component.
gen X = (latt+longg)/100+rnormal()
two (scatter latt X) (scatter longg X)
* Although X tends to increase with latitude and longitude, the noise makes it hard to identify any strong pattern.
* Now let's calculate the moving average of X for each
* observation. (There is probably a command for this
* which I do not know).
global meanrange=30
gen Xave = .
gen dist = .
forv i=1/$Nobs {
    * Calculate the distance of all points from obs i
    replace dist = ((latt-latt[`i'])^2+(longg-longg[`i'])^2)^.5
    * Calculate the mean of X if distance is within the range of interest
    egen tempx = mean(X) if dist<$meanrange
    replace Xave = tempx if _n==`i'
    drop tempx
}
drop dist
two (scatter latt Xave) (scatter longg Xave)
* Now, looking at the moving average, we can easily see the effect of location on the expected value of X.
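As a rough illustration of the flexibility mentioned at the top of the post, here is a minimal sketch that swaps in the 3D distance formula and a moving standard deviation. It is a variation under stated assumptions, not a tested recipe: the third coordinate alt, the seed, the smaller sample size, and the names Xsd, dist3, tempsd, and radius are all hypothetical and chosen only for this example.

```stata
* Sketch: moving standard deviation of X within a 3D radius.
* alt, Xsd, dist3, tempsd, and radius are hypothetical names for illustration.
clear
set seed 12345
set obs 500

* Generate 3D coordinates (alt is an assumed third dimension)
gen latt  = runiform()*100
gen longg = runiform()*100
gen alt   = runiform()*100

* Same data generating process as above; alt has no effect on X
gen X = (latt+longg)/100 + rnormal()

local radius = 30
gen Xsd   = .
gen dist3 = .
forvalues i = 1/`=_N' {
    * 3D Euclidean distance of every observation from observation i
    quietly replace dist3 = sqrt((latt-latt[`i'])^2 + (longg-longg[`i'])^2 + (alt-alt[`i'])^2)
    * sd() takes the place of mean(); other egen functions slot in the same way
    quietly egen tempsd = sd(X) if dist3 < `radius'
    quietly replace Xsd = tempsd in `i'
    drop tempsd
}
drop dist3
summarize Xsd
```

An observation with no neighbors inside the radius gets a missing Xsd, since a standard deviation needs at least two points. If only one statistic is needed, running quietly summarize X if dist3 < `radius' and reading r(sd) should give the same value without creating and dropping a temporary variable.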