Monday, June 18, 2012

Demographic- Mortality Rates Simulation


* The following simulation generates mortality rates and typical demographic type data.

* NOTE:: The values in this simulation does not flect true demographic information but only meant to be sketch of how someone might simulate true demographic information.
set seed 1010

* Empty the memory
clear

* Set the number of observations to generate
set obs 2000

* Specifies gender
gen female=rbinomial(1,.5)

* Specifies if person smoked
gen smoked=rbinomial(1,.25)

* Specifies if person excercised
gen exercise = runiform()*.5

* Specifies personal heterogeneity
gen genetics = abs(rnormal(1,.5))+.5

* Now let us generate initial rates:

* All of our people start at age 1
gen age=1
* None of them are initially deceased
gen deceased=0

gen mortality_age=.
  label var mortality_age "Age at death"

gen id=_n
  label var id "Personal ID"

gen cancer_risk = 0.005

gen death_prob = 0

gen age_risk = .

* Generates birth year
gen brth_year = round(runiform()*150)+1850

* Specifies the year of observation in the simulation
gen year = brth_year + age - 1

* Specifies the starting age
local age=0

* Is an indicator that when positive tells the while loop to end.
gl end=0

* Begin the mortality loop
qui while $end==0 {

  * Age is the age that the person will be in at the end of the current loop
  local age=`age'+1

  * Generates the current year
  replace year = brth_year + age  - 1

  * Generate the probability of death by cancer if the person smoked
  replace cancer_risk = `age'^.5*.001 if smoked == 1 & age > 18 &  age == `age'

  * Generate a general aging mortality factor
  replace age_risk = `age'^1.9-`age'^1.89

  * Create the combined death probability
  replace death_prob = .001*genetics*age_risk + cancer_risk - exercise*.1 if deceased == 0

  replace death_prob = abs(rnormal())/(200*age^2) if age < 10 & deceased == 0 &  age == `age'
  * High mortality rate in first year

  * Men tend to have higher mortality rates at all ages
  replace death_prob = death_prob+.01 if female==0 &  age == `age'

  * The death rate for males increases during World War I and WWII
  replace death_prob = death_prob + .03 if age >17 & female==0 & deceased == 0 & year > 1913 & year < 1919 &  age == `age'
  replace death_prob = death_prob + .05 if age >17 & female==0 & deceased == 0 & year > 1939 & year < 1946 &  age == `age'

  * Make sure there is a minimum death rate of 1 out of 1000
  * replace death_prob = max(death_prob,.001) if  deceased == 0 & age == `age'

  * Save the true average probability of dieing for that age group
  sum death_prob if deceased == 0 &  age == `age'
  local death_prob = r(mean)

  * Generate a variable to indicate if this person died at this age
  gen yr_o_death = rbinomial(1, death_prob) if deceased == 0 & age == `age'

  * Replace the person's life living indicator if that person is deceased
  replace deceased = 1 if yr_o_death == 1 & age == `age'
  replace mortality_age = `age' if yr_o_death == 1 & age == `age'

  expand 2 if deceased == 0 & age==`age', gen(new_yr)

  replace age=age+1 if new_yr==1 & age==`age'

  sum deceased if age==`age'
  drop new_yr yr_o_death

  if r(N)==0 gl end=1

  noi di "Age: `age' mortality rate " r(mean) ": survivors : " r(N) " : true mean death prob =  `death_prob'"

}

  * Dummy variables to indicate if the person was alive
  gen WW1=0
  gen WW2=0
  replace  WW1 = 1 if female==0 & year > 1913 & year < 1919 & age >17
  replace  WW2 = 1 if female==0 & year > 1939 & year < 1946 & age >17

  * Create dummies for if the person was male, lived through the years
  bysort id: egen WW1dum=max(WW1)
  bysort id: egen WW2dum=max(WW2)

* This will graph a histogram of mortality by age
qui sum mortality_age
local bin_max = r(max)
hist mortality_age, kden bin(`bin_max') percent
sum mortality_age

* We can see that the simulation is producing the right signs for the independent variables.
reg mortality_age female exercise smoked WW?

* The following will look at the distribution of ages at a particular time (1945)
local yr=1945
qui sum age if year == `yr'
local bin_max = r(max)
di "hist age if year == `yr', kden bin(`bin_max') percent"
hist age if year == `yr', kden bin(`bin_max') percent title(Population Distribution in `yr')
sum mortality_age if year == `yr'

*
qui sum mortality_age
local bin_max = r(max)
hist mortality_age, kden bin(`bin_max') percent by(age)
sum mortality_age

No comments:

Post a Comment