Friday, June 1, 2012

Spatial malaria epidemiology infection model - simulation

* Spatial Malaria epidemiology infection model - simulation

* This code creates a simulation similar to that of:
* International Journal for Parasitology's article

* "Spatial simulation of malaria transmission and its control by malaria transmission blocking vaccination"

* by Richard Carter

* I am not sure exactly how well the simulation follow's Carter's simulation because I do not understand how the disease vector works exactly.

* I think his simulation does not have individual mosquitos as mine does but rather relies on "disease vectors".

* Something I do not fully understand.

* Much of the techniques used in this simulation are employed in the previous simulation:

* Part 1: Set the simulation parameters

* The following locals set the parameters of the simulation.
* Number of people to be generated
local num_people = 1000

* The gross number of mosquitos to be generated (this could be thought of as a group of mosquitos)
local num_mos = 10000

* The actual amount of mosquitos generated is a fraction of this number relative to how many mosquitos end up in the mosquito ecology.
* If the radius isbbbbb no greater than .5 the then expected number of mosquitos generated will be (pi*r^2)*num_mos = 78.5%*num_mos|r=.5
* The radius of the ecology
local r=.5

* How fast do the mosquitos move change_x = x+mos_speed*rnormal()
local mos_speed=.01

* Specify a breeding rate for mosquitos outside of the ecological zone.
* Close to zero 0 means that almost no mosquitos can breed outside of the ecological zone.
* Warning
* A 1 means there is no difference between the ecological zone and the non-ecological.
local mos_breed=0.1

* Specify the life expectancy for mosquitos in terms of rounds.
* At round `mos_life' all of the mosquitos die and round `mos_life'+1 new mosquitos are born which are now all free of infections.
local mos_life=25

* Initial infection rate within the human population
local infect_rate=.05

* How detailed is the grid (0,1) in which the humans and mosquitos inhabit.
* A highly detailed grid will make encounters between humans and mosquitos less likely.
local grid=.01

* How many rounds of the simulation are going to be modelled
local num_rounds=500

* This tells the simulation to keep information for the last keep_round number of rounds.
* This number must be larger than the recover_rate and the immune_rate
local keep_round=21

* Display initial
local graphs = 1
* Part 2: Generate the population of people

* First lets clear the memory

* Next lets set the number of observations
set obs `num_people'

* We will first generate our people
gen pid = _n
  label var pid "Person ID"

* Generate the locations of the people.
* We can think of this as the locations that the people sleep in.
gen x = round(runiform()-.5,`grid')
  label var x "Human x position"
gen y = round(runiform()-.5,`grid')
  label var y "Human y position"

* For simplicity two people cannot occupy the same location
duplicates drop x y , force

* Imagine a mesquito ecology which is a circle of r radius with a center at zero  x^2 + y^2 <.3:
*1.  x<(r-y^2)^.5
*2.  x>-(r-y^2)^.5
*3.  y<(r-x^2)^.5
*4.  y>-(r-x^2)^.5

* This indicates if the person lives within the mosquito ecology or not.
tempvar inC
gen `inC'=0
replace `inC' = 1 if x<(`r'^2-y^2)^.5 & x>-(`r'^2-y^2)^.5 & y<(`r'^2-x^2)^.5 & y>-(`r'^2-x^2)^.5
if `graphs'==1 two (scatter y x if !`inC', color(red)) ///
 (scatter y x if `inC', color(blue)) , title(Malaria Ecology)

gen infected = rbinomial(1,`infect_rate')
  label var infected "Person infected with Malaria"

if `graphs'==1 two (scatter y x)  ///
  (scatter y x if infected, color(green)), ///
  title(Infected Population)

save "people.dta", replace
* End People part of simulation

* Let's trim down the data set to be only those relevant for the merge
  keep x y infected
  save "people_infected.dta", replace

* Part 3: Generate the population of mosquitos

* Now let us imagine a population of `num_mos' mosquitos that could have been hypathetically born anywhere but if they are not born in the mosquito econology they do not exist.

set obs `num_mos'

gen x = round(runiform()-.5,`grid')
  label var x "Mosquito Position x"
gen y = round(runiform()-.5,`grid')
  label var y "Mosquito Position y"

* Generate an indicator if a mosquito can be bred outside of the ecological zone
  tempvar mos_exist
  gen `mos_exist' = rbinomial(1,`mos_breed')

* Lets first remove the mosquitos which are not born in the ecology and got a bad draw of mos_breed:
  drop if !(x<(`r'^2-y^2)^.5 & x>-(`r'^2-y^2)^.5 & y<(`r'^2-x^2)^.5 & y>-(`r'^2-x^2)^.5)  & `mos_exist'==0

  gen infect=0

save "mos.dta", replace

* Part 4: Do round 1 of infecting the mosquitos with malaria.

* When thinking about Malaria we should be thinking not so much about Mosquitos being the problem but Mosquitos infected with Malaria being the problem.

* The assumption of this simulation will be that any mosquito that lands in the same area (defined by the grid) as a person with Malaria will become infected.

* Likewise, after that, any person bitten by an infected Mosquito will become infected.

* Still using the mosquito data.

* Let us first merge in the people data:

merge m:1 x y using "people_infected.dta"
  replace infected = 0 if infected == .

* We only want to keep data from master (mosquitos) or that matches with mosquitos but not person data:
  drop if _merge==2
  drop _merge

* Now we can see first how many mosquitos are infected.
if `graphs'==1 twoway (scatter y x if !infected)   ///
  (scatter y x if infected, color(red))    ///
  , legend(label(1 "Infected") ///
  label (2 "Not Infected"))    ///
  title(Mosquitos Initially Infected with Malaria)

gl round=1
* Generate an indicator of what round we are in.

* Initially there is not a lot of mosquitos infected because they have just been born and they only become infected by encountering infected humans.

gen x_mos_r$round = x
  label var x_mos_r$round "Mosquito round $round x position"
gen y_mos_r$round = y
  label var y_mos_r$round "Mosquito round $round y position"

* Next mosquitos move randomly from there current position
  replace x = round(x + rnormal()*`mos_speed',`grid')
  replace y = round(y + rnormal()*`mos_speed',`grid')

rename infected mos_infected_r$round
  label var mos_infected_r$round "Mosquito infected round $round"

* Now lets first see if any more mosquitos are infected by encountering people
merge m:1 x y using "people_infected.dta"
  drop if _merge==2
  drop _merge

replace infected = 1 if mos_infected_r$round == 1
  replace infected = 0 if infected == .

* Create a temporary variable to indicate if the mosquitos are within the specified graphing area.
tempvar graph_area
gen `graph_area' = 1 if x>=-.5 & x<=.5 & y>=-.5 & y<=.5

if `graphs'==1  twoway (scatter y x if !infected & `graph_area' == 1 )   ///
  (scatter y x if infected & `graph_area' == 1 , color(red))    ///
  (scatter y x if mos_infected_r1 & `graph_area' == 1 , mcolor(cranberry))    ///
  , legend(label(1 "Not Infected") ///
  label (2 "Infected round 1")          ///
  label (3 "Infected round 2"))    ///
  title(Mosquitos Infected with Malaria)

sum *infected*

* Save the infection rates of the population
  qui sum mos_infected_r1
  gen mos_infection_rate = r(mean) if _n==1

save "mos.dta", replace

* Let's trim down the data set to be only those relevant for later merges
  keep x y infected
  save "mos_infected.dta", replace

* Part 5: Let's infect the first group of people with malaria from mosquitos.

* Now we see if any more people are infected by encountering mosquitos
use "people.dta", clear

* Save data on the infected humans from the current round
rename infected human_infected_r$round
  label var human_infected_r$round "Human infected with Malaria Round $round"

* Merge in the mosquito data
merge 1:m x y using "mos_infected.dta"
  drop if _merge==2
  drop _merge

* This should eliminate any duplicate humans created by the 1:m merge
* We can think of this as some humans get bit by 2 mosquitos, if one of them is infected then the human is now infected, thus max.
collapse (max) x y infected human_infected_*, by(pid)

replace infected = 1 if human_infected_r$round == 1
  replace infected = 0 if infected == .

if `graphs'==1 twoway (scatter y x if !infected)   ///
  (scatter y x if infected, color(pink))    ///
  (scatter y x if human_infected_r1, mcolor(cranberry))    ///
  , legend(label(1 "Not Infected") ///
  label (2 "Infected round 2")          ///
  label (3 "Infected round 1"))    ///
* Not too many additional people get infected with Malaria as a result of the first round

* Save the infection rates of the population
  qui sum human_infected_r1
  gen obs_infection_rate = r(mean) if _n==1

save "people.dta", replace

* Trim down the data set to be only those relevant for the next merge
  keep x y infected
  save "people_infected.dta", replace

* This is just the setup to the full simulatiom.
* If you are interested in getting the full simulation contact me with a description of the project you are currently working on and I will check with my team.