Wednesday, June 13, 2012

Convert factor variables -> Dummy lists


* Convert factor variables -> Dummy lists

* First let's load up a data set with factor variables
sysuse auto, clear

tab headroom
* We can see that headroom behaves like a factor (categorical) variable: it takes only a handful of distinct values (levels), even though it is stored as a plain numeric variable

* Let us first convert it to a string
tostring headroom, replace

* Now encode it as a labeled numeric (factor) variable
encode headroom, gen(head)

drop headroom

****** Now we have this 'ideal' data set with a factor variable (head). For each of its levels we would like to create a dummy variable, plus a price variable that holds price only for that level.

decode head, gen(headroom)
  * Headroom is now a string variable

* Create two empty globals
global year_list
global year_price_list

* Placeholder variable marking where the block of generated variables starts
gen YEAR_LIST_START=.

* The next two commands set up a loop over all of the levels of the headroom variable.
     qui levelsof headroom, local(levels)
* This saves the levels of headroom in a local macro called levels

     foreach l of local levels {
 
* Turns the level into a valid variable name
local new_var_name = strtoname("`l'")
* This turns invalid characters such as spaces and periods into underscores (and prefixes an underscore if the name would start with a digit).

* Creates a dummy variable equal to 1 if headroom equals this level (missing otherwise)
        cap gen Yr`new_var_name' = 1 if headroom=="`l'"
label var Yr`new_var_name' "headroom: `l'"

* Adds a new entry to the list for the dummy variable created
global year_list $year_list Yr`new_var_name'
   
   * Creates a price variable that holds price only for this level
   qui gen `=strtoname("price`new_var_name'")' = price if headroom=="`l'"
label var `=strtoname("price`new_var_name'")' "Price: `l'"

global year_price_list $year_price_list `=strtoname("price`new_var_name'")'

}

* This marks an end to the lists of dummies and prices created
gen YEAR_LIST_END=.

* We can see lists of the variables created with:

di "$year_list"

di "$year_price_list"
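
* A quick sanity check (a sketch, not part of the original code): at this
* point each observation should have exactly one non-missing dummy.
* n_dummies is a hypothetical helper name; it is dropped again afterwards.
egen n_dummies = rownonmiss($year_list)
assert n_dummies == 1
drop n_dummies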

* If you don't like that observations outside a level are missing, you could recode them as 0s

recode YEAR_LIST_START-YEAR_LIST_END (.=0)

* But be careful about this: the range also covers the price variables, so their summary statistics now include the zeros

sum price_*
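
* To see the distortion for a single level (a sketch; it assumes the level "3"
* produced the variable price_3 via strtoname()), compare the two means:
sum price if headroom=="3"
sum price_3
* The first uses only cars with headroom 3; the second also averages in a 0
* for every other car.

* If plain 0/1 dummies are all you need, tabulate with the generate() option
* may be a shorter route. hd_ is a hypothetical stub name chosen here.
qui tab head, gen(hd_)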
