## Saturday, July 21, 2012

### A note on Temporary Variables in Stata

* It is easy to create temporary variables in Stata that are automatically cleaned from memory as soon as the current do file is completed or program is done.

clear
set obs 10000

* For example
tempvar temp1 temp2 temp3

gen `temp1' = rnormal()
gen `temp2' = rnormal()^2
gen `temp3' = runiform()

sum
* Conveniently, these variables are cleaned from memory once they are no longer useful.

* However, if we were to try to override a temporary variable with one of the same name then it will not work.

forv i = 1/10 {
tempvar temp
gen `temp' = rnormal()
}

sum

* However, it is impossible to target previous temporary variables that were created with the typical `temp' identifier, since `temp' now points to the most recently created variable.

sum __000003

* However, referring to an earlier variable by its internal name works.

* Which is really not very useful but could potentially be.
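
* A minimal sketch of one way around this: a tempvar macro only stores a generated name, so the name can be copied into another local before the macro is reassigned. The macro names a, b and firstname are made up for illustration, and the internal names Stata generates may differ from session to session.

```stata
clear
set obs 10

tempvar a b
* Each macro holds a generated variable name
display "`a' `b'"

gen `a' = rnormal()

* Keep a copy of the first name before the macro is reused
local firstname `a'

* Calling tempvar again points the macro at a new name
tempvar a
gen `a' = runiform()

* Both variables remain reachable
sum `firstname' `a'
```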

* This may not appear to be a problem, but when you start looping through commands you can end up with a large number of temporary variables clogging active memory, and things start to slow down.

clear
set obs 10000

* For example:
forv i = 0/10000 {
tempvar temp
gen `temp' = rnormal()
if mod(`i',1000)==0 di "$S_TIME"
}

* For me there is about 2 seconds for every 1,000 variables created

* Easy fix:

clear
set obs 10000

forv i = 1/10000 {
tempvar temp
gen `temp' = rnormal()
* Add drop
drop `temp'
if mod(`i',1000)==0 di "$S_TIME"
}
* Dropping each variable as we go reduces the run time for all 10,000 iterations (for me) to only 2 seconds.

* This is all fine, but the only problem is that if we are going through all of the work of drawing temporary variables just to drop them later, then why not just draw normal variables?

* There is only one reason I can see. We do not want to risk our drawn variable names overlapping with our existing variable names.
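
* A small sketch of that risk, assuming the auto dataset that ships with Stata (it already contains a variable called price). The plain gen fails with a nonzero return code, while the temporary name does not clash. The local rc is only there for illustration.

```stata
sysuse auto, clear

* A plain name can clash with a variable that is already in the data
capture gen price = rnormal()
local rc = _rc
display "return code from gen price: `rc'"

* A temporary name is generated to be unused, so there is no clash
tempvar price
gen `price' = rnormal()
sum price `price'
```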

* However, if this is not a problem then the following commands are clearly that much cleaner:

clear
set obs 10000

forv i = 1/10000 {
gen temp = rnormal()
drop temp
if mod(`i',1000)==0 di "$S_TIME"
}

* In general I do not use temporary variables.

1. Overall it is not clear what we should learn from your post. Everything is pretty much described in the manual for tempvar. However, here are some comments:

Quote: "However, if we were to try to override a temporary variable with one of the same name then it will not work."

That is applicable to all variables, not just temporary.
sysuse auto
generate price=7654
will fail since price already exists.

No problem, there is a command for that: replace

In your loop you can reuse the temporary variables as needed. It is a very bad idea to make the number of temp variables proportional to the number of loop iterations.
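
A minimal sketch of that reuse, assuming the same 10,000-observation setup as in the post: the temporary variable is declared once outside the loop and overwritten with replace on every pass, so only one variable ever exists.

```stata
clear
set obs 10000

* Declare and create the temporary variable once
tempvar temp
gen `temp' = .

forvalues i = 1/10000 {
    * Overwrite the same variable instead of creating a new one
    quietly replace `temp' = rnormal()
    if mod(`i', 1000) == 0 display "$S_TIME"
}
```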

Quote: "However, it is impossible to target previous temporary variables that were created..."
Just append the tempvar name to a list

sysuse auto, clear
forval i=1/10 {
tempvar temp
local mytemps `"`mytemps' `temp'"'
generate `temp'=`i'
sum `mytemps'
}

Stata will drop the temp vars when the scope ends. You can however drop them earlier to save memory if needed.
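
As a rough illustration of the scope rule, here is a sketch using a hypothetical program named demo_scope. The variable count shown inside the program should include the temporary variable, and the count after the program returns should not.

```stata
* demo_scope is a made-up name used only for this illustration
capture program drop demo_scope
program define demo_scope
    tempvar t
    gen `t' = runiform()
    quietly describe
    display "variables inside the program: " r(k)
end

sysuse auto, clear
demo_scope

* The temporary variable is gone once the program has ended
quietly describe
display "variables after the program ends: " r(k)
```

To free the memory before the scope ends, an explicit drop of the temporary variable inside the program does the same job earlier.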

S.R.

1. I like what you are saying. I see the logic to tempvars.

My personal opinion is that they introduce a level of opaqueness that I would rather do without. In general I find the notation unnecessarily complex so I shy away from them. But I can definitely see why some people use them.