## Saturday, September 8, 2012

### Mata speed gains over Stata

* The inclusion of Mata as an available alternative programming language for Stata users was a great move by Stata.

* Mata in general runs much quicker than programming on the surface level in Stata.

* In Stata, each pass through a loop is interpreted again as it runs, which creates a lot of work for the machine.

* In Mata on the other hand, the entire loop is compiled prior to running.

* Let's see how this works.

* Let's say we want to add up the squares of the numbers 1 through 1000000.

* Method 1: Surface loop

timer clear 1
timer on 1
local x2 = 0

forv i = 1/1000000 {
    local x2 = `x2'+`i'^2
}

di `x2'

timer off 1
timer list 1

* On my laptop, this takes about 13.5 seconds
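
* A variant of Method 1, sketched under the assumption that keeping the running total in a scalar avoids rewriting the local macro as text on every pass. It is not timed here, so treat any speed difference as something to check on your own machine:

```stata
timer clear 1
timer on 1
scalar x2 = 0

forvalues i = 1/1000000 {
    // scalar() makes sure the scalar x2 is used even if a variable named x2 exists
    scalar x2 = scalar(x2) + `i'^2
}

di %21.0f scalar(x2)

timer off 1
timer list 1
```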

* Method 2: Mata loop
timer clear 1
timer on 1
mata
x2=0
// This command can be read as: start i at 1,
// keep looping so long as i is less than or equal to 1000000,
// and add 1 to i after each pass.
// The third argument looks a little fishy, but it is syntax
// that has been around for a while (at least since C).
// It is identical to writing i=i+1.
// Following the for statement we can immediately place a single-line command.
for (i = 1; i <= 1000000; i++) x2=x2+i^2
// Typing x2 on its own line, without assigning it to anything, makes Mata display its value.
// R handles this identically.
x2
end

timer off 1
timer list 1

* In contrast, my computer completed the loop using Mata in .27 seconds, roughly 50 times faster.
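
* The same loop can also be wrapped in a Mata function, which Mata compiles once when the function is defined. This is only a minimal sketch: the function name sumsq() is made up for illustration and is not part of the original example.

```stata
// Drop any earlier definition so the block can be rerun (sumsq is a hypothetical name)
capture mata: mata drop sumsq()

mata:
real scalar sumsq(real scalar n)
{
    real scalar i, total

    total = 0
    for (i = 1; i <= n; i++) total = total + i^2
    return(total)
}

// Same calculation as Method 2
sumsq(1000000)
end
```

* Declaring the argument and the local variables as real scalar is optional in Mata, but it documents what the function expects.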

* However, this does not mean you need to learn Mata (which has its own syntax and limitations) in order to speed up your commands.

* Method 3: Use Stata's data structure to accomplish vector tasks
timer clear 1
timer on 1

clear
set obs 1000000
* Store as double: the default float type cannot hold squares this large exactly.
gen double x2 = _n^2

* The summarize command (abbreviated sum) calculates the mean of x2, which is the sum of x2 divided by its number of observations.
sum x2
* We can reverse that operation easily.
di r(N)*r(mean)

timer off 1
timer list 1
* Using a little knowledge of how Stata stores results after a command (the r() values), this method does the same trick in .2 seconds.
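
* summarize also leaves the total itself in r(sum), so the multiplication can be skipped. A minimal sketch of Method 3 along those lines, using the meanonly option to avoid computing the variance; it is not timed here:

```stata
clear
set obs 1000000
// double keeps the large squares from being rounded to float precision
gen double x2 = _n^2

// meanonly still stores r(N), r(mean), and r(sum)
summarize x2, meanonly
di %21.0f r(sum)
```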

* Method 4: The speed gain in Method 3 came from using the vector structure of data columns.  Mata can do very similar things even more easily.

timer clear 1
timer on 1
// This command looks a little fishy, but it is easy to understand.
// Order of operations must be taken into account.
// First the 10^6 is evaluated which equals 1000000
// Then the vector 1..10^6 is made which looks like 1 2 3 ... 1000000
// The .. tells mata to make a count vector.
// Once the vector is made, the operator :^2 tells Mata to square each element of the vector.
// Finally the sum() function adds all of the elements of the vector together to generate the result we were looking for.
mata: sum((1..10^6):^2)
timer off 1
timer list 1
* Thanks to efficient coding in Mata, this command took only .04 seconds to run.
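
* The answer can be checked against the closed form n(n+1)(2n+1)/6 for the sum of the first n squares, which for n = 1000000 is exactly 333333833333500000. The sketch below prints both values without decimals. Because that number is larger than 2^53, double-precision arithmetic cannot represent every integer of that size, so the last few printed digits may differ from the exact value.

```stata
mata
n = 10^6
// Closed form for 1^2 + 2^2 + ... + n^2
check = n*(n+1)*(2*n+1)/6
// Print both results with no decimals so they can be compared digit by digit
printf("%21.0f\n", check)
printf("%21.0f\n", sum((1..n):^2))
end
```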

# As a matter of comparison, this command
system.time(sum((1:10^6)^2))
# took .04 seconds in R

# And the loop:
x=0
system.time(for(i in 1:10^6) x=x+i^2)
# 1.3 seconds

# Thus Mata in this example is significantly faster than Stata and about the same speed as R.

#### 1 comment :

1. Testing it on my machine, using a scalar instead of a local in method 1 seems faster, though still significantly slower than the other methods.

In method 3, -summarize- with option -meanonly- should be slightly faster. But I think you can skip -summarize-, which either way is going to do more calculations than you need:

clear
set obs 1000000
gen x2 = sum(_n^2)
di x2[_N]