* The inclusion of Mata as an available alternative programming language for Stata users was a great move by Stata.

* Mata in general runs much quicker than programming on the surface level in Stata.

* In Stata, each line of a loop is interpreted again every time it runs, which creates a lot of work for the machine.

* In Mata, on the other hand, the entire loop is compiled before it runs.

* Let's see how this works.

* Let's say we want to add up the squares of the numbers 1 through 1000000.
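
* As a rough sanity check on any method, the closed form n(n+1)(2n+1)/6 gives the same sum without a loop. This is a sketch added for illustration, not part of the timings; the local n is a hypothetical name.

```stata
* Hypothetical check: closed form for 1^2 + 2^2 + ... + n^2
local n = 1000000

* %21.0f prints the full integer instead of scientific notation
di %21.0f `n'*(`n'+1)*(2*`n'+1)/6
```

* For n = 1000000 the exact value is 333,333,833,333,500,000. Stata computes in double precision (about 16 significant digits), so the last few displayed digits may differ.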

* Method 1: Surface loop

timer clear 1

timer on 1

local x2 = 0

forv i = 1/1000000 {

local x2 = `x2'+`i'^2

}

di `x2'

timer off 1

timer list 1

* On my laptop, this takes about 13.5 seconds

* Method 2: Mata loop

timer clear 1

timer on 1

mata

x2=0

// This command can be read as: start i at 1,

// keep looping so long as i is less than or equal to 1000000,

// and add 1 to i after each pass. The third argument, i++, looks a little fishy, but it is syntax

// that has been around for a while (at least since C).

// It is identical to writing i=i+1.

// Following the for loop we can immediately place a single-line command.

for (i = 1; i <= 1000000; i++) x2=x2+i^2

// If an expression such as x2 is not assigned to anything, Mata displays its value.

// R handles this identically

x2

end

timer off 1

timer list 1

* In contrast, my computer completed the loop using Mata in .27 seconds, about 50 times faster.

* However, this does not mean you need to learn Mata (which has its own limitations and syntax) in order to speed up your commands.

* Method 3: Use Stata's data structure to accomplish vector tasks

timer clear 1

timer on 1

clear

set obs 1000000

gen x2 = _n^2

* The sum command (short for summarize) will calculate the mean of x2, which is the same as the sum of x2 divided by its number of observations.

sum x2

* We can reverse that operation easily.

di r(N)*r(mean)

timer off 1

timer list 1

* Using a little knowledge of how Stata stores post command information this method does the same trick in .2 seconds
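
* A possible shortcut, sketched here as an illustration: summarize also stores the total itself in r(sum), so the multiplication can be skipped. Storing x2 as a double is my own assumption, made because the default float type keeps only about 7 significant digits.

```stata
clear
set obs 1000000

* double storage avoids rounding the larger squares (an assumption, not in the original)
gen double x2 = _n^2

* quietly suppresses the summary table; the stored results are still available
quietly summarize x2

* r(sum) is the total of x2 over all observations
di %21.0f r(sum)
```

* The timing may differ slightly from the figure above because of the wider storage type.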

* Method 4: The speed gains in method 3 were a result of using the vector structure of data columns. Mata can do very similar things even more easily.

timer clear 1

timer on 1

// This command looks a little fishy, but it is easy to understand.

// Order of operations must be taken into account.

// First the 10^6 is evaluated which equals 1000000

// Then the vector 1..10^6 is made which looks like 1 2 3 ... 1000000

// The .. tells Mata to make a row vector of consecutive integers.

// If I had written :: then Mata would have made a column vector instead.

// Once the vector is made, the operator :^2 tells Mata to square each element of the vector.

// Finally the sum() function adds all of the elements of the vector together to generate the result we were looking for.

mata: sum((1..10^6):^2)

timer off 1

timer list 1

* With this more efficient Mata code, the command took only .04 seconds to run.
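
* To see what each piece of that one-liner does, it may help to run it on a small vector first. This is a minimal sketch using 1 through 5 as a stand-in; nothing here is timed.

```stata
mata:
1..5                  // row vector: 1 2 3 4 5
1::5                  // column vector with the same elements
(1..5):^2             // elementwise square: 1 4 9 16 25
sum((1..5):^2)        // 1 + 4 + 9 + 16 + 25 = 55
sum((1::10^6):^2)     // column-vector version of the full calculation
end
```

* Because sum() adds every element regardless of shape, the row and column versions should return the same total.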

# As a matter of comparison, this command

system.time(sum((1:10^6)^2))

# took .04 seconds in R

# And the loop:

x=0

system.time(for(i in 1:10^6) x=x+i^2)

# 1.3 seconds

# Thus Mata in this example is significantly faster than Stata and about the same speed as R.

Testing it on my machine, using a scalar instead of a local in method 1 seems faster, though still significantly slower than the other methods.
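
A minimal sketch of what that scalar version might look like. The scalar name sumsq is hypothetical, chosen so that it cannot be confused with a variable named x2 left in memory by method 3, since Stata resolves a name to a variable before a scalar.

```stata
timer clear 1
timer on 1

* Hypothetical name; holds the running total as a double-precision scalar
scalar sumsq = 0

forv i = 1/1000000 {
    * scalar() makes the reference to the scalar explicit
    scalar sumsq = scalar(sumsq) + `i'^2
}

di %21.0f scalar(sumsq)

timer off 1
timer list 1
```

No timing is given here, since it will depend on the machine.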

In method 3, -summarize- with option -meanonly- should be slightly faster. But I think you can skip -summarize-, which either way is going to do more calculations than you need:

clear

set obs 1000000

gen x2 = sum(_n^2)

di x2[_N]