## Friday, July 20, 2012

### 7 commands in Stata and R

* In Stata we should clear the memory and set the number of observations
clear
set obs 1000

* 1. Random draws: first, let's generate some data (random normal draws)

gen x = rnormal()
gen u = rnormal()

gen y = x + u

* 2. OLS

reg y x

* 3. Huber-White robust standard errors

reg y x, robust

* 4. Apply cumulative normal function

gen yprob = normal(y)

* 5. Generate bernoulli draws (a binomial is a bernoulli distribution when n = 1)

gen ybernoulli = rbinomial(1,yprob)

* 6. Estimate a probit model of ybernoulli on x

probit ybernoulli x

* 7. Logit

logit ybernoulli x

* We can see that Stata is easy to use and efficient in its commands

####################################
#### Now the same commands in R ####

# It is unnecessary to clear memory in R.  However, every time we generate new data we need to tell R how many observations to draw.

# 1. Random draws: first, let's generate some data (random normal draws)

x = rnorm(1000)
u = rnorm(1000)
# The 1000 tells R to generate vectors of 1000 random draws each.

y = x + u
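
# A minimal sketch, not part of the original post: storing the sample size in a
# variable avoids repeating 1000 in every call, and set.seed() makes the draws
# reproducible. The name n and the seed value 123 are arbitrary choices of mine.
n = 1000
set.seed(123)
x = rnorm(n)
u = rnorm(n)
y = x + u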

# 2. OLS

result=lm(y~x)

# To get details
summary(result)

# 3. Huber-White heteroskedasticity-corrected standard errors

# As far as I can tell this is not a default command in R so we need an additional package.
# This is easy enough to load:

require(car)

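# Now for the adjusted standard errors.
# hccm() returns the corrected covariance matrix, so take the square root of its diagonal.
# type="hc1" matches Stata's robust option (the hccm default is HC3).
sqrt(diag(hccm(result, type="hc1")))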
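
# A hedged sketch, not in the original post, of a coefficient table closer to
# what Stata prints after "reg y x, robust". It assumes result from step 2 and
# that car is loaded; the names robust_se, robust_t and robust_p are my own.
# type="hc1" is used because hccm() defaults to HC3, while Stata's robust option
# uses the HC1 small-sample correction.
robust_se = sqrt(diag(hccm(result, type="hc1")))
robust_t = coef(result) / robust_se
robust_p = 2 * pt(-abs(robust_t), df = df.residual(result))
cbind(estimate = coef(result), robust_se, robust_t, robust_p)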

# 4. Apply cumulative normal function

yprob = pnorm(y)

# 5. Generate bernoulli draws (a binomial is a bernoulli distribution when n = 1)

ybernoulli = rbinom(1000,1,yprob)
# Remember we need to specify 1000 so that R knows how many random draws to make

# 6. Estimate a probit model of ybernoulli on x

myprobit = glm(ybernoulli ~ x, family=binomial(link="probit"))

summary(myprobit)

# 7. Logit

mylogit = glm(ybernoulli ~ x, family=binomial(link="logit"))

summary(mylogit)
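
# A short sketch, not in the original post: probit and logit coefficients are on
# different scales, so one way to compare the two fits is through their fitted
# probabilities. It assumes myprobit and mylogit from steps 6 and 7; the names
# pprobit and plogit are my own.
pprobit = predict(myprobit, type="response")
plogit = predict(mylogit, type="response")

# Correlation between the two sets of fitted probabilities
cor(pprobit, plogit)

# Distribution of the differences between them
summary(pprobit - plogit)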

# Most commands are very similar to those in Stata, but the overall setup in R is somewhat more complicated.
# That is not necessarily a bad thing: it is generally much easier to program anything more advanced than one-liners in R.