Thursday, October 31, 2013

Tobit fitted values not "fitting" data

* I was recently asked by a reader why a tobit regression might produce a
* constant significantly below zero and many fitted values that are
* unrealistically far below zero.

* Is that a problem?

* Let's do a simple simulation of what the Tobit might be doing:

clear
set obs 10000

gen x = rnormal()

gen y_true = -2 + x*2 + rnormal()*4

gen y_observed = y_true
replace y_observed = 0 if y_true < 0

hist y_ob



tobit y_ob x, ll(0)

predict y_hat

graph twoway  (scatter y_hat y_true) (lfitci y_hat y_true)

* Not the best fit, but okay.

* So the takeaway: it is not at all inconsistent with the Tobit model for the
* constant and many of the fitted values to be well below zero.

* In a way, that is typically the situation in which it is most important to
* use a Tobit: you observe few positive values precisely because the underlying
* latent function places many of its values below zero, and those values have
* been censored.

* Conversely, when the constant and most fitted values lie above zero, the
* gains (in terms of reducing bias) from using a Tobit estimator are smallest.
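
* (An illustrative check, added here only as a sketch:) when much of the latent
* distribution lies below the censoring point, both the share of censored
* observations and the share of linear predictions below zero will be large.
qui count if y_observed == 0
di "Share of observations censored at zero: " r(N)/_N
qui count if y_hat < 0
di "Share of linear predictions below zero: " r(N)/_N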


Wednesday, October 30, 2013

Using Tobit to Impute Censored Regressors

* Imagine that you have a data set in which one or more of your explanatory
* variables is censored (that is, x > alpha is reported as alpha).  This type
* of censoring is typical of surveys such as income surveys, where top incomes
* are censored because within a demographic unit there are too few people above
* a certain high income level, so reporting the exact value might identify them.

* In this simulation I ask the question: is there a problem with using a Tobit
* to "correct" for the censoring and then using the generated regressor as a
* predictor?  This simulation will not, of course, give a general answer to
* that question, but it will at least suggest whether things become problematic
* in generated data.

* First let's imagine that we have a decent set of observable variables x1-x10 which are 
* correlated.

clear

set obs 1000

* First let's generate a correlation matrix.
* There must be an easier way, but I use two steps: first generating a vector C,
* then expanding it into a matrix with all non-diagonal elements equal to .5.
matrix C=1,1,1,1,1,1,1,1,1,1
matrix B=C'*C*.5 + I(10)*.5
matrix list B
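
* (A possibly simpler alternative, shown only as a sketch: Stata's J() matrix
* function fills in the off-diagonal elements directly.)
matrix B2 = J(10,10,.5) + I(10)*.5
matrix list B2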

* Draw the explanatory variables
drawnorm x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, corr(B)

* Things seem to have gone well
corr x1-x10

* First let's imagine there is a dependent variable:
* y=b0 + x1*b1 + x2*b2 + x3*b3 + x4*b4 + x5*b5 + u

gen y = -5 + x1*1 + x2*(-2) + x3*(1) + x4*(3) + x5*(1) + rnormal()*2

* Note: I am intentionally making the noise unrealistically low
* so that it is easier to identify the bias in the model without
* running repeated simulations.

* Before we do anything let's verify that OLS seems to be working well.
reg y x1-x5

* In general our standard regression seems to be working as expected 
* (that is very well).

* Now let's wonder what happens if our explanatory variables x1 and x2
* are censored.

gen x1c = x1
replace x1c = .25 if x1c > .25

hist x1c



gen x2c = x2
replace x2c = .25 if x2c > .25

reg y x1c x2c x3-x5
* We can see that the coefficients on both x1c and x2c are now biased.  This is
* because the error now contains the censored portions, making it positively
* correlated with x1c and negatively correlated with x2c.

* We can see this by writing:
* y = -5 + x1c*1 + x2c*(-2) + x3*(1) + x4*(3) + x5*(1) + e
* where the composite error is
* e = (x1 - x1c)*1 + (x2 - x2c)*(-2) + u

* corr(x1 - x1c, x1c) > 0 

gen x1diff = x1-x1c
corr x1diff x1c

* Likewise corr(x2 - x2c, x2c) > 0 

gen x2diff = x2-x2c
corr x2diff x2c

* This causes a negative bias in the x2c estimate, since its coefficient is
* negative.

* The correction I propose is to impute the topcoded values using the
* variables available.

tobit x1c x2c x3-x10, ul(.25)
predict x1hat

hist x1hat



* Let's check the correlation between x1 and x1hat
corr x1 x1hat

* Not as good as could be hoped.
gen x1imp = x1c
replace x1imp = x1hat if x1c == .25 & x1hat > .25

scatter x1imp x1c



* The same thing with x2c
tobit x2c x1imp x3-x10, ul(.25)
predict x2hat
gen x2imp = x2c
replace x2imp = x2hat if x2c == .25 & x2hat >= .25
* Since x2c is censored at .25, we know that only predicted values above .25
* can be a better match than the censored value.

* Now let's try to estimate our model given our new variables.
reg y x1imp x2imp x3-x5

* Even if this method seemed to work, we would need to bootstrap or otherwise
* adjust the standard errors to account for the fact that we are now using the
* generated regressors x1imp and x2imp.
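
* (An illustrative sketch, not part of the original analysis:) one way to get
* bootstrap standard errors that account for the generated regressors is to
* wrap the whole two-step procedure in an rclass program and resample pairs.
* The program name bs_impute and the 100 replications are arbitrary choices.
cap program drop bs_impute
program define bs_impute, rclass
  tempvar x1hat x2hat x1imp x2imp
  * Step 1: impute the censored regressors within the bootstrap sample
  tobit x1c x2c x3-x10, ul(.25)
  predict `x1hat'
  gen `x1imp' = x1c
  replace `x1imp' = `x1hat' if x1c == .25
  tobit x2c `x1imp' x3-x10, ul(.25)
  predict `x2hat'
  gen `x2imp' = x2c
  replace `x2imp' = `x2hat' if x2c == .25
  * Step 2: run the final regression on the imputed regressors
  reg y `x1imp' `x2imp' x3-x5
  return scalar b_x1 = _b[`x1imp']
  return scalar b_x2 = _b[`x2imp']
end

bootstrap b_x1=r(b_x1) b_x2=r(b_x2), reps(100): bs_impute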

* We do not have sufficient information at this point to see if
* our imputation method is improving our estimates at all.

* We need to repeat the whole simulation a number of times to hope to answer
* that question.

cap program drop tobit_impute
program define tobit_impute, rclass

  clear
  set obs 1000
  drawnorm x1 x2 x3 x4 x5 x6 x7 x8 x9 x10, corr(B)
  gen y = -5 + x1*1 + x2*(-2) + x3*(1) + x4*(3) + x5*(1) + rnormal()*2

  reg y x1-x5
  return scalar x1_ols = _b[x1]
  return scalar x2_ols = _b[x2]
  return scalar x3_ols = _b[x3]
  return scalar x4_ols = _b[x4]
  return scalar x5_ols = _b[x5]

  gen x1c = x1
  replace x1c = .25 if x1c > .25

  gen x2c = x2
  replace x2c = .25 if x2c > .25

  reg y x1c x2c x3-x5
  return scalar x1_ols2 = _b[x1c]
  return scalar x2_ols2 = _b[x2c]
  return scalar x3_ols2 = _b[x3]
  return scalar x4_ols2 = _b[x4]
  return scalar x5_ols2 = _b[x5]

  tobit x1c x2c x3-x10, ul(.25)
  predict x1hat

  gen x1imp = x1c
  replace x1imp = x1hat if x1c == .25

  tobit x2c x1imp x3-x10, ul(.25)
  predict x2hat
  gen x2imp = x2c
  replace x2imp = x2hat if x2c == .25

  reg y x1imp x2imp x3-x5
  return scalar x1_ols3 = _b[x1imp]
  return scalar x2_ols3 = _b[x2imp]
  return scalar x3_ols3 = _b[x3]
  return scalar x4_ols3 = _b[x4]
  return scalar x5_ols3 = _b[x5]

end

tobit_impute

simulate  /// 
  x1_ols=r(x1_ols) x2_ols=r(x2_ols) x3_ols=r(x3_ols)  /// 
  x4_ols=r(x4_ols) x5_ols=r(x5_ols)  /// 
  x1_ols2=r(x1_ols2) x2_ols2=r(x2_ols2) x3_ols2=r(x3_ols2)  /// 
  x4_ols2=r(x4_ols2) x5_ols2=r(x5_ols2)  /// 
  x1_ols3=r(x1_ols3) x2_ols3=r(x2_ols3) x3_ols3=r(x3_ols3)  /// 
  x4_ols3=r(x4_ols3) x5_ols3=r(x5_ols3)  /// 
  , rep(300): tobit_impute

sum

/*

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      x1_ols |       300    .9975067    .0809674   .7533615   1.266807
      x2_ols |       300   -1.998138    .0794069  -2.231982  -1.778604
      x3_ols |       300    .9959236    .0788032   .7804456   1.203429
      x4_ols |       300    3.003273    .0836668   2.726592   3.262676
      x5_ols |       300    .9992972    .0808832   .7657179   1.212335
-------------+--------------------------------------------------------
     x1_ols2 |       300    1.223525    .1201054   .8844188   1.606213
     x2_ols2 |       300   -2.417704    .1229109  -2.758024  -2.092011
     x3_ols2 |       300    .9224241    .0872901   .7420031   1.168221
     x4_ols2 |       300    2.930535    .0910089   2.642824   3.200319
     x5_ols2 |       300    .9313811    .0863095    .667203   1.172618
-------------+--------------------------------------------------------
     x1_ols3 |       300    1.264906    .1325387   .8362417   1.708511
     x2_ols3 |       300   -2.358331    .1368772  -2.704609   -1.99809
     x3_ols3 |       300     1.01038    .0971799   .7765424    1.26452
     x4_ols3 |       300    3.016996    .0963383   2.688083   3.329581
     x5_ols3 |       300    1.019813    .0943988   .7494624   1.267622
*/

* In summary, it seems that our estimate of the coefficient on x2 is slightly
* improved while our estimate on x1 is made worse.  Overall our estimates are
* not really any better, suggesting that the Tobit imputation failed to correct
* the censoring problem in any substantive way.
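
* (An illustrative follow-up:) since the true coefficients on x1 and x2 are
* 1 and -2, we can also look at the mean bias of each estimator directly:
foreach v of varlist x1_ols x1_ols2 x1_ols3 {
  qui sum `v'
  di "`v': mean bias = " %6.3f (r(mean) - 1)
}
foreach v of varlist x2_ols x2_ols2 x2_ols3 {
  qui sum `v'
  di "`v': mean bias = " %6.3f (r(mean) + 2)
}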


Tuesday, October 29, 2013

A Call for Contributors

Dear Readers,

This blog has been going for a year and a half now, and I have published nearly 300 posts to date.  I plan to continue posting, though I will probably keep shifting my focus from daily blogging to putting a dissertation together within the next year.  In addition, I would like this blog to offer resources that extend beyond the small realm of understanding I currently possess.

Therefore, I would like to make an open invitation to any of my readers who would like to become a contributing author.  I believe the format of the blog is well established.  If you have any ideas that you think would be appropriate, please feel free to submit your posts to be evaluated for publication as a guest blogger.

The average hit rate is around one thousand hits per day and probably many more if you include the syndication readership.

Thank you for your consideration,

Francis Smart

Monday, October 28, 2013

Modified Bin and Union Method for Item Pool Design

# Reckase (2003) proposes a method for designing an item pool for a computer
# adaptive test that has become known as the bin-and-union method.  The method
# involves drawing a subject from a distribution of abilities, then selecting
# the item that maximizes that subject's information from the set of all
# possible items, following a standard CAT procedure.  This is repeated until
# the test reaches the predefined stopping point.
 
# Then the next subject is drawn and a new set of items is selected.  Items are
# divided into bins, which acts as a kind of rounding: items sufficiently close
# to other items in terms of their parameters are considered the same item, and
# the two sets are unioned together into a larger pool.
 
# As more subjects are added, more items are collected, though at a decreasing
# rate as fewer new items become necessary.
 
# In the original paper he uses a fixed-length test, though in a forthcoming
# paper he and his student Wei He are also using a variable-length test.
 
# I have modified his procedure slightly in this simulation.  Rather than
# selecting optimal items for each subject from the continuous pool of possible
# items, I have the test look within the already constructed pool to see if any
# item is within the bin width of the subject's estimated ability.
 
# If there is no such item, then I add an item that perfectly matches the
# subject's estimated ability.  I prefer this method because I think it better
# represents the process a CAT typically goes through, with items close to but
# rarely exactly at the level of the subject.  Thus the information collected
# for each subject will be slightly lower under this modified method relative
# to the original.
 
# As with the new paper, this simulation uses a variable-length test.  My
# stopping rule is simple: once the test achieves a sufficiently high level of
# information, it stops.
 
# I have constructed this simulation with three nested loops: over repeated
# simulations, over subjects within each item pool construction, and over the
# items administered to each subject.
 
# It simulates the item pool construction a number of times to get the average
# number of items after each subject, as well as a histogram of the number of
# items required at each difficulty level.
 
# I have also included a control for item exposure.  This control dictates
# that as the acceptable exposure rate is reduced, more items will be required,
# since some items become too frequently exposed.
 
# Overall this method seems pretty great to me.  It allows item selection
# criteria, stopping rules, and exposure controls to be easily modified to
# accommodate almost any CAT design.
 
require("catR")
 
# Variable Length Test
 
# The number of times to repeat the simulation
nsim <- 10
 
# The number of subjects to simulate
npop <- 1000
 
# The maximum number of items
max.items <- 5000
 
# Maximum exposure rate of individual item
max.exposure <- .2
 
# Stop the test when information reaches this level
min.information <- 10
 
# How far away will the program reach for a new item (b-b_ideal)
bin.width <- .25
 
expect.a <- 1
 
p <- function(theta, b) exp(theta-b)/(1+exp(theta-b))
info <- function(theta, b, a=expect.a) p(theta,b)*(1-p(theta,b))*a^2
 
info(0,0)
 
# The choose.item function takes an estimated ability thetahat and searches the
# available items to see if any existing item can be used; otherwise it creates
# a new item.
choose.item <- function(thetahat, item.b, items.unavailable, bin.width) {
  # Construct a vector of indexes of available items
  avail.n <- (1:length(item.b))
 
  # Remove any items already marked as unusable
  if (length(items.unavailable)>0) 
    avail.n <- (1:length(item.b))[-items.unavailable]
 
  # If there are no items available then generate the next item
  # equal to thetaest.
  if (length(avail.n)==0) 
    return(c(next.b=thetahat, next.n=length(item.b)+1))
 
  # Figure out how far each item is from thetahat
  avail.dist <- abs(item.b[avail.n]-thetahat)
 
  # Reorder the n's and dist in terms of proximity
  avail.n <- avail.n[order(avail.dist)]
  avail.dist <- sort(avail.dist)
 
  # If the closest item is within the bin width return it
  if (avail.dist[1]<bin.width) 
    return(c(next.b=item.b[avail.n[1]], next.n=avail.n[1]))
  # Otherwise generate a new item
  if (avail.dist[1]>=bin.width) 
    return(c(next.b=thetahat, next.n=length(item.b)+1))
}
 
# Define the simulation level vectors which will become matrices
Tnitems <- Ttest.length <- Titems.taken.N <- Titem.b<- NULL
 
# Loop through the number of simulations
for (j in 1:nsim) {
 
 
  # Seems to be working well
  choose.item(3, c(0,4,2,2,3.3), NULL, .5)
 
  # This is the initial item pool
  item.b <- 0
 
  # This is the initial number of items taken
  items.taken.N <- rep(0,max.items)
 
  # A vector to record the individual test lengths 
  test.length <- NULL
 
  # Number of total items after each individual
  nitems <- NULL
 
  # Draw theta from a population distribution
  theta.pop <- rnorm(npop)
 
  # Start the individual test
  for (i in 1:npop) {
    # This person has a theta of:
    theta0 <- theta.pop[i]
 
    # Our initial guess at theta = 0
    thetahat <- 0
 
    print(paste("Subject:", i,"- Item Pool:", length(item.b)))
    response <- items.taken <- NULL
 
    # Remove any items that would have been overexposed
    items.unavailable <- (1:length(item.b))[!(items.taken.N < max.exposure*npop)]
 
    # The initial information for each subject is zero
    infosum <- 0
 
    # Administer items until the stopping rule for this subject is met
    while(infosum < min.information) {
 
      chooser <- choose.item(thetahat, item.b, items.unavailable, bin.width)
 
      nextitem <- chooser[2]
      nextb <- chooser[1]
        names(nextitem) <- names(nextb) <- NULL
 
      items.unavailable <- c(items.unavailable,nextitem)
      item.b[nextitem] <- nextb
 
      response <- c(response, runif(1)<p(theta0, nextb))
 
      items.taken <- c(items.taken, nextitem)
 
      it <- cbind(1, item.b[items.taken], 0,1)
 
      thetahat <- thetaEst(it, response)
 
      infosum <- infosum+info(theta0, nextb)
    }
 
    # Save individual values
    nitems <- c(nitems, length(item.b))
 
    test.length <- c(test.length, length(response))
 
    items.taken.N[items.taken] <- items.taken.N[items.taken]+1
  }
 
  # Save into matrices the results of each simulation
  Titem.b <- c(Titem.b, sort(item.b))
  Tnitems <- cbind(Tnitems, nitems)
  Ttest.length <- cbind(Ttest.length, test.length)
  Titems.taken.N <- cbind(Titems.taken.N, items.taken.N)
 
}
 
plot(apply(Tnitems, 1, max), type="n",
     xlab = "N subjects", ylab = "N items",
     main = paste(nsim, "Different Simulations"))
for (i in 1:nsim) lines(Tnitems[,i], col=grey(.3+.6*i/nsim))
 

# We can see that the number of items is a function of the number of
# subjects taking the exam.  This relationship weakens when the number of
# subjects becomes large and the exposure controls are removed.


hist(Titem.b, breaks=30)
 
hist(Ttest.length, breaks=20)


Saturday, October 26, 2013

Principal Component Analysis

* Principal component analysis is a very interesting method that allows one to
* attempt to identify the underlying driving factors or components behind the
* observable values in a data set.

* Imagine that you have demographic data on a group of people.  This data tells
* you things like class rank, number of sports played, salary, marital status,
* number of children, etc.

* Now let's imagine that the individual observations are a function of
* underlying latent personal traits.  These traits include intelligence,
* athleticism, and family orientation.

set seed 101

clear
set obs 1000

* Latent traits
gen inte = rnormal()
gen athl = rnormal()
gen famo = rnormal()

* Observable traits
gen class_rank = 2*inte   - .1*athl +  1*famo   + rnormal()
gen nsports    = -.5*inte +  2*athl + .5*famo   + rnormal()
gen salary     = 1*inte   + .5*athl -  1*famo   + rnormal()
gen married    = .1*inte  + .5*athl +  1.5*famo + rnormal()
gen children   = -.5*inte +  0*athl +  2*famo   + rnormal()

* Now let us attempt to identify our latent traits

pca  class_rank nsports salary married children

screeplot

predict lt1 lt2 lt3
* This will generate variables that represent the latent trait estimates
* (component scores) from the principal component analysis.

corr lt1 lt2 lt3 inte athl famo
* By correlating the PCA-generated variables with the true latent traits we
* can test how well the analysis is working.

* We can see that the first component identified corresponds to famo (family
* orientation), followed by intelligence and then athleticism.

* It is important to note that while, in practice, family orientation,
* intelligence, and athleticism may well be correlated, principal component
* analysis would have difficulty identifying them if they were, since it
* crucially relies on identifying orthogonal components.
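
* (A quick illustrative check:) unlike the underlying latent traits, the
* predicted components are constructed to be mutually uncorrelated:
corr lt1 lt2 lt3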


Wednesday, October 9, 2013

Weak Instruments with a Small Panel

This post is in response to a comment by Luis Enrique on my previous post:

"As somebody who regularly consumes cross-country empirical research based on IV
regressions with samples of 50-100, I found this quite alarming. But then most
of the papers I read will be panel, with T of let's say 50.

This question may reveal shocking ignorance, but if the number of observations
in a panel (N*T) is say 100 * 50, does that translate into a (very) safe
sample size?" - Luis

My response:

Thanks for asking. I am no expert on time series, so my opinion should be taken
with a grain of salt. First off, the key thing to remember is that the problem
is the result of a weak instrument. As the first-stage R-squared of the 2SLS
approaches 1, the IV estimator approaches the OLS estimator and shares its
properties.

However, as the first-stage R-squared approaches zero, the IV estimator starts
to suffer serious problems in terms of both bias and efficiency.

Based on the cross-sectional simulation, I would say that given a similarly
weak instrument, if each year's observation within each country were
independent, then a panel of 100*50 should not be a problem. However, we expect
explanatory variables to be serially correlated, and instruments are often
one-time policy changes that are themselves highly serially correlated. This
reduces the effective sample size, since each observation of the instrument
can no longer be seen as independent.

Let's see how data generated in this manner may behave.

* First we define the original weakreg simulation in which there is 
* only cross sectional data.

cap program drop weakreg
program weakreg, rclass
  clear
  set obs `1' 
  * The first argument of the weakreg command is the number of 
  *  observations to draw.
  gen z = rnormal()
  gen w = rnormal()
  gen x = z*.2 + rnormal() + w
  gen u = rnormal()
  gen y = x + w + u*5
  reg y x
    return scalar reg_x = _b[x]
 return scalar reg_se_x = _se[x]
  ivreg y (x=z)
    return scalar ivreg_x = _b[x]
 return scalar iv_se_x = _se[x]
end 

* Now we define the weakreg2 simulation, in which there is
* panel data.

cap program drop weakreg2
program weakreg2, rclass
  clear
  set obs `1' 
  * The first argument of the weakreg command is the number of 
  *  clusters to draw.
  gen id = _n

  * There is a one time policy change
  gen z = rnormal()

  * The second argument is the number of observations in each cluster
  expand `2' 
  
  bysort id: gen t = _n
  
  gen w = rnormal()
  
  * There is no policy effect before half way through the time period.
  replace z = 0 if (t < `2'/2)

  gen x = z*.2 + rnormal() + w
  
  gen u = rnormal()
  
  gen y = x + w + u*5 if t==1
  bysort id: replace y = x + w + u*12.5^.5 + u[_n-1]*12.5^.5 if t>1
  
  reg y x
    return scalar reg_x = _b[x]
 return scalar reg_se_x = _se[x]
  ivreg y (x=z), cluster(id)
    return scalar ivreg_x = _b[x]
 return scalar iv_se_x = _se[x]
end 

* Looking at first the cross sectional data:
simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(1000): weakreg 5000
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |      1000    1.490732    .0499934    1.29779   1.661524
    reg_se_x |      1000    .0500304    .0007235   .0481176   .0520854
     ivreg_x |      1000    .9938397    .3684834  -.2149411   1.970256
     iv_se_x |      1000    .3638622     .037906   .2568952   .5094048

*/

* We see everything seems to be working very well.

* Looking now at the case where there is 50 clusters and 100 observations
* in each of them.

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(1000): weakreg2 50 100
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |      1000    1.493905    .0508205   1.269596   1.670965
    reg_se_x |      1000    .0502538    .0007578   .0478791   .0531899
     ivreg_x |      1000    1.021512    .7299972  -1.083777    3.64094
     iv_se_x |      1000    .7110581    .1806363   .3299663   1.566804
*/

* We can see that there is no huge bias in the IV regression, though the
* standard errors are about twice those of the cross-sectional data.

* Now looking at cross-sectional data with only 1,000 observations:
simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(1000): weakreg 1000
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |      1000    1.483811    .1117863   1.130268     1.8281
    reg_se_x |      1000    .1121205    .0034219   .1026788      .1249
     ivreg_x |      1000    .9028068    .9626959  -8.379108   4.424785
     iv_se_x |      1000    .9215948    .7681726   .4671113   23.61619
*/

* We can see that the IV estimator has a large variance and a noticeable
* downward bias, though it still seems to be doing reasonably well overall.

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(1000): weakreg2 50 20
sum

/* 
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |      1000     1.49184    .1104733   1.148331   1.927401
    reg_se_x |      1000    .1125144    .0040137   .1009569   .1271414
     ivreg_x |      1000    .8209362    5.213284  -151.3602   17.03425
     iv_se_x |      1000    10.00779    257.6449   .5320991   8149.177
*/

* We can see that the IV estimator, though dealing with the same number of
* observations, has a huge variance and an even larger bias!
* The take-away is that even small panels can work, so long as there are
* sufficient observations over time; with only 20 periods per cluster the
* effective sample is far smaller than N*T suggests.

Tuesday, October 8, 2013

Finite Sample Properties of IV - Weak Instrument Bias

* There is no proof that an instrumental variables (IV) estimator is unbiased.

* In fact we know that in small enough samples the bias can be large.

* Let's look at a simple setup in which the endogeneity is the result of omitted variable bias.

* Our instrument is valid, but the estimator is biased because we are using a "small" sample and the instrument is weak.

clear
set obs 1000

gen z = rnormal()

gen w = rnormal()

gen x = z*.3 + rnormal() + w

gen u = rnormal()

gen y = x + w + u*5

reg y x

ivreg y (x=z)
* ivreg includes the true parameter value in the confidence interval, though the interval is quite wide.

* This is largely the result of z being a weak instrument for x
reg x z
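
* (An illustrative aside:) with a single instrument, the first-stage F statistic
* is just the model F from the regression above; values far below the common
* rule-of-thumb of 10 suggest a weak instrument.
di "First-stage F statistic: " e(F)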

* The IV estimator is known to be biased in finite samples, particularly when the instrument is weak.

* In order to examine this bias we will run a Monte Carlo simulation
* to see how biased our estimates are at each sample size.

cap program drop weakreg
program weakreg, rclass
  clear
  set obs `1' 
  * The first argument of the weakreg command is the number of 
  *  observations to draw.
  gen z = rnormal()
  gen w = rnormal()
  gen x = z*.2 + rnormal() + w
  gen u = rnormal()
  gen y = x + w + u*5
  reg y x
    return scalar reg_x = _b[x]
 return scalar reg_se_x = _se[x]
  ivreg y (x=z)
    return scalar ivreg_x = _b[x]
 return scalar iv_se_x = _se[x]
end 

* With only 100 observations
simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(10000): weakreg 100
sum
/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |     10000    1.484123    .3619725   .1481085   2.739473
    reg_se_x |     10000     .357267    .0359624   .2261475   .5157685
     ivreg_x |     10000    14.45544    846.8431  -1318.675   78604.67
     iv_se_x |     10000    208878.2    1.93e+07   .6884335   1.92e+09
*/

* We can see the mean standard error estimate is much
* larger than the standard deviation of the estimates.

* In addition, the apparent bias of the IV estimator is huge!
* Thus OLS is the better estimator in this case.
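
* (An illustrative aside:) with a just-identified, weak instrument the IV
* sampling distribution has very heavy tails, so the simulated mean can be
* driven by a few extreme draws; the median gives a more stable picture:
tabstat ivreg_x reg_x, stat(p50 iqr)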

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(10000): weakreg 300
sum

/*

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |     10000     1.48824    .2065498   .6367978   2.236391
    reg_se_x |     10000     .204807    .0117558   .1625977   .2537656
     ivreg_x |     10000     .883504    16.43395  -489.2355   1346.839
     iv_se_x |     10000     103.928    7065.172   .5418385   696229.8

*/
* Increasing the sample size to 300 vastly improves the IV estimator, though it
* is now downward biased.

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(10000): weakreg 500
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |     10000    1.490552     .159073   .9184232   2.088732
    reg_se_x |     10000    .1584161    .0071414   .1326629   .1896385
     ivreg_x |     10000     .841985    4.533252  -337.8417   41.04016
     iv_se_x |     10000    10.02729    672.1545   .5350561   66082.69
*/
* Increasing the sample size to 500 does not seem to improve the bias of the
* IV estimator, though the standard errors are, on average, getting closer to
* the standard deviations of the estimates.

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(10000): weakreg 750
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |     10000    1.491437    .1291091   1.012212   1.998231
    reg_se_x |     10000    .1292925    .0047437   .1129745   .1498492
     ivreg_x |     10000    .9714284    1.087322  -12.33198   7.718197
     iv_se_x |     10000    1.080065    .8625042   .4623444   39.95131
*/

* Increasing the sample size to 750 dramatically improves the IV estimator.
* It is still slightly biased, but not problematically so, and the standard
* errors are now working well.  The remaining issue is that the IV estimator
* still has so much variation that both the OLS estimate and a zero coefficient
* would be included in most confidence intervals.

simulate reg_x=r(reg_x)     reg_se_x=r(reg_se_x)  /// 
         ivreg_x=r(ivreg_x) iv_se_x=r(iv_se_x)    /// 
   , rep(10000): weakreg 1000
sum

/*
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       reg_x |     10000    1.488341    .1107187   1.064637   1.938462
    reg_se_x |     10000     .111871    .0035312   .1001198   .1255203
     ivreg_x |     10000    .9691499    .8924782  -5.659174    5.94314
     iv_se_x |     10000    .8812981    .2897863     .44236   6.674775
*/

* We can see that the primary gain from additional observations is a smaller
* standard error.
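
* (An illustrative final look:) plotting the simulated IV estimates against a
* normal density makes the remaining skew and heavy tails easy to see.
kdensity ivreg_x, normal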
