Friday, November 23, 2012

Computer Adaptive Test Assuming an Infinite Item Pool

Original Code

# In order for us to understand what a Computer Adaptive Test (CAT) is, let's first think about how the CAT works.

# A CAT test starts by having some kind of initial assessment of student ability (test taker's ability).

# This is typically at the population mean.

# The test then selects an item that (in the most straightforward case) has the most information at that initial guess.

# If the student answers that question correctly then the program reassesses student ability and finds the next question which has the most information at the new assessment of student ability.

# The computer continues to select items until the termination conditions are met.  These conditions might be anything from a fixed length of time for the test, a fixed number of questions for the test, or more interestingly a sufficient level of precision achieved for student ability.  See flow chart:

# In order to asses student ability I will use code form:

http://www.econometricsbysimulation.com/2012/11/estimating-person-characteristics-from.html

#########################################################
# Specify the initial conditions:

true.ability = rnorm(1)

# Load three parameter model ICC
PL3 = function(theta,a, b, c) c+(1-c)*exp(a*(theta-b))/(1+exp(a*(theta-b)))

# Load three parameter item information:
PL3.info = function(theta, a, b, c) a^2 *(PL3(theta,a,b,c)-c)^2/(1-c)^2 * (1-PL3(theta,a,b,c))/PL3(theta,a,b,c)

#########################################################
#  Mock Computer Adaptive Test

# First let's specify and initial guess

est.start = 0

# How much do we adjust our estimate of person ability when all answer's are either right or wrong.
est.jump = .7

# Number of items on the test. This will be the end condition.
num.items = 50

# Set the other parameters in the 3PL
a.base = 3
c.base = .1

# Let's generate a vector to hold ability estimates.
ability.est <- est.start="est.start" p="p">
# Let's first generate a empty data frame to hold the set of item taken.
items <- a="NA,b=NA,c=NA,response=NA,p=NA," ability.est="NA)</p" data.frame="data.frame">
i = 1

# For this first mock test we will not select items from a pool but instead assume the pool is infinite and has an item with an a=a.base, c=c.base, and b equal to whatever the current guess is.

# Let's select our first item - a,b,c, response are scalars that will be reused to simplify coding.
a=a.base
c=c.base
b=ability.est[i]

# Probability of getting the item correct
p=PL3(true.ability, a,b,c)
response = runif(1) < p
# The Item Characteristic Curve (ICC) gives the probability of getting the item correct.
# Thus, a .9 is the max of the runifrom that should produce a response of TRUE or correct or 1 (as far as R is concerned these TRUE and 1 are the same as is FALSE and 0)

items[i,] = c(a=a, b=b, c=c, response=response, p=p, ability.est=ability.est[i])

# We have now successfully administered our first item.

# Should do our first MLE estimation?

# Not quite, unfortunately MLE requires in the bianary case that the student has answered at least one question right and at least one question wrong.

i=1+1

# Instead we will just adjust the ability estimate by the fixed factor (est.jump)
ability.est[i] = ability.est[i-1]-(-1)^(response)*est.jump

# Now we administer the second item:
# We will continue this until we get some heterogeneity in the responses
response.v = items\$response
response.ave = sum(response.v)/length(response.v)

while ((response.ave == ceiling(response.ave)) & (num.items >= i) ) {
# This condition will no longer be true when at least one of the items is ansered correctly and one of the items answered incorrectly.
ability.est[i] = ability.est[i-1]-(-1)^(response)*est.jump

a=a.base
c=c.base
b=ability.est[i]
p=PL3(true.ability, a,b,c)

response = runif(1) < p

items[i,] = c(a=a, b=b, c=c, response=response, p=p, ability.est=ability.est[i])

response.v = items\$response
response.ave = sum(response.v)/length(response.v)

i=i+1
}

items
true.ability

# Now that we have some heterogeneity of responses we can use the MLE estimator
MLE = function(theta) sum(log((items\$response==T)*PL3(theta, items\$a, items\$b, items\$c) +
(items\$response==F)*(1-PL3(theta, items\$a, items\$b, items\$c))))

optim(0,MLE, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))
# Okay, it seems to be working properly  now we will loop through using the above function.

# The only thing we need change is the ability estimate.
while (num.items >= i) {
ability.est[i] = optim(0,MLE, method="Brent", lower=-6, upper=6, control=list(fnscale = -1))\$par

a=a.base
c=c.base
b=ability.est[i]

p=PL3(true.ability, a,b,c)

response = runif(1) < p

items[i,] = c(a=a, b=b, c=c, response=response, p=p, ability.est=ability.est[i])

response.v = items\$response
response.ave = sum(response.v)/length(response.v)

i=i+1
}

items
(ability.est[i] = optim(0,MLE, method="Brent", lower=-6, upper=6, control=list(fnscale = -1)))
true.ability

# We can see that even in this ideal scenario in which you always have appropriately difficult items with high discriminatory power and low guessing, there is a noticeable amount of error.

plot(0:num.items, ability.est, type="l", main="CAT Estimates Ideally Converge on True Ability",
ylim=c(-3,3), xlab="Number of Items Administered", ylab="Estimated Ability", lwd=2)
abline(h=true.ability, col="red", lwd=2)