mleEst {catIrt}R Documentation

Estimate Ability in IRT Models

Description

mleEst, wleEst, bmeEst, and eapEst estimate ability in IRT models. mleEst is Maximum Likelihood Information, wleEst is Weighted Likelihood Information (see Details), bmeEst is Bayesian-Modal Estimation, and eapEst is Expected-A-Posterior Estimation.

Usage

mleEst( resp, params, range = c(-6, 6), mod = c("brm", "grm"), ... )
wleEst( resp, params, range = c(-6, 6), mod = c("brm", "grm"), ... )
bmeEst( resp, params, range = c(-6, 6), mod = c("brm", "grm"),
        ddist = dnorm, ... )
eapEst( resp, params, range = c(-6, 6), mod = c("brm", "grm"),
        ddist = dnorm, quad = 33, ... )

Arguments

resp

numeric: either a N \times J matrix (where N indicates the number of simulees and J indicates the number of items), a N length vector (if there is only one item) or a J length vector (if there is only one simulee). For the binary response model ("brm"), resp must solely contain 0s and 1s. For the graded response model ("grm"), resp must solely contain integers 1, \ldots, K, where K is the number of categories, as indicated by the dimension of params.

params

numeric: a vector or matrix of item parameters. If specified as a matrix, the rows must index the items, and the columns must designate the item parameters.

range

numeric: a two-element numeric vector indicating the minimum and maximum over which to optimize a likelihood function (mleEst) or posterior distribution (bmeEst), find roots to a score function (wleEst), or integrate over (eapEst).

mod

character: a character string indicating the IRT model. Current support is for the 3-parameter binary response model ("brm"), and Samejima's graded response model ("grm"). See simIrt for more information.

ddist

function: a function that calculates prior densities for Bayesian estimation. For instance, if you wish to specify a normal prior, ddist = dnorm, and if you wish to specify a uniform prior, ddist = dunif. Note that it is standard in R to use d... to indicate a density.

quad

numeric: a scalar indicating the number of quadrature points when using eapEst. See Details for more information.

...

arguments passed to ddist, usually distribution parameters identified by name.

Details

These functions return estimated "ability" for the binary response model ("brm") and the graded response model ("grm"). The only difference between the functions is how they estimate ability.

The function mleEst searches for a maximum of the log-likelihood with respect to each individual \theta_i and uses [T(\theta)]^{-1/2} as the corresponding standard error of measurement (SEM), where T(\theta) is the observed test information function at \theta, as described in FI.

The function bmeEst searches for the maximum of the log-likelihood after a log-prior is added, which effectively maximizes the posterior distribution for each individual \theta_i. The SEM of the bmeEst estimator uses the well known relationship (Keller, 2000, p. 10)

V[\theta | \boldsymbol{u}_i]^{-1} = T(\theta) - \frac{\partial \log[p(\theta)]}{\partial \theta^2}

where V[\theta | \boldsymbol{u}_i] is the variance of \theta after taking into consideration the prior distribution and p(\theta) is the prior distribution of \theta. The function bmeEst estimates the second derivative of the prior distribution uses the hessian function in the numDeriv package.

The function wleEst searches for the root of a modified score function (i.e. the first derivative of the log-likelihood with something added to it). The modification corrects for bias in fixed length tests, and estimation using this modification results in what is called Weighted Maximum Likelihood (or alternatively, the Warm estimator) (see Warm, 1989). So rather than maximizing the likelihood, wleEst finds a root of:

\frac{\partial l(\theta)}{\partial \theta} + \frac{H(\theta)}{2I(\theta)}

where l(\theta) is the log-likelihood of \theta given a set of responses and item parameters, I(\theta) is expected test information to this point, and H(\theta) is a correction constant defined as:

H(\theta) = \sum_j\frac{p_{ij}^{\prime}p_{ij}^{\prime\prime}}{p_{ij}[1 - p_{ij}]}

for the binary response model, where p_{ij}^{\prime} is the first derivative of p_{ij} with respect to \theta, p_{ij}^{\prime\prime} is the second derivative of p_{ij} with respect to \theta, and p_{ij} is the probability of response, as indicated in the help page for simIrt, and

H(\theta) = \sum_j\sum_k\frac{P_{ijk}^{\prime}P_{ijk}^{\prime\prime}}{P_{ijk}}

for the graded response model, where P_{ijk}^{\prime} is the first derivative of P_{ijk} with respect to \theta, P_{ijk}^{\prime\prime} is the second derivative of P_{ijk}, and P_{ijk} is the probability of responding in category k as indicated in the help page for simIrt. The SEM of the wleEst estimator uses an approximation based on Warm (1989, p. 449):

V(\theta) \approx \frac{T(\theta) + \frac{H(\theta)}{2I(\theta)}}{T^2(\theta)}

The function eapEst finds the mean and standard deviation of the posterior distribution given the log-likelihood, a prior distribution (with specified parameters), and the number of quadrature points using the standard Bayesian identity with summations in place of integrations (see Bock and Mislevy, 1982). Rather than using the adaptive, quadrature based integrate, eapEst uses the flexible integrate.xy function in the sfsmisc package. As long as the prior distribution is reasonable (such that the joint distribution is relatively smooth), this method should work.

Value

mleEst, wleEst, bmeEst, and eapEst return a list of the following elements:

theta

an N-length vector of ability values, one for each simulee.

info

an N-length vector of observed test information, one for each simulee. Test information is the sum of item information across items. See FI for more information.

sem

an N-length vector of observed standard error of measurement (or posterior standard deviation) for each simulee. See FI for more information.

Note

For the binary response model ("brm"), it makes no sense to estimate ability with a non-mixed response pattern (all 0s or all 1s). The user might want to include enough items in the model to allow for reasonable estimation.

Weighted likelihood estimation (wleEst) uses uniroot to find the root of the modified score function, so that the end points of range must evaluate to opposite signs (or zero). Rarely, the end points of range will evaluate to the same sign, so that uniroot will error. In these cases, uniroot will extend the interval until the end points of the (modified) range are opposite signs.

Author(s)

Steven W. Nydick swnydick@gmail.com

References

Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431 – 444.

Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Mahway, NJ: Lawrence Erlbaum Associates.

Keller (2000). Ability estimation procedures in computerized adaptive testing (Technical Report). New York, NY: American Institute of Certified Public Accountants.

Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427 – 450.

van dr Linden, W. J. & Pashley, P. J. (2010). Item selection and ability estimation in adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Elements of Adaptive Testing. New York, NY: Springer.

See Also

catIrt, simIrt, hessian, uniroot

Examples

## Not run: 

#########################
# Binary Response Model #
#########################
set.seed(888)
# generating random theta:
theta <- rnorm(201)
# generating an item bank under a 2-parameter binary response model:
b.params <- cbind(a = runif(100, .5, 1.5), b = rnorm(100, 0, 2), c = 0)
# simulating responses using specified theta:
b.resp <- simIrt(theta = theta, params = b.params, mod = "brm")$resp


# estimating theta using all four methods:
est.mle1 <- mleEst(resp = b.resp, params = b.params, mod = "brm")$theta
est.wle1 <- wleEst(resp = b.resp, params = b.params, mod = "brm")$theta
est.bme1 <- bmeEst(resp = b.resp, params = b.params, mod = "brm",
                   ddist = dnorm, mean = 0, sd = 1)$theta
est.eap1 <- eapEst(resp = b.resp, params = b.params, mod = "brm",
                   ddist = dnorm, mean = 0, sd = 1, quad = 33)$theta

# eap takes a while!

# all of the methods are highly correlated:
cor(cbind(theta = theta, mle = est.mle1, wle = est.wle1,
                         bme = est.bme1, eap = est.eap1))

# you can force eap to be positive:
est.eap2 <- eapEst(resp = b.resp, params = b.params, range = c(0, 6),
                                  mod = "brm", ddist = dunif, min = 0, max = 6)$theta

est.eap2

# if you only have a single response, MLE will give junk!
mleEst(resp = 0, params = c(1, 0, .2), mod = "brm")$theta

# the others will give you answers that are not really determined by the response:
wleEst(resp = 0, params = c(1, 0, .2), mod = "brm")$theta
bmeEst(resp = 0, params = c(1, 0, .2), mod = "brm")$theta
eapEst(resp = 0, params = c(1, 0, .2), mod = "brm")$theta


#########################
# Graded Response Model #
#########################
set.seed(999)
# generating random theta
theta <- rnorm(400)
# generating an item bank under a graded response model:
g.params <- cbind(a = runif(100, .5, 1.5), b1 = rnorm(100), b2 = rnorm(100),
                                           b3 = rnorm(100), b4 = rnorm(100))
# simulating responses using random theta:
g.mod <- simIrt(params = g.params, theta = theta, mod = "grm")

# pulling out the responses and the parameters:
g.params2 <- g.mod$params[ , -1]       # now the parameters are sorted
g.resp2 <- g.mod$resp

# estimating theta using all four methods:
est.mle3 <- mleEst(resp = g.resp2, params = g.params2, mod = "grm")$theta
est.wle3 <- wleEst(resp = g.resp2, params = g.params2, mod = "grm")$theta
est.bme3 <- bmeEst(resp = g.resp2, params = g.params2, mod = "grm",
                   ddist = dnorm, mean = 0, sd = 1)$theta
est.eap3 <- eapEst(resp = g.resp2, params = g.params2, mod = "grm",
                   ddist = dnorm, mean = 0, sd = 1, quad = 33)$theta

# and the correlations are still pretty high:
cor(cbind(theta = theta, mle = est.mle3, wle = est.wle3,
                         bme = est.bme3, eap = est.eap3))

# note that the graded response model is just a generalization of the brm:
cor(est.mle1, mleEst(resp = b.resp + 1, params = b.params[ , -3], mod = "grm")$theta)
cor(est.wle1, wleEst(resp = b.resp + 1, params = b.params[ , -3], mod = "grm")$theta)
cor(est.bme1, bmeEst(resp = b.resp + 1, params = b.params[ , -3], mod = "grm")$theta)
cor(est.eap1, eapEst(resp = b.resp + 1, params = b.params[ , -3], mod = "grm")$theta)


## End(Not run)


[Package catIrt version 0.5.1 Index]