R: Kullback-Leibler (KL) and posterior Kullback-Leibler (KLP)...

KL {catR}

R Documentation

Kullback-Leibler (KL) and posterior Kullback-Leibler (KLP) values for item selection

Description

This command returns the Kullback-Leibler (KL) or posterior Kullback-Leibler (KLP) value of a given item, based on an item bank and a set of previously administered items.

Usage

KL(itemBank, item, x, it.given, model = NULL, theta = NULL, lower = -4, 
  upper = 4, nqp = 33, type = "KL", priorDist = "norm", priorPar = c(0, 1), 
  D = 1, X = NULL, lik = NULL)

Arguments

`itemBank`	numeric: a suitable matrix of item parameters. See Details.
`item`	numeric: the item (referred to as its rank in the item bank) for which the KL or KLP must be computed.
`x`	numeric: a vector of item responses, coded as 0 or 1 only (for dichotomous items) or from 0 to the number of response categories minus one (for polytomous items).
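`it.given`	numeric: a matrix of item parameters for the previously administered items, with one row per item. For dichotomous models it has four columns, with the values of the discrimination, the difficulty, the pseudo-guessing and the inattention parameters (in this order); for polytomous models its columns follow the layout of `itemBank`. The number of rows of `it.given` must be equal to the length of `x`.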
`model`	either `NULL` (default) for dichotomous models, or any suitable acronym for polytomous models. Possible values are `"GRM"`, `"MGRM"`, `"PCM"`, `"GPCM"`, `"RSM"` and `"NRM"`. See Details.
`theta`	either a numeric value for provisional ability estimate or `NULL`. See Details.
`lower`	numeric: the lower bound for numerical integration (default is -4).
`upper`	numeric: the upper bound for numerical integration (default is 4).
`nqp`	numeric: the number of quadrature points (default is 33).
`type`	character: the type of information to be computed. Possible values are `"KL"` (default) and `"KLP"`. See Details.
`priorDist`	character: the prior ability distribution. Possible values are `"norm"` (default) for the normal distribution, and `"unif"` for the uniform distribution. Ignored if `type` is `"KL"`.
`priorPar`	numeric: a vector of two components with the prior parameters. If `priorDist` is `"norm"`, then `priorPar` contains the mean and the standard deviation of the normal distribution. If `priorDist` is `"unif"`, then `priorPar` contains the bounds of the uniform distribution. The default values are 0 and 1 respectively. Ignored if `type` is `"KL"`.
`D`	numeric: the metric constant. Default is `D=1` (for logistic metric); `D=1.702` yields approximately the normal metric (Haley, 1952).
`X`	either a vector of numeric values or `NULL` (default). See Details.
`lik`	either a vector of numeric values or `NULL` (default). See Details.

Details

Kullback-Leibler information can be used as a rule for selecting the next item in the CAT process (Barrada, Olea, Ponsoda and Abad, 2010; Chang and Ying, 1996), both with dichotomous and polytomous IRT models. This command serves as a subroutine for the nextItem function.

Dichotomous IRT models are considered whenever model is set to NULL (default value). In this case, itemBank must be a matrix with one row per item and four columns, with the values of the discrimination, the difficulty, the pseudo-guessing and the inattention parameters (in this order). These are the parameters of the four-parameter logistic (4PL) model (Barton and Lord, 1981).

Polytomous IRT models are specified by their respective acronym: "GRM" for Graded Response Model, "MGRM" for Modified Graded Response Model, "PCM" for Partial Credit Model, "GPCM" for Generalized Partial Credit Model, "RSM" for Rating Scale Model and "NRM" for Nominal Response Model. The itemBank still holds one row per item, and the number of columns and their content depend on the model. See genPolyMatrix for further information and illustrative examples of suitable polytomous item banks.

Under polytomous IRT models, let k be the number of administered items, and set x_1, ..., x_k as the provisional response pattern (where each response x_l takes values in \{0, 1, ..., g_l\}). Set \hat{\theta}_k as the provisional ability estimate (with the first k responses) and let j be the item of interest (not previously administered). Set also L(\theta | x_1, ..., x_k) as the likelihood function of the first k items, evaluated at \theta. Set finally P_{jt}(\theta) as the probability of answering response category t to item j for a given ability level \theta. Then, Kullback-Leibler (KL) information is defined as

KL_j(\theta || \hat{\theta}_k) = \sum_{t=0}^{g_j} \,P_{jt}(\hat{\theta}_k) \,\log \left( \frac{P_{jt}(\hat{\theta}_k)}{P_{jt}(\theta)}\right).
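As a minimal sketch of this definition, the following base R snippet computes KL_j(\theta || \hat{\theta}_k) from two vectors of category probabilities. The helper name kl_categories and the probability values are hypothetical and chosen only for illustration; they are not part of catR.

```r
# Hypothetical helper (not part of catR): KL information for one item,
# given its category probabilities at the provisional estimate (p_hat)
# and at a candidate ability level (p).
kl_categories <- function(p_hat, p) {
  stopifnot(length(p_hat) == length(p),
            abs(sum(p_hat) - 1) < 1e-8,
            abs(sum(p) - 1) < 1e-8)
  sum(p_hat * log(p_hat / p))
}

# Illustrative probabilities for an item with four categories (t = 0, ..., 3)
p_hat <- c(0.10, 0.30, 0.40, 0.20)  # P_jt(theta_hat_k), made-up values
p     <- c(0.25, 0.35, 0.30, 0.10)  # P_jt(theta), made-up values

kl_value <- kl_categories(p_hat, p)      # strictly positive here
kl_same  <- kl_categories(p_hat, p_hat)  # identical distributions give 0
```

The sketch assumes all category probabilities are strictly positive, so that every logarithm is finite.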

In case of dichotomous IRT models, all g_l values reduce to 1, so that item responses x_l equal either 0 or 1. Set simply P_j(\theta) as the probability of answering item j correctly for a given ability level \theta. Then, KL information reduces to

KL_j(\theta || \hat{\theta}_k) = P_j(\hat{\theta}_k) \,\log \left( \frac{P_j(\hat{\theta}_k)}{P_j(\theta)}\right) + [1-P_j(\hat{\theta}_k)] \,\log \left( \frac{1-P_j(\hat{\theta}_k)}{1-P_j(\theta)}\right).
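The dichotomous case can be sketched in base R as follows, assuming the 4PL response function with parameters in the order used for itemBank (discrimination, difficulty, pseudo-guessing, inattention). The helpers p_4pl and kl_dicho and the item parameter values are hypothetical; this is an illustration of the formula, not the code used internally by KL.

```r
# 4PL response probability: a = discrimination, b = difficulty,
# c = pseudo-guessing (lower asymptote), d = inattention (upper asymptote).
p_4pl <- function(theta, a, b, c, d, D = 1) {
  c + (d - c) / (1 + exp(-D * a * (theta - b)))
}

# Hypothetical helper (not part of catR): dichotomous KL information
# of one item at ability 'theta' relative to the estimate 'theta_hat'.
kl_dicho <- function(theta, theta_hat, a, b, c, d, D = 1) {
  p_hat <- p_4pl(theta_hat, a, b, c, d, D)
  p     <- p_4pl(theta, a, b, c, d, D)
  p_hat * log(p_hat / p) + (1 - p_hat) * log((1 - p_hat) / (1 - p))
}

# Made-up item parameters, for illustration only
kl_at_1   <- kl_dicho(theta = 1, theta_hat = 0, a = 1.2, b = 0.3, c = 0.2, d = 1)
kl_at_hat <- kl_dicho(theta = 0, theta_hat = 0, a = 1.2, b = 0.3, c = 0.2, d = 1)
```

Here kl_at_hat is zero because both probabilities coincide when \theta equals the provisional estimate, while kl_at_1 is positive.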

The quantity that is returned by this KL function is either: the likelihood function weighted by Kullback-Leibler information (the KL value):

KL_j(\hat{\theta}_k) = \int KL_j(\theta || \hat{\theta}_k) \, L(\theta | x_1, ..., x_k) \,d\theta

or the posterior function weighted by Kullback-Leibler information (the KLP value):

KLP_j(\hat{\theta}_k) = \int KL_j(\theta || \hat{\theta}_k) \, \pi(\theta) \,L(\theta | x_1, ..., x_k) \,d\theta

where \pi(\theta) is the prior distribution of the ability level.

These integrals are approximated by the integrate.catR function. The range of integration is set up by the arguments lower, upper and nqp, giving respectively the lower bound, the upper bound and the number of quadrature points. The default range goes from -4 to 4 with length 33 (that is, by steps of 0.25).
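The two integrals can be approximated on such a grid as sketched below. This is a base R illustration under stated assumptions: the trapezoid helper is a hypothetical stand-in for integrate.catR, a two-parameter logistic response function is used for brevity, and all item parameters and responses are made up.

```r
# Quadrature grid matching the defaults lower = -4, upper = 4, nqp = 33
X <- seq(from = -4, to = 4, length.out = 33)

# Hypothetical trapezoidal rule (a stand-in, not integrate.catR itself)
trapezoid <- function(x, y) sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

# Logistic response probability with c = 0 and d = 1 (made-up parameters)
p_item <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

# Likelihood of two administered items with responses 0 and 1
given <- data.frame(a = c(1.0, 1.5), b = c(-0.5, 0.5))
resp  <- c(0, 1)
lik <- sapply(X, function(th) {
  p <- p_item(th, given$a, given$b)
  prod(p^resp * (1 - p)^(1 - resp))
})

# KL information of a candidate item (a = 1.2, b = 0) around theta_hat = 0
theta_hat <- 0
p_hat <- p_item(theta_hat, 1.2, 0)
p     <- p_item(X, 1.2, 0)
kl    <- p_hat * log(p_hat / p) + (1 - p_hat) * log((1 - p_hat) / (1 - p))

kl_value  <- trapezoid(X, kl * lik)                   # likelihood-weighted (KL)
klp_value <- trapezoid(X, kl * dnorm(X, 0, 1) * lik)  # posterior-weighted (KLP)
```

With a standard normal prior the KLP value is smaller than the KL value, since the prior density is below 1 everywhere and the integrand is non-negative.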

To speed up the computation, both the grid of \theta values used for integration and the corresponding values of the likelihood function L(\theta) can be provided directly to the function through the arguments X and lik. If X is set to NULL (default), the sequence of ability values for integration is determined by the arguments lower, upper and nqp as explained above. If lik is NULL (default), the likelihood values are computed internally.

The provisional response pattern and the related item parameters are provided by the arguments x and it.given respectively. The target item (for which the KL information is computed) is given by its rank number in the item bank, through the item argument.

An ability level estimate is required to compute KL and KLP information values. Either the value is specified through the theta argument, or theta is left equal to NULL (default). In the latter case, the ability estimate is computed internally by maximum likelihood, using the thetaEst function with arguments it.given and x.

Note that the provisional response pattern x can also be set to NULL (which is useful when the number nrItems of starting items is set to zero). In this case, it.given must be a matrix with zero rows, e.g., itemBank[NULL,]. In this very specific configuration, the likelihood function L(\theta | x_1, ..., x_k) reduces to the constant value 1 on the whole \theta range (that is, a uniform likelihood).
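The constant likelihood follows from the fact that a product over zero items is an empty product. A small base R sketch, with a hypothetical likelihood helper lik_fun and a made-up two-column bank (discrimination and difficulty only), illustrates this:

```r
# Hypothetical likelihood helper (not part of catR): 'given' holds one row
# per administered item (columns a and b), 'resp' the 0/1 responses.
lik_fun <- function(th, resp, given) {
  p <- 1 / (1 + exp(-given[, 1] * (th - given[, 2])))
  prod(p^resp * (1 - p)^(1 - resp))
}

# Made-up bank; indexing its rows with NULL yields a zero-row matrix
bank  <- cbind(a = c(1.0, 1.5), b = c(-0.5, 0.5))
empty <- bank[NULL, ]

# With no items and no responses, the likelihood is the empty product
X   <- seq(from = -4, to = 4, length.out = 33)
lik <- sapply(X, lik_fun, resp = NULL, given = empty)
```

Every element of lik equals 1, so weighting by the likelihood leaves the KL integrand unchanged in this configuration.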

The argument type defines the type of KL information to be computed. The default value, "KL", computes the usual Kullback-Leibler information, while the posterior Kullback-Leibler value is obtained with type="KLP". For the latter, the priorDist and priorPar arguments fix the prior ability distribution. The normal distribution is set up by priorDist="norm", in which case priorPar contains the mean and the standard deviation of the normal distribution. If priorDist is "unif", then the uniform distribution is considered, and priorPar fixes the lower and upper bounds of that uniform distribution. By default, the standard normal prior distribution is assumed. These arguments are ignored whenever type is "KL".

Value

The required KL or KLP value for the selected item.

Author(s)

David Magis
Department of Psychology, University of Liege, Belgium
david.magis@uliege.be

Juan Ramon Barrada
Department of Psychology and Sociology, Universidad Zaragoza, Spain
barrada@unizar.es

References

Barrada, J. R., Olea, J., Ponsoda, V., and Abad, F. J. (2010). A method for the comparison of item selection rules in computerized adaptive testing. Applied Psychological Measurement, 34, 438-452. doi: 10.1177/0146621610370152

Barton, M.A., and Lord, F.M. (1981). An upper asymptote for the three-parameter logistic item-response model. Research Bulletin 81-20. Princeton, NJ: Educational Testing Service.

Chang, H.-H., and Ying, Z. (1996). A global information approach to computerized adaptive testing. Applied Psychological Measurement, 20, 213-229. doi: 10.1177/014662169602000303

Haley, D.C. (1952). Estimation of the dosage mortality relationship when the dose is subject to error. Technical report no 15. Palo Alto, CA: Applied Mathematics and Statistics Laboratory, Stanford University.

Magis, D. and Barrada, J. R. (2017). Computerized Adaptive Testing with R: Recent Updates of the Package catR. Journal of Statistical Software, Code Snippets, 76(1), 1-18. doi: 10.18637/jss.v076.c01

Magis, D., and Raiche, G. (2012). Random Generation of Response Patterns under Computerized Adaptive Testing with the R Package catR. Journal of Statistical Software, 48 (8), 1-31. doi: 10.18637/jss.v048.i08

Examples


## Dichotomous models ##

 # Loading the 'tcals' parameters 
 data(tcals)

 # Selecting item parameters only
 bank <- as.matrix(tcals[,1:4])
 
 # Selection of two arbitrary items (15 and 20) of the
 # 'tcals' data set
 it.given <- bank[c(15, 20),]

 # Creation of a response pattern
 x <- c(0, 1)

 # KL for item 1, ML estimate of ability computed
 KL(bank, 1, x, it.given)

 # Current (ML) ability estimate 
 theta <- thetaEst(it.given, x, method = "ML")
 KL(bank, 1, x, it.given, theta = theta)

 # WL ability estimate instead
 theta <- thetaEst(it.given, x, method = "WL")
 KL(bank, 1, x, it.given, theta = theta)

 # KLP for item 1
 KL(bank, 1, x, it.given, theta = theta, type = "KLP")

 # KLP for item 1, different integration range
 KL(bank, 1, x, it.given, theta = theta, type = "KLP", lower = -2, upper = 2, nqp = 20)

 # KLP for item 1, uniform prior distribution on the range [-2,2]
 KL(bank, 1, x, it.given, theta = theta, type = "KLP", priorDist = "unif", 
    priorPar = c(-2, 2))

 # Computation of likelihood function beforehand
 L <- function(th, r, param) 
  prod(Pi(th, param)$Pi^r * (1 - Pi(th,param)$Pi)^(1 - r))
 xx <- seq(from = -4, to = 4, length = 33)
 y <- sapply(xx, L, x, it.given) 
 KL(bank, 1, x, it.given, theta = theta, X = xx, lik = y)


## Polytomous models ##

 # Generation of an item bank under GRM with 100 items and at most 4 categories
 m.GRM <- genPolyMatrix(100, 4, "GRM")
 m.GRM <- as.matrix(m.GRM)

 # Selection of two arbitrary items (15 and 20) 
 it.given <- m.GRM[c(15, 20),]

 # Generation of a response pattern (true ability level 0)
 x <- genPattern(0, it.given, model = "GRM")

 # KL for item 1, ML estimate of ability computed
 KL(m.GRM, 1, x, it.given, model = "GRM")

 # Current (ML) ability estimate 
 theta <- thetaEst(it.given, x, method = "ML", model = "GRM")
 KL(m.GRM, 1, x, it.given, theta = theta, model = "GRM")

 # WL ability estimate instead
 theta <- thetaEst(it.given, x, method = "WL", model = "GRM")
 KL(m.GRM, 1, x, it.given, theta = theta, model = "GRM")

 # KLP for item 1
 KL(m.GRM, 1, x, it.given, theta = theta, model = "GRM", type = "KLP")

 # KLP for item 1, different integration range
 KL(m.GRM, 1, x, it.given, theta = theta, model = "GRM", type = "KLP", lower = -2, 
    upper = 2, nqp = 20)

 # KLP for item 1, uniform prior distribution on the range [-2,2]
 KL(m.GRM, 1, x, it.given, theta = theta, model = "GRM", type = "KLP", 
    priorDist = "unif", priorPar = c(-2, 2))


 # Loading the cat_pav data
 data(cat_pav)
 cat_pav <- as.matrix(cat_pav)

 # Selection of two arbitrary items (15 and 20) 
 it.given <- cat_pav[c(15, 20),]

 # Generation of a response pattern (true ability level 0)
 x <- genPattern(0, it.given, model = "GPCM")

 # KL for item 1, ML estimate of ability computed
 KL(cat_pav, 1, x, it.given, model = "GPCM")

 # Current (ML) ability estimate 
 theta <- thetaEst(it.given, x, method = "ML", model = "GPCM")
 KL(cat_pav, 1, x, it.given, theta = theta, model = "GPCM")

 # WL ability estimate instead
 theta <- thetaEst(it.given, x, method = "WL", model = "GPCM")
 KL(cat_pav, 1, x, it.given, theta = theta, model = "GPCM")

 # KLP for item 1
 KL(cat_pav, 1, x, it.given, theta = theta, model = "GPCM", type = "KLP")

 # KLP for item 1, different integration range
 KL(cat_pav, 1, x, it.given, theta = theta, model = "GPCM", type = "KLP", lower = -2, 
    upper = 2, nqp = 20)

 # KLP for item 1, uniform prior distribution on the range [-2,2]
 KL(cat_pav, 1, x, it.given, theta = theta, model = "GPCM", type = "KLP", 
    priorDist = "unif", priorPar = c(-2, 2))

[Package catR version 3.17 Index]