semTheta {catR}R Documentation

Standard error of ability estimation (dichotomous and polytomous models)

Description

This command returns the estimated standard error of the ability estimate, for a given response pattern and a given matrix of item parameters, either under the 4PL model or any suitable polytomous IRT model. Exact standard errors are available for dichotomous models.

Usage

semTheta(thEst, it, x = NULL, model = NULL, D = 1, method = "BM", 
  	priorDist = "norm", priorPar = c(0, 1), weight = "Huber", tuCo = 1, 
  	sem.type = "classic", parInt = c(-4, 4, 33), constantPatt = NULL,
  	sem.exact = FALSE, trueTh = NULL, range = c(-4, 4))
 

Arguments

thEst

numeric: the ability estimate.

it

numeric: a suitable matrix of item parameters. See Details.

x

numeric: a vector of item responses (default is NULL). Can also hold missing data (coded as NAs). Ignored if method is not "EAP" but must be provided in presence of missing responses. See Details.

model

either NULL (default) for dichotomous models, or any suitable acronym for polytomous models. Possible values are "GRM", "MGRM", "PCM", "GPCM", "RSM" and "NRM". See Details.

D

numeric: the metric constant. Default is D=1 (for logistic metric); D=1.702 yields approximately the normal metric (Haley, 1952).

method

character: the ability estimator. Possible values are "BM" (default), "ML", "WL", "EAP" and "ROB". See Details.

priorDist

character: specifies the prior distribution. Possible values are "norm" (default), "unif" and "Jeffreys". Ignored if method is neither "BM" nor "EAP". See Details.

priorPar

numeric: vector of two components specifying the prior parameters (default is c(0,1)) of the prior ability distribution. Ignored if method is neither "BM" nor "EAP", or if priorDist="Jeffreys". See Details.

weight

character: the type of weight function for the robust estimator. Possible values are "Huber" (default) and "Tukey". Ignored if method is not "ROB" or if model is not NULL. See Details.

tuCo

numeric: the value of the tuning constant for the weight function (default is 1, suitable with "Huber" weight). Ignored if method is not "ROB" or if model is not NULL. See Details.

sem.type

character: the type of ASE formula to be used, either "classic" (default) or "new". Ignored if method is neither "BM" nor "WL", or if model is not NULL. See Details.

parInt

numeric: vector of three components, holding respectively the values of the arguments lower, upper and nqp of the eapEst command. Default vector is (-4, 4, 33). Ignored if method is not "EAP".

constantPatt

character: the method to estimate ability in case of constant pattern (i.e. only correct or only incorrect responses). Can be eitehr NULL (default), "BM", "EAP", "WL", "fixed4", "fixed7" or "var". Currently only implemented for dichotomous IRT models. See Details.

sem.exact

logical: should exact standard error be computed instead of asymptotic standard error? (default is FALSE). Ignored if model is not NULL. See Details.

trueTh

(For simulation study purposes only) either NULL (default) or the true ability level of interest. Ignored if sem.exact is FALSE or if model is not NULL. See Details.

range

numeric: vector of two components specifying the range wherein the ability estimate must be looked for (default is c(-4,4)). Used only to compute exact standard errors. Ignored if sem.exact is FALSE or if model is not NULL.

Details

Dichotomous IRT models are considered whenever model is set to NULL (default value). In this case, it must be a matrix with one row per item and four columns, with the values of the discrimination, the difficulty, the pseudo-guessing and the inattention parameters (in this order). These are the parameters of the four-parameter logistic (4PL) model (Barton and Lord, 1981).

Polytomous IRT models are specified by their respective acronym: "GRM" for Graded Response Model, "MGRM" for Modified Graded Response Model, "PCM" for Partical Credit Model, "GPCM" for Generalized Partial Credit Model, "RSM" for Rating Scale Model and "NRM" for Nominal Response Model. The it still holds one row per item, end the number of columns and their content depends on the model. See genPolyMatrix for further information and illustrative examples of suitable polytomous item banks.

The vector of response patterns x can also hold missing responses (more useful in linear testing, not in CAT). In this case the missing responses must be coded as NA values. They are discarded from the ability estimation process. Note that in presence of missing responses, the pattern x must be provided indepedently of the estimation method (to discard items with missing responses from the computation).

Five ability estimators are available: the maximum likelihood (ML) estimator (Lord, 1980), the Bayes modal (BM) estimator (Birnbaum, 1969), the expected a posteriori (EAP) estimator (Bock and Mislevy, 1982), the weighted likelihood (WL) estimator (Warm, 1989) and the robust estimator (Schuster & Yuan, 2011). The selected estimator is specified by the method argument, with values "ML", "BM", "EAP", "WL" and "ROB" respectively.

For the BM and EAP estimators, three prior distributions are available: the normal distribution, the uniform distribution and the Jeffreys' prior distribution (Jeffreys, 1939, 1946). The prior distribution is specified by the argument priorPar, with values "norm", "unif" and "Jeffreys", respectively. The priorPar argument is ignored if method="ML" or method="WL".

The argument priorPar determines either: the prior mean and standard deviation of the normal prior distribution (if priorDist="norm"), or the range for defining the prior uniform distribution (if priorDist="unif"). This argument is ignored if priorDist="Jeffreys".

The eapPar argument sets the range and the number of quadrature points for numerical integration in the EAP process. By default, it takes the vector value (-4, 4, 33), that is, 33 quadrature points on the range [-4; 4] (or, by steps of 0.25). See eapEst for further details.

Robust estimation requires an appropriate weight function that depends on an accurate tuning constant. Suggested functions are the Huber weight (Schuester and Yuan, 2011) and the Tukey weight (Mosteller and Tukey, 1977). Both can be set by the weight argument, with respective values "Huber" and "Tukey". Default function is Huber. Moreover, the tuCo argument specifies the tuning constant for the weight function. Default value is 1 and suggested for Huber weight (also by default), and value 4 is suggested for Tukey weight (Schuester and Yuan, 2011).

New ASE formulas, proposed by Magis (2016), are now available. They can be supplied by the sem.type argument, which takes the default value "classic" (so usual ASEs are computed) or "new" (for the newly suggested formulas). Note that new ASEs are available only for BM and WL estimators, as for other estimators the classic and new versions are identical. Note also that these new ASE formulas are available for dichotomous IRT models only.

Note that in the current version, the ability estimate must be specified through the thEst argument. Moreover, the response pattern must be specified through the x argument to compute the standard error of the EAP estimate. For the other estimation methods, this is not necessary, and x is set to NULL by default for this purpose.

Note also that if specific stepsize adjustment was required for constant patterns with the constantPatt argument (that is, if it takes value "fixed4", "fixed7" or "var") then an infinite value Inf is being returned.

Finally, exact standard errors can be computed with dichotomous IRT models only (Magis, 2014). This is requested by setting argument sem.exact to TRUE (default is FALSE so regular, asymptotic formula is considered). The exact standard error is computed using the full ability distribution provided by the link{fullDist} function. Two additional arguments can be set up: range to define the range of estimated ability levels during the derivation of the full distribution, and trueTh that can be used to specify the true ability level in addition to the estimated one (through thEst argument). The latter argument is for simulation study purposes, as the set of all observable ability estimates in fullDist must then be computed only once. If trueTh is provided some real value, then two SEs are returned: the exact SE obtained with the estimated thEst level, and the true SE derived as the exact SE with true trueTh level.

Important: The computation of the exact standard error is very efficient under the Rasch (1PL) model due to the ue of the Lord-Wingersky algorithm (Lord and Wingersky, 1984). Therefore, any test length can be considered for computing the exact SE under this model. But for other dichotomous IRT models, the computatinal effort becomes very demanding even with a limited set of n items, since 2^n patterns must be generated and ability estimated with each such pattern. For this reason, it is recommended not to consider exact SE with models other than the Rasch (1PL) with more than 10 items.

Value

The estimated standard error of the ability level (or Inf if the response pattern is constant and constantPatt is not NULL).

If exact standard error is computed (sem.exact = TRUE) and trueTh argument takes some numeric value, then output is a vector of two components with the two exact standard errors, the first SE corresponding to the thEst value and the second SE to the trueTh value.

Note

The classic asymptotic standard error of the WL estimator is computed with the same formula as that of the ML estimator (up to the plug-in of the WL estimate instead of the ML estimate). Note however that versions of catR prior to 3.0 hold a different formula mentioned in Magis and raiche (2012), but it appeared that this formula can lead to negative values of the square of the standard error. So the usual suggestion by Warm (1989) of using the same asymptotic formulas for ML and WL is the currently used formula for classic asymptotic computation of the standard error.

Author(s)

David Magis
Department of Psychology, University of Liege, Belgium
david.magis@uliege.be

Lianne Ippel
Department of Psychology, University of Liege, Belgium
g.j.e.ippel@gmail.com

References

Barton, M.A., and Lord, F.M. (1981). An upper asymptote for the three-parameter logistic item-response model. Research Bulletin 81-20. Princeton, NJ: Educational Testing Service.

Birnbaum, A. (1969). Statistical theory for logistic mental test models with a prior distribution of ability. Journal of Mathematical Psychology, 6, 258-276. doi: 10.1016/0022-2496(69)90005-4

Bock, R. D., and Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431-444. doi: 10.1177/014662168200600405

Dodd, B. G., De Ayala, R. J., and Koch, W. R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19, 5-22. doi: 10.1177/014662169501900103

Haley, D.C. (1952). Estimation of the dosage mortality relationship when the dose is subject to error. Technical report no 15. Palo Alto, CA: Applied Mathematics and Statistics Laboratory, Stanford University.

Jeffreys, H. (1939). Theory of probability. Oxford, UK: Oxford University Press.

Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, 186, 453-461.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.

Lord, F. M., and Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed-score equatings. Applied Psychological Measurement, 8, 453-461. doi: 10.1177/014662168400800409

Magis, D. (2014). Accuracy of asymptotic standard errors of the maximum and weighted likelihood estimators of proficiency levels with short tests. Applied Psychological Measurement, 38, 105-121. doi: 10.1177/0146621613496890

Magis, D. (2016). Efficient standard errors formulas of ability estimators with dichotomous item response models. Psychometrika, 81, 184-200. doi: 10.1007/s11336-015-9443-3

Magis, D. and Barrada, J. R. (2017). Computerized Adaptive Testing with R: Recent Updates of the Package catR. Journal of Statistical Software, Code Snippets, 76(1), 1-18. doi: 10.18637/jss.v076.c01

Magis, D., and Raiche, G. (2012). Random Generation of Response Patterns under Computerized Adaptive Testing with the R Package catR. Journal of Statistical Software, 48 (8), 1-31. doi: 10.18637/jss.v048.i08

Mosteller, F., and Tukey, J. (1977). Exploratory data analysis and regression. Reading, MA: Addison-Wesley.

Schuester, C., and Yuan, K.-H. (2011). Robust estimation of latent ability in item response models. Journal of Educational and Behavioral Statistics, 36, 720)735.doi: 10.3102/1076998610396890

Warm, T.A. (1989). Weighted likelihood estimation of ability in item response models. Psychometrika, 54, 427-450. doi: 10.1007/BF02294627

See Also

eapSem, thetaEst, genPolyMatrix, fullDist

Examples


## Dichotomous models ##

 # Loading the 'tcals' parameters 
 data(tcals)
 
 # Selecting item parameters only
 tcals <- as.matrix(tcals[,1:4])

 # Creation of a response pattern (tcals item parameters, true ability level 0)
 x <- genPattern(0, tcals, seed = 1)

 # ML estimation
 th <- thetaEst(tcals, x, method = "ML")
 c(th, semTheta(th, tcals, method = "ML"))

 # ML estimation, new ASE formula (=> yields the same result)
 c(th, semTheta(th, tcals, method = "ML", sem.type = "new"))

 # With first two responses missing
 x.mis <- x
 x.mis[1:2] <- NA
 th <- thetaEst(tcals, x.mis, method = "ML")
 c(th, semTheta(th, tcals, x.mis, method = "ML"))

 # BM estimation, standard normal prior distribution
 th <- thetaEst(tcals, x)
 c(th, semTheta(th, tcals))

 # BM estimation and new ASE formula
 c(th, semTheta(th, tcals, sem.type = "new"))

 # BM estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(tcals, x, method = "BM", priorDist = "unif",
                priorPar = c(-2, 2))
 c(th, semTheta(th, tcals, method = "BM", priorDist = "unif",
		    priorPar = c(-2, 2)))

 # BM estimation, Jeffreys' prior distribution  
 th <- thetaEst(tcals, x, method = "BM", priorDist = "Jeffreys")
 c(th, semTheta(th, tcals, method = "BM", priorDist = "Jeffreys"))

 # EAP estimation, standard normal prior distribution
 th <- thetaEst(tcals, x, method = "EAP")
 c(th, semTheta(th, tcals, x, method = "EAP"))

## Not run: 

 # EAP estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(tcals, x, method = "EAP", priorDist = "unif",
                priorPar = c(-2, 2))
 c(th, semTheta(th, tcals, x, method = "EAP", priorDist = "unif",
		    priorPar = c(-2, 2)))

 # EAP estimation, Jeffreys' prior distribution  
 th <- thetaEst(tcals, x, method = "EAP", priorDist = "Jeffreys")
 c(th, semTheta(th, tcals, x, method = "EAP", priorDist = "Jeffreys"))

 # WL estimation
 th <- thetaEst(tcals, x, method = "WL")
 c(th, semTheta(th, tcals, method = "WL"))

 # WL estimation, new ASE formula
 c(th, semTheta(th, tcals, method = "WL", sem.type = "new"))

 # 'fixed4' adjustment for constant pattern
 th <- thetaEst(tcals, rep(0, nrow(tcals)), constantPatt = "fixed4")
 c(th, semTheta(th, tcals, constantPatt = "fixed4"))

 # Robust estimation
 th <- thetaEst(tcals, x, method = "ROB")
 c(th, semTheta(th, tcals, method = "ROB"))

 # Robust estimation, Huber weight and tuning constant 2
 th <- thetaEst(tcals, x, method = "ROB", tuCo = 2)
 c(th, semTheta(th, tcals, method = "ROB", tuCo = 2))

 # Robust estimation, Tukey weight and tuning constant 4
 th <- thetaEst(tcals, x, method = "ROB", weight = "Tukey", tuCo = 4)
 c(th, semTheta(th, tcals, method = "ROB", weight = "Tukey", tuCo = 4))

 ## Exact SE computation under 1PL model: 
 # Creation of a 1PL item bank with difficulties from 'tcals' (85 items)
 tcals2 <- cbind(1, tcals[, 2],  0, 1)

 # Pattern generation for true ability level 1
 x2 <- genPattern(1, tcals2, seed = 1)

 # ML estimation
 th2 <- thetaEst(tcals2, x2, method = "ML")
 c(th2, semTheta(th2, tcals2, x2, method = "ML", sem.exact = TRUE))

 # ML estimation, true SE in addition
 c(th2, semTheta(th2, tcals2, x2, method = "ML", sem.exact = TRUE, trueTh = 1))

 ## Exact SE computation under 2PL model: 
 # Creation of a 2PL item bank with ten items
 it <- genDichoMatrix(10, model = "2PL", seed = 1)

 # Pattern generation for true ability level 1
 x3 <- genPattern(1, it, seed = 1)

 # ML estimation
 th3 <- thetaEst(it, x3, method = "ML")
 c(th3, semTheta(th3, it, x3, method = "ML", sem.exact = TRUE))

 # ML estimation, true SE in addition
 c(th3, semTheta(th3, it, x3, method = "ML", sem.exact = TRUE, trueTh = 1))

## End(Not run)

## Polytomous models ##

 # Generation of an item bank under GRM with 100 items and at most 4 categories
 m.GRM <- genPolyMatrix(100, 4, "GRM")
 m.GRM <- as.matrix(m.GRM)

 # Creation of a response pattern (true ability level 0)
 set.seed(1)
 x <- genPattern(0, m.GRM, model = "GRM")

# ML estimation
 th <- thetaEst(m.GRM, x, model = "GRM", method = "ML")
 c(th, semTheta(th, m.GRM, model = "GRM", method = "ML"))

 # BM estimation, standard normal prior distribution
 th <- thetaEst(m.GRM, x, model = "GRM")
 c(th, semTheta(th, m.GRM, model = "GRM"))

 # BM estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(m.GRM, x, model = "GRM", method = "BM", priorDist = "unif", 
    priorPar = c(-2, 2))
 c(th, semTheta(th, m.GRM, model = "GRM", method = "BM", priorDist = "unif", 
  priorPar = c(-2, 2)))

 # BM estimation, Jeffreys' prior distribution  
 th <- thetaEst(m.GRM, x, model = "GRM", method = "BM", priorDist = "Jeffreys")
 c(th, semTheta(th, m.GRM, model = "GRM", method = "BM", priorDist = "Jeffreys"))

 # EAP estimation, standard normal prior distribution
 th <- thetaEst(m.GRM, x, model = "GRM", method = "EAP")
 c(th, semTheta(th, m.GRM, x, model = "GRM", method = "EAP") )

## Not run: 

 # EAP estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(m.GRM, x, model = "GRM", method = "EAP", priorDist = "unif", 
    priorPar = c(-2, 2))
 c(th, semTheta(th, m.GRM, x, model = "GRM", method = "EAP", priorDist = "unif", 
  priorPar = c(-2, 2)))

 # EAP estimation, Jeffreys' prior distribution  
 th <- thetaEst(m.GRM, x, model = "GRM", method = "EAP", priorDist = "Jeffreys")
 c(th, semTheta(th, m.GRM, x, model = "GRM", method = "EAP", priorDist = "Jeffreys"))

 # WL estimation
 th <- thetaEst(m.GRM, x, model = "GRM", method = "WL")
 c(th, semTheta(th, m.GRM, model = "GRM", method = "WL"))


 # Loading the cat_pav data
 data(cat_pav)
 cat_pav <- as.matrix(cat_pav)

 # Creation of a response pattern (true ability level 0)
 set.seed(1)
 x <- genPattern(0, cat_pav, model = "GPCM")

# ML estimation
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "ML")
 c(th, semTheta(th, cat_pav, model = "GPCM", method = "ML"))

 # BM estimation, standard normal prior distribution
 th <- thetaEst(cat_pav, x, model = "GPCM")
 c(th, semTheta(th, cat_pav, model = "GPCM"))

 # BM estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "BM", priorDist = "unif", 
    priorPar = c(-2, 2))
 c(th, semTheta(th, cat_pav, model = "GPCM", method = "BM", priorDist = "unif", 
  priorPar = c(-2, 2)))

 # BM estimation, Jeffreys' prior distribution  
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "BM", priorDist = "Jeffreys")
 c(th, semTheta(th, cat_pav, model = "GPCM", method = "BM", priorDist = "Jeffreys"))

 # EAP estimation, standard normal prior distribution
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "EAP")
 c(th, semTheta(th, cat_pav, x, model = "GPCM", method = "EAP"))

 # EAP estimation, uniform prior distribution upon range [-2,2]
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "EAP", priorDist = "unif", 
    priorPar = c(-2, 2))
 c(th, semTheta(th, cat_pav, x, model = "GPCM", method = "EAP", priorDist = "unif", 
  priorPar = c(-2, 2)))

 # EAP estimation, Jeffreys' prior distribution  
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "EAP", priorDist = "Jeffreys")
 c(th, semTheta(th, cat_pav, x, model = "GPCM", method = "EAP", priorDist = "Jeffreys"))

 # WL estimation
 th <- thetaEst(cat_pav, x, model = "GPCM", method = "WL")
 c(th, semTheta(th, cat_pav, model = "GPCM", method = "WL"))

## End(Not run)
 

[Package catR version 3.17 Index]