cdcat {cdcatR}    R Documentation

Cognitively based computerized adaptive test application

Description

cdcat conducts a CD-CAT application for a given dataset. Different item selection rules can be used: the general discrimination index (GDI; de la Torre & Chiu, 2016; Kaplan et al., 2015), the Jensen-Shannon divergence index (JSD; Kang et al., 2017; Minchen & de la Torre, 2016; Yigit et al., 2018), the posterior-weighted Kullback-Leibler index (PWKL; Cheng, 2009), the modified PWKL index (MPWKL; Kaplan et al., 2015), the nonparametric item selection method (NPS; Chang et al., 2019), the general nonparametric item selection method (GNPS; Chiu & Chang, 2021), or random selection. Either fixed-length or fixed-precision CD-CAT can be applied. Fixed-precision CD-CAT with NPS and GNPS is available by using the pseudo-posterior probability of each student mastering each attribute (experimental).

Usage

cdcat(
  fit = NULL,
  dat = NULL,
  itemSelect = "GDI",
  MAXJ = 20,
  FIXED.LENGTH = TRUE,
  startRule = "random",
  startK = FALSE,
  att.prior = NULL,
  initial.distr = NULL,
  precision.cut = 0.8,
  NP.args = list(Q = NULL, gate = NULL, PPP = TRUE, w = 2),
  itemExposurecontrol = NULL,
  b = 0,
  maxr = 1,
  itemConstraint = NULL,
  constraint.args = list(ATTRIBUTEc = NULL),
  n.cores = 2,
  seed = NULL,
  print.progress = TRUE
)

Arguments

fit

An object of class GDINA, gdina (parametric CD-CAT), or GNPC (non-parametric CD-CAT based on GNPS). Item bank calibrated with the GDINA::GDINA (Ma & de la Torre, 2020), CDM::gdina (Robitzsch et al., 2020), or cdmTools::GNPC (Nájera et al., 2022) functions

dat

Numeric matrix of dimensions N (number of examinees) x J (number of items). Dataset to be analyzed. If is.null(dat), the data are taken from the fit object (i.e., the calibration sample is used)

itemSelect

Scalar character. Item selection rule: GDI, JSD, MPWKL, PWKL, NPS, GNPS, or random

MAXJ

Scalar numeric. Maximum number of items to be applied regardless of the FIXED.LENGTH argument. Default is 20

FIXED.LENGTH

Scalar logical. Fixed CAT-length (TRUE) or fixed-precision (FALSE) application. Default is TRUE

startRule

Scalar character. Starting rule: with "random" the first item is selected at random; with "max" the first item is selected using itemSelect. Default is "random". The seed for the random start is set with the seed argument

startK

Scalar logical. Start the CAT with an identity matrix (TRUE) or proceed with startRule from the first item (FALSE). Default is FALSE

att.prior

Numeric vector of length 2^K, where K is the number of attributes. Prior distribution for MAP/EAP estimates. Default is uniform
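As an illustration of the expected shape, the following base R sketch builds a uniform and a non-uniform prior over the 2^K latent classes. The weights are made up for the example, and the order of the entries is assumed to have to match the latent-class ordering of the fitted model; that ordering is not stated here, so check it before supplying informative values.

```r
K <- 3                          # number of attributes (illustrative value)
n.classes <- 2^K                # number of latent classes

# Uniform prior: every attribute pattern is equally likely
prior.unif <- rep(1 / n.classes, n.classes)

# Non-uniform prior built from arbitrary positive weights (made up),
# rescaled so that the probabilities sum to 1
wts <- c(4, 2, 2, 2, 1, 1, 1, 1)
prior.inf <- wts / sum(wts)

length(prior.inf) == n.classes  # one probability per latent class
sum(prior.inf)                  # probabilities sum to 1
```

Either vector could then be passed as att.prior, provided its length equals 2^K for the Q-matrix in use.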

initial.distr

Numeric vector of length 2^K, where K is the number of attributes. Weighting distribution to initialize itemSelect at item position 1. Default is uniform

precision.cut

Scalar numeric. Cutoff for fixed-precision (assigned pattern posterior probability > precision.cut; Hsu, Wang, & Chen, 2013). When itemSelect = "NPS" this is evaluated at the attribute level using the pseudo-posterior probabilities for each attribute (K assigned attribute pseudo-posterior probability > precision.cut). Default is .80. A higher cutoff is recommended when itemSelect = "NPS"
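The two stopping checks described above can be sketched in base R as follows. The helper names stop_pattern and stop_attribute are hypothetical and are not part of cdcatR; the attribute-level check additionally assumes that the "assigned" state of an attribute is the more likely of mastery and non-mastery.

```r
# Hypothetical helpers for illustration only (not cdcatR functions)

# Pattern-level rule: stop when the most likely attribute pattern
# has posterior probability above the cutoff
stop_pattern <- function(post, precision.cut = 0.8) {
  max(post) > precision.cut
}

# Attribute-level rule: stop when every attribute is classified with
# (pseudo-)posterior probability above the cutoff.
# Assumption: the assigned state is the more likely one, so its
# probability is max(p, 1 - p) for each attribute
stop_attribute <- function(ppp, precision.cut = 0.8) {
  all(pmax(ppp, 1 - ppp) > precision.cut)
}

post <- c(0.02, 0.03, 0.85, 0.10)          # posterior over 4 patterns (toy values)
stop_pattern(post)                         # TRUE: 0.85 > 0.80

ppp <- c(0.95, 0.10, 0.70)                 # mastery probabilities for K = 3 (toy values)
stop_attribute(ppp, precision.cut = 0.8)   # FALSE: third attribute only reaches 0.70
```

Under this reading, the attribute-level rule requires all K attributes to pass the cutoff at once, so the same precision.cut value does not mean the same thing under the two rules.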

NP.args

A list of options when itemSelect = "NPS" or "GNPS". Q = Q-matrix to be used in the analysis. gate = "AND" or "OR", depending on whether a conjunctive or disjunctive nonparametric CDM is used. PPP = pseudo-posterior probability of each examinee mastering each attribute (experimental). w = weight type used for computing the pseudo-posterior probability (experimental)

itemExposurecontrol

Scalar character. Item exposure control: NULL (none) or "progressive" for the progressive method (Barrada, Olea, Ponsoda, & Abad, 2008). Default is NULL. The seed for the random component is set with the seed argument

b

Scalar numeric. Acceleration parameter for the item exposure method. Only applies if itemExposurecontrol = "progressive". In the progressive method the first item is selected at random and the last item (i.e., MAXJ) is selected purely based on itemSelect. The remaining items are selected by combining a random component and an information component. The random component loses importance linearly with b = 0, inverse exponentially with b < 0, or exponentially with b > 0. Thus, b can be tuned to favor accuracy (b < 0) or item security (b > 0). Default is 0
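To see how b shapes the trade-off, the sketch below computes the weight given to the information component at each item position. It assumes the weighting scheme of Barrada et al. (2008) as commonly stated; the helper name progressive_weight is hypothetical, and the exact computation inside cdcat may differ.

```r
# Hypothetical helper (not a cdcatR function).
# Returns, for positions 1..MAXJ, the weight of the information component;
# the random component receives 1 minus this weight.
# Assumed form: W_1 = 0 and, for q > 1,
#   W_q = sum_{f=2}^{q} (f - 1)^b / sum_{f=2}^{MAXJ} (f - 1)^b
progressive_weight <- function(MAXJ, b = 0) {
  steps <- seq_len(MAXJ - 1)             # (f - 1) for f = 2, ..., MAXJ
  c(0, cumsum(steps^b) / sum(steps^b))   # position 1 is purely random
}

round(progressive_weight(5, b = 0), 3)   # linear growth: 0, 0.25, 0.5, 0.75, 1
round(progressive_weight(5, b = -2), 3)  # information dominates early (accuracy)
round(progressive_weight(5, b = 2), 3)   # randomness persists longer (security)
```

In every case the weight is 0 at the first position and 1 at position MAXJ, matching the description above: a random first item and a last item chosen purely by itemSelect.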

maxr

Scalar numeric. Value should be in the range 0-1. Maximum item exposure rate that is tolerated. Default is 1. Note that for maxr < 1 parallel computing cannot be implemented
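As a reminder of what an exposure rate is, the sketch below computes it as the proportion of examinees who received each item. The item.usage list here is toy data laid out as one vector of administered item indices per examinee; this layout is an assumption for the example and may not match the structure stored in a cdcat object.

```r
# Toy data: items administered to each of 4 examinees, from a bank of J = 5 items
item.usage <- list(c(1, 4, 2), c(1, 3, 5), c(1, 4, 5), c(2, 4, 1))
J <- 5

# Number of examinees who saw each item, then the proportion
counts <- tabulate(unlist(item.usage), nbins = J)
exposure <- counts / length(item.usage)
exposure                 # 1.00 0.50 0.25 0.75 0.50

# Items that would exceed a tolerated maximum rate of 0.8
which(exposure > 0.8)    # 1
```

Because these rates depend on which items earlier examinees received, a cap with maxr < 1 presumably requires examinees to be processed in sequence, which would be consistent with the note that parallel computing is unavailable in that case.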

itemConstraint

Scalar character. Constraints that must be satisfied by the set of items applied: NULL or attribute constraint (Henson & Douglas, 2005) with "attribute". If "attribute" is chosen, then each attribute must be measured at least a specific number of times indicated in the constraint.args$ATTRIBUTEc argument. Default is NULL

constraint.args

A list of options used when itemConstraint is not NULL. At the moment it only includes the argument ATTRIBUTEc, which must be a numeric vector of length ncol(Q) indicating the minimum number of items per attribute to be administered. Default is 3
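A minimal sketch of how such a constraint vector can be built and checked against a set of administered items, using a toy Q-matrix. The helper meets_constraint is hypothetical and only illustrates the rule; it is not how cdcat enforces the constraint internally.

```r
# Toy Q-matrix: 6 items (rows) measuring 3 attributes (columns)
Q <- matrix(c(1, 0, 0,
              0, 1, 0,
              0, 0, 1,
              1, 1, 0,
              0, 1, 1,
              1, 0, 1), ncol = 3, byrow = TRUE)

# Require each attribute to be measured at least twice
ATTRIBUTEc <- rep(2, ncol(Q))
constraint.args <- list(ATTRIBUTEc = ATTRIBUTEc)

# Hypothetical check: does a set of items measure every attribute
# at least the required number of times?
meets_constraint <- function(items, Q, ATTRIBUTEc) {
  all(colSums(Q[items, , drop = FALSE]) >= ATTRIBUTEc)
}

meets_constraint(c(1, 2, 3), Q, ATTRIBUTEc)  # FALSE: each attribute measured once
meets_constraint(c(4, 5, 6), Q, ATTRIBUTEc)  # TRUE: each attribute measured twice
```

The resulting constraint.args list would be passed together with itemConstraint = "attribute".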

n.cores

Scalar numeric. Number of cores to be used during parallelization. Default is 2

seed

Numeric vector of length 1. Some methods have a random component, so a seed is required for consistent results

print.progress

Scalar logical. Prints a progress bar to the console. Default is TRUE

Value

cdcat returns an object of class cdcat.

est

A list that contains for each examinee the mastery posterior probability estimates at each step of the CAT (est.cat) and the items applied (item.usage)

specifications

A list that contains all the specifications

References

Barrada, J. R., Olea, J., Ponsoda, V., & Abad, F. J. (2008). Incorporating randomness in the Fisher information for improving item-exposure control in CATs. British Journal of Mathematical and Statistical Psychology, 61, 493-513.

Chang, Y.-P., Chiu, C.-Y., & Tsai, R.-C. (2019). Nonparametric CAT for CD in educational settings with small samples. Applied Psychological Measurement, 43, 543-561.

Cheng, Y. (2009). When cognitive diagnosis meets computerized adaptive testing: CD-CAT. Psychometrika, 74, 619-632.

Chiu, C. Y., & Chang, Y. P. (2021). Advances in CD-CAT: The general nonparametric item selection method. Psychometrika, 86, 1039-1057.

de la Torre, J., & Chiu, C. Y. (2016). General method of empirical Q-matrix validation. Psychometrika, 81, 253-273.

George, A. C., Robitzsch, A., Kiefer, T., Gross, J., & Uenlue, A. (2016). The R Package CDM for cognitive diagnosis models. Journal of Statistical Software, 74, 1-24. doi:10.18637/jss.v074.i02

Henson, R., & Douglas, J. (2005). Test construction for cognitive diagnosis. Applied Psychological Measurement, 29, 262-277.

Hsu, C. L., Wang, W. C., & Chen, S. Y. (2013). Variable-length computerized adaptive testing based on cognitive diagnosis models. Applied Psychological Measurement, 37, 563-582.

Kang, H.-A., Zhang, S., & Chang, H.-H. (2017). Dual-objective item selection criteria in cognitive diagnostic computerized adaptive testing. Journal of Educational Measurement, 54, 165-183.

Kaplan, M., de la Torre, J., & Barrada, J. R. (2015). New item selection methods for cognitive diagnosis computerized adaptive testing. Applied Psychological Measurement, 39, 167-188.

Ma, W., & de la Torre, J. (2020). GDINA: The generalized DINA model framework. R package version 2.7.9. Retrieved from https://CRAN.R-project.org/package=GDINA

Minchen, N., & de la Torre, J. (2016, July). The continuous G-DINA model and the Jensen-Shannon divergence. Paper presented at the International Meeting of the Psychometric Society, Asheville, NC, United States.

Nájera, P., Sorrel, M. A., & Abad, F. J. (2022). cdmTools: Useful Tools for Cognitive Diagnosis Modeling. R package version 1.0.1. https://CRAN.R-project.org/package=cdmTools

Robitzsch, A., Kiefer, T., George, A. C., & Uenlue, A. (2020). CDM: Cognitive Diagnosis Modeling. R package version 7.5-15. https://CRAN.R-project.org/package=CDM

Yigit, H. D., Sorrel, M. A., & de la Torre, J. (2018). Computerized adaptive testing for cognitively based multiple-choice data. Applied Psychological Measurement, 43, 388-401.

Examples



######################################
# Example 1.                         #
# CD-CAT simulation for a GDINA obj  #
######################################

#-----------Data----------#
Q <- sim180GDINA$simQ
K <- ncol(Q)
dat <- sim180GDINA$simdat
att <- sim180GDINA$simalpha

#----------Model estimation----------#
fit <- GDINA::GDINA(dat = dat, Q = Q, verbose = 0) # GDINA package
#fit <- CDM::gdina(data = dat, q.matrix = Q, progress = 0) # CDM package

#---------------CD-CAT---------------#
res.FIXJ <- cdcat(fit = fit, dat = dat, FIXED.LENGTH = TRUE,
                 MAXJ = 20, n.cores = 2)
res.VARJ <- cdcat(fit = fit, dat = dat, FIXED.LENGTH = FALSE,
                 MAXJ = 20, precision.cut = .80, n.cores = 2)

#---------------Results--------------#
res.FIXJ$est[[1]] # estimates for the first examinee (fixed-length)
res.VARJ$est[[1]] # estimates for the first examinee (fixed-precision)
att.plot(cdcat.obj = res.FIXJ, i = 1) # plot for the first examinee (fixed-length)
att.plot(cdcat.obj = res.VARJ, i = 1) # plot for the first examinee (fixed-precision)
# FIXJ summary
res.FIXJ.sum.real <- cdcat.summary(cdcat.obj = res.FIXJ, alpha = att) # vs. real accuracy
res.FIXJ.sum.real$alpha.recovery$PCV.plot
res.FIXJ.sum.real$item.exposure$exp.plot
# VARJ summary
res.VARJ.sum.real <- cdcat.summary(cdcat.obj = res.VARJ, alpha = att)
res.VARJ.sum.real$alpha.recovery$PCV
res.VARJ.sum.real$item.exposure$stats
res.VARJ.sum.real$item.exposure$length.plot
res.VARJ.sum.real$item.exposure$exp.plot
# vs. maximum observable accuracy
att.J <- GDINA::personparm(fit, "MAP")[, -(K+1)] # GDINA package
# att.J <- t(sapply(strsplit(as.character(fit$pattern$map.est), ""), as.numeric)) # CDM package
class.J <- GDINA::ClassRate(att, att.J) # upper-limit for accuracy
res.FIXJ.sum.obse <- cdcat.summary(cdcat.obj = res.FIXJ, alpha = att.J)
res.FIXJ.sum.obse$alpha.recovery$PCV.plot + ggplot2::geom_hline(yintercept = class.J$PCV[K],
                                                        color = "firebrick3")
res.FIXJ.sum.obse$alpha.recovery$PCA.plot + ggplot2::geom_hline(yintercept = class.J$PCA,
                                                        color = "firebrick3")

######################################
# Example 2.                         #
# CD-CAT simulation for multiple     #
# GDINA objs and comparison of       #
# performance on a validation sample #
######################################

#----------------Data----------------#
Q <- sim180combination$simQ
K <- ncol(Q)
parm <- sim180combination$specifications$item.bank$simcatprob.parm
dat.c <- sim180combination$simdat[,,1]
att.c <- sim180combination$simalpha[,,1]
dat.v <- sim180combination$simdat[,,2]
att.v <- sim180combination$simalpha[,,2]

#-----(multiple) Model estimation----#
fitTRUE <- GDINA::GDINA(dat = dat.c, Q = Q, catprob.parm = parm,
           control = list(maxitr = 0), verbose = 0)

fitGDINA <- GDINA::GDINA(dat = dat.c, Q = Q, verbose = 0)
fitDINA <- GDINA::GDINA(dat = dat.c, Q = Q, model = "DINA", verbose = 0)
LR2step <- LR.2step(fitGDINA)
models <- LR2step$models.adj.pvalues
fitLR2 <- GDINA::GDINA(dat = dat.c, Q = Q, model = models, verbose = 0)

#---------------CD-CAT---------------#
fit.l <- list(fitTRUE, fitLR2, fitGDINA, fitDINA)
res.FIXJ.l <- lapply(fit.l, function(x)  cdcat(dat = dat.v,fit = x,
                                              FIXED.LENGTH = TRUE, n.cores = 2))
res.VARJ.l <- lapply(fit.l, function(x)  cdcat(dat = dat.v,fit = x,
                                              FIXED.LENGTH = FALSE, n.cores = 2))

#---------------Results--------------#
fitbest <- GDINA::GDINA(dat = dat.v, Q = Q, catprob.parm = parm,
          control = list(maxitr = 1), verbose = 0)
fitbest.acc <- GDINA::personparm(fitbest, "MAP")[, -(K+1)]
class.J <- GDINA::ClassRate(att.v, fitbest.acc) # upper-limit for accuracy
# FIXJ comparison
res.FIXJ.sum <- cdcat.summary(cdcat.obj = res.FIXJ.l, alpha = att.v)
res.FIXJ.sum$recovery$PCVcomp + ggplot2::geom_hline(yintercept = class.J$PCV[K],
                                                   color = "firebrick3")
res.FIXJ.sum$recovery$PCAmcomp + ggplot2::geom_hline(yintercept = class.J$PCA,
                                                   color = "firebrick3")
res.FIXJ.sum$item.exposure$stats
res.FIXJ.sum$item.exposure$plot
# VARJ comparison
res.VARJ.sum <- cdcat.summary(cdcat.obj = res.VARJ.l, alpha = att.v)
res.VARJ.sum$recovery
res.VARJ.sum$item.exposure$stats
res.VARJ.sum$item.exposure$plot
res.VARJ.sum$CATlength$stats
res.VARJ.sum$CATlength$plot

######################################
# Example 3.                         #
# Nonparametric CD-CAT for           #
# small-scale assessment (NPS)       #
######################################

#-----------Data----------#
Q <- sim180DINA$simQ
K <- ncol(Q)
N <- 50
dat <- sim180DINA$simdat[1:N,]
att <- sim180DINA$simalpha[1:N,]

#--------Nonparametric CD-CAT--------#
res.NPS.FIXJ <- cdcat(dat = dat, itemSelect = "NPS", FIXED.LENGTH = TRUE,
                     MAXJ = 25, n.cores = 2,
                     NP.args = list(Q = Q, gate = "AND", PPP = TRUE, w = 2),
                     seed = 12345)
res.NPS.VARJ <- cdcat(dat = dat, itemSelect = "NPS", FIXED.LENGTH = FALSE,
                     MAXJ = 25, precision.cut = 0.90, n.cores = 2,
                     NP.args = list(Q = Q, gate = "AND", PPP = TRUE, w = 2),
                     seed = 12345)

#---------------Results--------------#
res.NPS.FIXJ$est[[1]] # estimates for the first examinee (fixed-length)
res.NPS.VARJ$est[[1]] # estimates for the first examinee (fixed-precision)
att.plot(res.NPS.FIXJ, i = 1) # plot for estimates for the first examinee (fixed-length)
att.plot(res.NPS.VARJ, i = 1) # plot for estimates for the first examinee (fixed-precision)
# FIXJ summary
res.NPS.FIXJ.sum.real <- cdcat.summary(cdcat.obj = res.NPS.FIXJ, alpha = att) # vs. real accuracy
res.NPS.FIXJ.sum.real$alpha.recovery$PCV.plot
res.NPS.FIXJ.sum.real$item.exposure$exp.plot
# VARJ summary
res.NPS.VARJ.sum.real <- cdcat.summary(cdcat.obj = res.NPS.VARJ, alpha = att)
res.NPS.VARJ.sum.real$alpha.recovery$PCV.plot
res.NPS.VARJ.sum.real$item.exposure$stats
res.NPS.VARJ.sum.real$item.exposure$length.plot
res.NPS.VARJ.sum.real$item.exposure$exp.plot
# vs. maximum observable accuracy
fit <- NPCD::AlphaNP(Y = dat, Q = Q, gate = "AND")
att.J <- fit$alpha.est
class.J <- GDINA::ClassRate(att, att.J) # upper-limit for accuracy
res.NPS.FIXJ.sum.obse <- cdcat.summary(cdcat.obj = res.NPS.FIXJ, alpha = att.J)
res.NPS.FIXJ.sum.obse$alpha.recovery$PCV.plot + ggplot2::geom_hline(yintercept = class.J$PCV[K],
                                                            color = "firebrick3")
res.NPS.FIXJ.sum.obse$alpha.recovery$PCA.plot + ggplot2::geom_hline(yintercept = class.J$PCA,
                                                            color = "firebrick3")
                                                            
######################################
# Example 4.                         #
# Nonparametric CD-CAT for           #
# small-scale assessment (GNPS)      #
######################################

#-----------Data----------#
Q <- sim180DINA$simQ
K <- ncol(Q)
N <- 50
dat <- sim180DINA$simdat[1:N,]
att <- sim180DINA$simalpha[1:N,]

#----------Model calibration----------#
gnpc <- cdmTools::GNPC(dat = dat, Q = Q, verbose = 0)

#--------Nonparametric CD-CAT--------#
res.GNPS.FIXJ <- cdcat(fit = gnpc, dat = dat, itemSelect = "GNPS", FIXED.LENGTH = TRUE,
                    MAXJ = 25, n.cores = 2, 
                    NP.args = list(Q = Q, gate = "AND", PPP = TRUE, w = 2),
                    seed = 12345)
res.GNPS.VARJ <- cdcat(fit = gnpc, dat = dat, itemSelect = "GNPS", FIXED.LENGTH = FALSE,
                    MAXJ = 25, precision.cut = 0.90, n.cores = 2, 
                    NP.args = list(Q = Q, gate = "AND", PPP = TRUE, w = 2),
                    seed = 12345)

#---------------Results--------------#
res.GNPS.FIXJ$est[[1]] # estimates for the first examinee (fixed-length)
res.GNPS.VARJ$est[[1]] # estimates for the first examinee (fixed-precision)
att.plot(res.GNPS.FIXJ, i = 1) # plot for estimates for the first examinee (fixed-length)
att.plot(res.GNPS.VARJ, i = 1) # plot for estimates for the first examinee (fixed-precision)
# FIXJ summary
res.GNPS.FIXJ.sum.real <- cdcat.summary(cdcat.obj = res.GNPS.FIXJ, alpha = att) # vs. real accuracy
res.GNPS.FIXJ.sum.real$alpha.recovery$PCV.plot
res.GNPS.FIXJ.sum.real$item.exposure$exp.plot
# VARJ summary
res.GNPS.VARJ.sum.real <- cdcat.summary(cdcat.obj = res.GNPS.VARJ, alpha = att)
res.GNPS.VARJ.sum.real$alpha.recovery$PCV.plot
res.GNPS.VARJ.sum.real$item.exposure$exp.plot
res.GNPS.VARJ.sum.real$item.exposure$length.plot


[Package cdcatR version 1.0.6 Index]