R: CDM fit comparison - dimensionality assessment method

modelcompK {cdmTools}

R Documentation

CDM fit comparison - dimensionality assessment method

Description

A procedure for determining the number of attributes underlying CDM data by comparing model fit. For each number of attributes under exploration, a Q-matrix is estimated from the data using the discrete factor loading method (Wang, Song, & Ding, 2018); the estimated Q-matrix can optionally be validated using the Hull method (Nájera, Sorrel, de la Torre, & Abad, 2020). A CDM is then fitted to the data using the resulting Q-matrix, and several fit indices are computed. Once the desired range of numbers of attributes has been explored, the fit indices are compared and a suggested number of attributes is given for each fit index. The AIC should be preferred over the other fit indices. For further details, see Nájera, Abad, & Sorrel (2021).

This function can also be used by directly providing different Q-matrices (instead of estimating them from the data) in order to compare their fit and select the most appropriate one. Note that, if Q-matrices are provided, the function no longer serves as a dimensionality assessment method, but only as an automated model comparison procedure.
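The comparison step can be pictured with a small base R sketch. It is not the package's implementation: the fit table below is invented for illustration, and the object names `fit` and `sug.K` are only chosen to echo the returned elements. It relies on the fact that lower values indicate better fit for information criteria such as AIC and BIC.

```r
# Hypothetical fit table: rows are candidate numbers of attributes,
# columns are fit indices (all values invented for illustration).
fit <- matrix(
  c(5210, 5260,
    5105, 5190,
    5098, 5215,
    5101, 5250),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("K", 2:5), c("AIC", "BIC"))
)

# For each index, suggest the K whose model has the lowest value.
sug.K <- apply(fit, 2, function(x) rownames(fit)[which.min(x)])
sug.K
```

In this made-up table the two indices disagree (AIC points to K4, BIC to K3), which is why a suggestion is reported per index rather than a single number.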

Usage

modelcompK(
  dat,
  exploreK = 1:7,
  Qs = NULL,
  stop = "none",
  val.Q = TRUE,
  estQ.args = list(criterion = "row", cor = "tet", rotation = "oblimin", fm = "uls"),
  valQ.args = list(index = "PVAF", iterative = "test.att", maxitr = 5, CDMconv = 0.01),
  verbose = TRUE
)

Arguments

`dat`	An N individuals x J items `matrix` or `data.frame`. Missing values need to be coded as `NA`.
`exploreK`	Number of attributes to explore. The default is from 1 to 7 attributes.
`Qs`	A list of Q-matrices to compare in terms of fit. If `Qs` is used, `exploreK` is ignored.
`stop`	A fit index used to stop the procedure as soon as a model fits worse than a simpler one. This can save time by not exploring the whole `exploreK` range when the correct dimensionality has probably been visited already. The options are `"AIC"`, `"BIC"`, `"CAIC"`, `"SABIC"`, `"M2"`, `"SRMSR"`, `"RMSEA2"`, or `"sig.item.pairs"`. The latter is the number of items that show bad fit with at least one other item based on the transformed correlations (see the `itemfit` function in the `GDINA` package; Ma & de la Torre, 2020). It can also be `"none"`, which means that the whole `exploreK` range will be examined. The default is `"none"`.
`val.Q`	Validate the estimated Q-matrices using the Hull method? Note that validating the Q-matrix is expected to increase its quality, but the computation time will increase. The default is `TRUE`.
`estQ.args`	A list of arguments for the discrete factor loading empirical Q-matrix estimation method (see the `estQ` function):
    `criterion`: Dichotomization criterion to transform the factor loading matrix into the Q-matrix. The possible options include `"row"` (for row means), `"col"` (for column means), `"loaddiff"` (for the procedure based on loading differences), or a value between 0 and 1 (for a specific threshold). The default is `"row"`.
    `cor`: Type of correlations to use. It includes `"cor"` (for Pearson correlations) and `"tet"` (for tetrachoric/polychoric correlations), among others. See the `fa` function from the `psych` R package for additional details. The default is `"tet"`.
    `rotation`: Rotation procedure to use. It includes `"oblimin"`, `"varimax"`, and `"promax"`, among others. An oblique rotation procedure is usually recommended. See the `fa` function from the `psych` R package for additional details. The default is `"oblimin"`.
    `fm`: Factoring method to use. It includes `"uls"` (for unweighted least squares), `"ml"` (for maximum likelihood), and `"wls"` (for weighted least squares), among others. See the `fa` function from the `psych` R package for additional details. The default is `"uls"`.
`valQ.args`	A list of arguments for the Hull empirical Q-matrix validation method (see the `valQ` function). Only applicable if `val.Q = TRUE`:
    `index`: What index to use. It includes `"PVAF"` or `"R2"`. The default is `"PVAF"`.
    `iterative`: (Iterative) implementation procedure. It includes `"none"` (for non-iterative), `"test"` (for test-level iterations), `"test.att"` (for test-level iterations modifying the fewest possible q-entries in each iteration), and `"item"` (for item-level iterations). The default is `"test.att"`.
    `maxitr`: Maximum number of iterations if an iterative procedure has been selected. The default is 5.
    `CDMconv`: Convergence criterion for the CDM estimations between iterations (only if an iterative procedure has been selected). The default is 0.01.
`verbose`	Show progress? The default is `TRUE`.
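To make the `criterion` options of `estQ.args` more concrete, here is a minimal base R sketch of dichotomizing a loading matrix. It is an illustration under stated assumptions, not the code used by `estQ`: the loadings are invented, and the rule "an entry becomes 1 when its absolute loading is at least the mean of its row" is an assumed reading of the `"row"` option.

```r
# Hypothetical factor loadings for 4 items (rows) on 2 factors (columns).
L <- matrix(c(0.80, 0.10,
              0.05, 0.70,
              0.60, 0.55,
              0.20, 0.65),
            ncol = 2, byrow = TRUE)
A <- abs(L)

# Assumed "row"-style rule: compare each loading with the mean of its row.
# rowMeans(A) is recycled down the columns, so entry [j, k] is compared
# with the mean of row j.
Q.row <- (A >= rowMeans(A)) * 1

# Fixed-threshold rule: compare each loading with a single cutoff.
Q.thr <- (A >= 0.5) * 1

Q.row
Q.thr
```

With these invented loadings, the two rules differ only for the third item: the row-mean rule assigns it a single attribute, while the 0.5 threshold assigns it both.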

Value

modelcompK returns an object of class modelcompK.

sug.K: The suggested number of attributes for each fit index (vector). Only if Qs = NULL.
sel.Q: The suggested Q-matrix for each fit index (vector).
fit: The fit indices for each fitted model (matrix).
exp.exploreK: Explored dimensionality (vector). It can be different from exploreK if stop has been used.
usedQ: Q-matrices used to fit each model (list). They will be the estimated (and validated) Q-matrices if Qs = NULL. Otherwise, they will be Qs.
specifications: Function call specifications (list).
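The reason the explored dimensionality can be shorter than `exploreK` can be sketched as a simple early-stopping loop. This is only an illustration, not the package's code: the AIC values are invented, they are precomputed here rather than obtained by fitting a model at each step, and whether the last (worse-fitting) model is kept in the explored set is an assumption.

```r
exploreK <- 1:7
# Hypothetical AIC for the model fitted with each K (invented values).
aic <- c(5400, 5210, 5105, 5098, 5101, 5099, 5110)

explored <- integer(0)
prev <- Inf
for (i in seq_along(exploreK)) {
  explored <- c(explored, exploreK[i])  # this K has now been visited
  if (aic[i] > prev) break              # worse than the simpler model: stop
  prev <- aic[i]
}
explored
```

Here the loop stops at K = 5, the first model whose AIC is higher than that of the preceding, simpler model, so K = 6 and K = 7 are never examined.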

Author(s)

Pablo Nájera, Universidad Pontificia Comillas
Miguel A. Sorrel, Universidad Autónoma de Madrid
Francisco J. Abad, Universidad Autónoma de Madrid

References

Ma, W., & de la Torre, J. (2020). GDINA: An R package for cognitive diagnosis modeling. Journal of Statistical Software, 93(14). https://doi.org/10.18637/jss.v093.i14

Nájera, P., Abad, F. J., & Sorrel, M. A. (2021). Determining the number of attributes in cognitive diagnosis modeling. Frontiers in Psychology, 12:614470. https://doi.org/10.3389/fpsyg.2021.614470

Nájera, P., Sorrel, M. A., de la Torre, J., & Abad, F. J. (2020). Balancing fit and parsimony to improve Q-matrix validation. British Journal of Mathematical and Statistical Psychology. https://doi.org/10.1111/bmsp.12228

Wang, W., Song, L., & Ding, S. (2018). An exploratory discrete factor loading method for Q-matrix specification in cognitive diagnosis models. In: M. Wiberg, S. Culpepper, R. Janssen, J. González, & D. Molenaar (Eds.), Quantitative Psychology. IMPS 2017. Springer Proceedings in Mathematics & Statistics (Vol. 233, pp. 351-362). Springer.

Examples


library(cdmTools) # provides modelcompK, orderQ, and missQ
library(GDINA) # provides the sim30GDINA example data
dat <- sim30GDINA$simdat
Q <- sim30GDINA$simQ

#-------------------------------------
# Assess dimensionality from CDM data
#-------------------------------------
mcK <- modelcompK(dat = dat, exploreK = 4:7, stop = "AIC", val.Q = TRUE, verbose = TRUE)
mcK$sug.K # Check suggested number of attributes by each fit index
mcK$fit # Check fit indices for each K explored
sug.Q <- mcK$usedQ[[paste0("K", mcK$sug.K["AIC"])]] # Suggested Q-matrix by AIC
sug.Q <- orderQ(sug.Q, Q)$order.Q # Reorder Q-matrix attributes
mean(sug.Q == Q) # Check similarity with the generating Q-matrix

#--------------------------------------------------
# Automatic fit comparison of competing Q-matrices
#--------------------------------------------------
trueQ <- Q
missQ1 <- missQ(Q, .10, seed = 123)$miss.Q
missQ2 <- missQ(Q, .20, seed = 456)$miss.Q
missQ3 <- missQ(Q, .30, seed = 789)$miss.Q
Qs <- list(trueQ, missQ1, missQ2, missQ3)
mc <- modelcompK(dat = dat, Qs = Qs, verbose = TRUE)
mc$sel.Q # Best-fitting Q-matrix for each fit index
mc$fit # Check fit indices for each Q explored

[Package cdmTools version 1.0.5 Index]