modelfit.cor {CDM} | R Documentation |
Assessing Model Fit and Local Dependence by Comparing Observed and Expected Item Pair Correlations
Description
This function computes several measures of absolute model fit and local dependence indices for dichotomous item responses which are based on comparing observed and expected frequencies of item pairs (Chen, de la Torre & Zhang, 2013; see Details).
Usage
modelfit.cor(data, posterior, probs)
modelfit.cor2(data, posterior, probs)
modelfit.cor.din( dinobj, jkunits=0 )
## S3 method for class 'modelfit.cor.din'
summary(object, ...)
Arguments
data |
An |
posterior |
A matrix containing the posterior distribution (e.g. obtained as
an output of the |
probs |
An array of dimension [items,categories,attribute classes] containing probabilities |
dinobj |
An object of class |
object |
An object of class |
jkunits |
Number of Jackknife units. The default is to use 0 units
(no use of jackknifing). If jackknife estimation should be
employed, use (say) at least 20 jackknife units.
The input |
... |
Further arguments to be passed |
Details
The fit statistics are based on predictions of the pairwise table (X_i, X_j) of item responses. The \chi^2 statistic X2 for item pairs i and j is defined as
\chi^2_{ij}=\sum_{k=0}^1 \sum_{l=0}^1 \frac{ (n_{ij,kl}-e_{ij,kl}) ^2 }{ e_{ij,kl} }
where n_{ij,kl} is the absolute frequency of \{ X_{i}=k,X_j=l\} and e_{ij,kl} is the expected frequency using the estimated model. Note that for calculating e_{ij,kl}, individual posterior distributions are evaluated. The \chi^2_{ij} statistic is chi-square distributed with one degree of freedom and can be used for testing whether items i and j are locally dependent. To control for multiple comparisons, p-value adjustments according to the Holm and FDR method are conducted (see stats::p.adjust).
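The pairwise statistic and its p-value adjustments can be illustrated in base R. The following is a minimal sketch, not the package's internal code: the 2x2 tables n_obs and e_exp are made-up numbers, and the count of 36 item pairs is inferred from the item pair numbering in the Examples output (9 items) rather than stated in this documentation.

```r
# Hypothetical observed (n_obs) and expected (e_exp) 2x2 tables for one
# item pair; rows are X_i = 0/1, columns are X_j = 0/1. Counts are made up.
n_obs <- matrix(c(40, 20,
                  15, 25), nrow = 2, byrow = TRUE)
e_exp <- matrix(c(35, 25,
                  20, 20), nrow = 2, byrow = TRUE)

# chi^2_ij: sum over the four cells of (n - e)^2 / e
x2 <- sum((n_obs - e_exp)^2 / e_exp)

# p-value from the chi-square distribution with one degree of freedom
p <- stats::pchisq(x2, df = 1, lower.tail = FALSE)

# Holm and FDR adjustments over a hypothetical set of item-pair p-values
p_all  <- c(p, 0.20, 0.50)
p_holm <- stats::p.adjust(p_all, method = "holm")
p_fdr  <- stats::p.adjust(p_all, method = "fdr")

# Cross-check with the Item5/Item8 row of the Examples output (X2 = 8.728248),
# assuming 36 item pairs in total (an inference from the output, see above)
p_doc      <- stats::pchisq(8.728248, df = 1, lower.tail = FALSE)
p_doc_holm <- stats::p.adjust(p_doc, method = "holm", n = 36)
```

Under these assumptions, p_doc and p_doc_holm should come out close to the X2_p and X2_p.holm values printed for that item pair in Example 1.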
The residual covariance RESIDCOV of item pairs (i,j) is calculated as
RESIDCOV_{ij}=\frac{ n_{ij,11} n_{ij,00} - n_{ij,10} n_{ij,01} }{n^2 } - \frac{ e_{ij,11} e_{ij,00} - e_{ij,10} e_{ij,01} }{n^2 }
where n is the total sample size. MRESIDCOV is the average of all RESIDCOV statistics.
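As a worked illustration of the residual covariance formula, the sketch below evaluates it for one item pair in base R. The two 2x2 tables are made-up numbers, and the object names are hypothetical; n is taken to be the total count of the observed table.

```r
# Hypothetical observed and expected 2x2 tables for one item pair;
# rows are X_i = 0/1, columns are X_j = 0/1 (index 1 = response 0, 2 = response 1)
n_obs <- matrix(c(40, 20,
                  15, 25), nrow = 2, byrow = TRUE)
e_exp <- matrix(c(35, 25,
                  20, 20), nrow = 2, byrow = TRUE)
n <- sum(n_obs)  # total sample size

# (n11 * n00 - n10 * n01) / n^2 for a 2x2 table
cross_term <- function(tab, n) {
  (tab[2, 2] * tab[1, 1] - tab[2, 1] * tab[1, 2]) / n^2
}

cov_obs   <- cross_term(n_obs, n)  # observed covariance of X_i and X_j
cov_exp   <- cross_term(e_exp, n)  # model-implied covariance
resid_cov <- cov_obs - cov_exp

# The observed term equals P(X_i=1, X_j=1) - P(X_i=1) * P(X_j=1)
p11 <- n_obs[2, 2] / n
p_i <- sum(n_obs[2, ]) / n
p_j <- sum(n_obs[, 2]) / n
```

The last three lines show why the cross-product term is a covariance: for binary items it coincides with the joint proportion minus the product of the marginal proportions.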
The statistic MADcor denotes the average absolute deviation between observed correlations r_{ij} and model predicted correlations \hat{r}_{ij} of item pairs (i,j):
MADcor=\frac{1}{ J(J-1)/2 } \sum_{i < j} | r_{ij} - \hat{r}_{ij} |
where J is the number of items, so that J(J-1)/2 is the number of item pairs. The SRMSR (standardized root mean squared residual, Maydeu-Olivares, 2013) is also based on comparing these correlations:
SRMSR=\sqrt{ \frac{1}{ J(J-1)/2 } \sum_{i < j} ( r_{ij} - \hat{r}_{ij} )^2 }
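Both correlation-based statistics reduce to simple summaries over the upper triangle of the correlation matrices. The following base R sketch uses two small made-up correlation matrices (r_obs and r_hat are hypothetical names and values) to show the arithmetic.

```r
# Hypothetical observed and model-predicted correlation matrices for J = 3 items
r_obs <- matrix(c(1.00, 0.30, 0.20,
                  0.30, 1.00, 0.25,
                  0.20, 0.25, 1.00), nrow = 3, byrow = TRUE)
r_hat <- matrix(c(1.00, 0.28, 0.26,
                  0.28, 1.00, 0.20,
                  0.26, 0.20, 1.00), nrow = 3, byrow = TRUE)

# Keep each item pair (i < j) once
pairs <- upper.tri(r_obs)
diffs <- r_obs[pairs] - r_hat[pairs]

madcor <- mean(abs(diffs))     # average absolute deviation
srmsr  <- sqrt(mean(diffs^2))  # root of the mean squared deviation
```

Because SRMSR squares the deviations before averaging, it is never smaller than MADcor and reacts more strongly to a few badly fitting item pairs.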
For calculating MADQ3 and MADaQ3, residuals \varepsilon_{ni}=X_{ni} - e_{ni} of observed and expected responses for respondents n and items i are constructed. Then, the average of the absolute values of pairwise correlations of these residuals is computed for MADQ3. For MADaQ3, the average of the absolute values of the centered pairwise correlations (i.e. after subtracting the average Q3 statistic) is calculated.
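The residual-based statistics can be sketched in base R as follows. This is an illustration under stated assumptions: the expected responses e_hat are simulated rather than derived from a fitted model, and MADaQ3 is taken to average the absolute values of the centered correlations. All object names are hypothetical.

```r
set.seed(1)
N <- 200  # respondents
J <- 4    # items

# Hypothetical expected responses e_ni (success probabilities) and
# binary responses X_ni simulated from them
e_hat <- matrix(stats::runif(N * J, min = 0.2, max = 0.8), nrow = N, ncol = J)
X     <- matrix(stats::rbinom(N * J, size = 1, prob = e_hat), nrow = N, ncol = J)

# Residuals eps_ni = X_ni - e_ni
eps <- X - e_hat

# Q3 statistics: pairwise correlations of residuals, one value per item pair
q3 <- stats::cor(eps)[upper.tri(diag(J))]

madq3  <- mean(abs(q3))             # average absolute Q3
madaq3 <- mean(abs(q3 - mean(q3)))  # average absolute centered Q3
```

Centering matters because Q3 values tend to have a slightly negative mean even under local independence, so MADaQ3 removes that common offset before averaging.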
The difference of Fisher transformed correlations (Chen et al., 2013) is also computed and used for statistical inference.
For each of the fit statistics MADcor, MADacor, SRMSR, MX2, 100*MADRESIDCOV and MADQ3, smaller values (values near zero) indicate better fit.
Standard errors and confidence intervals of fit statistics are obtained by Jackknife estimation.
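The general idea of jackknife standard errors can be sketched with a delete-one-group jackknife in base R. This is a generic textbook version and an assumption on my part: the documentation does not spell out the exact formulas used internally, and the statistic, data and object names below are hypothetical.

```r
set.seed(42)
x <- stats::rnorm(100)  # hypothetical data
K <- 10                 # number of jackknife units

# Assign each observation to one of K jackknife units
unit <- rep(seq_len(K), length.out = length(x))

# Hypothetical fit statistic: mean absolute value
stat <- function(v) mean(abs(v))

est   <- stat(x)  # estimate from the full sample
est_k <- vapply(seq_len(K), function(k) stat(x[unit != k]), numeric(1))

# Bias-corrected jackknife estimate and jackknife standard error
jk_est <- K * est - (K - 1) * mean(est_k)
jk_se  <- sqrt((K - 1) / K * sum((est_k - mean(est_k))^2))

# Normal-theory 95% confidence interval
ci <- jk_est + c(-1, 1) * stats::qnorm(0.975) * jk_se
```

The printed bounds est_low and est_upp in the Examples output are numerically consistent with jk_est plus or minus roughly 1.96 times jk_se, which is the interval form used in this sketch.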
Value
A list with the following entries
modelfit.stat |
Model fit statistics:
|
modelfit.test |
Test of global absolute model fit using test
statistics of all item pairs. The statistic |
itempairs |
Fit of item pairs, which can be used for inspection of local
dependence. The |
Note
The function does not handle sample weights properly.
The function modelfit.cor2 has the same functionality as modelfit.cor but it is much faster because it is based on Rcpp code.
References
Chen, J., de la Torre, J., & Zhang, Z. (2013). Relative and absolute fit evaluation in cognitive diagnosis modeling. Journal of Educational Measurement, 50, 123-140.
Chen, W., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26 (pp. 979–1030). Amsterdam: Elsevier.
Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item response theory models (with discussion). Measurement: Interdisciplinary Research and Perspectives, 11, 71-137.
Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49, 305-328.
McDonald, R. P., & Mok, M. M.-C. (1995). Goodness of fit in item response models. Multivariate Behavioral Research, 30, 23-40.
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125-145.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Model fit for sim.dina
#############################################################################
data(sim.dina, package="CDM")
data(sim.qmatrix, package="CDM")
dat <- sim.dina
q.matrix <- sim.qmatrix
#*** Model 1: DINA model for DINA simulated data
mod1 <- CDM::din(dat, q.matrix=q.matrix, rule="DINA" )
fmod1 <- CDM::modelfit.cor.din(mod1, jkunits=10)
summary(fmod1)
## Test of Global Model Fit
## type value p
## 1 max(X2) 8.728 0.113
## 2 abs(fcor) 0.143 0.080
##
## Fit Statistics
## est jkunits jk_est jk_se est_low est_upp
## MADcor 0.030 10 0.020 0.005 0.010 0.030
## SRMSR 0.040 10 0.023 0.006 0.011 0.035
## 100*MADRESIDCOV 0.671 10 0.445 0.125 0.200 0.690
## MADQ3 0.062 10 0.037 0.008 0.021 0.052
## MADaQ3 0.059 10 0.034 0.008 0.019 0.050
# look at first five item pairs with highest degree of local dependence
itempairs <- fmod1$itempairs
itempairs <- itempairs[ order( itempairs$X2, decreasing=TRUE ), ]
itempairs[ 1:5, c("item1","item2", "X2", "X2_p", "X2_p.holm", "Q3") ]
## item1 item2 X2 X2_p X2_p.holm Q3
## 29 Item5 Item8 8.728248 0.003133174 0.1127943 -0.26616414
## 32 Item6 Item8 2.644912 0.103881881 1.0000000 0.04873154
## 21 Item3 Item9 2.195011 0.138458201 1.0000000 0.05948456
## 10 Item2 Item4 1.449106 0.228671389 1.0000000 -0.08036216
## 30 Item5 Item9 1.393583 0.237800911 1.0000000 -0.01934420
#*** Model 2: DINO model for DINA simulated data
mod2 <- CDM::din(dat, q.matrix=q.matrix, rule="DINO" )
fmod2 <- CDM::modelfit.cor.din(mod2, jkunits=10 ) # 10 jackknife units
summary(fmod2)
## Test of Global Model Fit
## type value p
## 1 max(X2) 13.139 0.010
## 2 abs(fcor) 0.199 0.001
##
## Fit Statistics
## est jkunits jk_est jk_se est_low est_upp
## MADcor 0.056 10 0.041 0.007 0.026 0.055
## SRMSR 0.072 10 0.045 0.019 0.007 0.083
## 100*MADRESIDCOV 1.225 10 0.878 0.183 0.519 1.236
## MADQ3 0.073 10 0.055 0.012 0.031 0.080
## MADaQ3 0.073 10 0.066 0.012 0.042 0.089
#*** Model 3: estimate DINA model with gdina function
mod3 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA" )
fmod3 <- CDM::modelfit.cor.din( mod3, jkunits=0 ) # no Jackknife estimation
summary(fmod3)
## Test of Global Model Fit
## type value p
## 1 max(X2) 8.756 0.111
## 2 abs(fcor) 0.143 0.078
##
## Fit Statistics
## est
## MADcor 0.030
## SRMSR 0.040
## MX2 0.719
## 100*MADRESIDCOV 0.668
## MADQ3 0.062
## MADaQ3 0.059
#############################################################################
# EXAMPLE 2: Simulated Example DINA model
#############################################################################
set.seed(9765)
# specify Q-matrix
Q <- matrix( c(1,0, 0,1, 1,1 ), nrow=3, ncol=2, byrow=TRUE )
q.matrix <- Q[ rep(1:3,4), ]
I <- nrow(q.matrix)
# simulate data
guess <- stats::runif(I, 0, .3 )
slip <- stats::runif( I, 0, .4 )
N <- 150 # number of persons
dat <- CDM::sim.din( N=N, q.matrix=q.matrix, slip=slip, guess=guess )$dat
#*** estimate DINA model
mod1 <- CDM::din( dat, q.matrix=q.matrix, rule="DINA" )
fmod1 <- CDM::modelfit.cor.din(mod1, jkunits=10)
summary(fmod1)
## Test of Global Model Fit
## type value p
## 1 max(X2) 10.697 0.071
## 2 abs(fcor) 0.277 0.026
##
## Fit Statistics
## est jkunits jk_est jk_se est_low est_upp
## MADcor 0.052 10 0.026 0.010 0.006 0.045
## SRMSR 0.074 10 0.048 0.013 0.022 0.074
## 100*MADRESIDCOV 1.259 10 0.646 0.213 0.228 1.063
## MADQ3 0.080 10 0.047 0.010 0.027 0.068
## MADaQ3 0.079 10 0.046 0.010 0.027 0.065
## End(Not run)