R: Multidimensional Noncompensatory, Compensatory and Partially...

smirt {sirt}

R Documentation

Multidimensional Noncompensatory, Compensatory and Partially Compensatory Item Response Model

Description

This function estimates the noncompensatory and compensatory multidimensional item response model (Bolt & Lall, 2003; Reckase, 2009) as well as the partially compensatory item response model (Spray et al., 1990) for dichotomous data.

Usage

smirt(dat, Qmatrix, irtmodel="noncomp", est.b=NULL, est.a=NULL,
     est.c=NULL, est.d=NULL, est.mu.i=NULL, b.init=NULL, a.init=NULL,
     c.init=NULL, d.init=NULL, mu.i.init=NULL, Sigma.init=NULL,
     b.lower=-Inf, b.upper=Inf, a.lower=-Inf, a.upper=Inf,
     c.lower=-Inf, c.upper=Inf, d.lower=-Inf, d.upper=Inf,
     theta.k=seq(-6,6,len=20), theta.kDES=NULL,
     qmcnodes=0, mu.fixed=NULL, variance.fixed=NULL,  est.corr=FALSE,
     max.increment=1, increment.factor=1, numdiff.parm=0.0001,
     maxdevchange=0.1, globconv=0.001, maxiter=1000, msteps=4,
     mstepconv=0.001)

## S3 method for class 'smirt'
summary(object,...)

## S3 method for class 'smirt'
anova(object,...)

## S3 method for class 'smirt'
logLik(object,...)

## S3 method for class 'smirt'
IRT.irfprob(object,...)

## S3 method for class 'smirt'
IRT.likelihood(object,...)

## S3 method for class 'smirt'
IRT.posterior(object,...)

## S3 method for class 'smirt'
IRT.modelfit(object,...)

## S3 method for class 'IRT.modelfit.smirt'
summary(object,...)

Arguments

`dat`	Data frame with dichotomous item responses
`Qmatrix`	The Q-matrix which specifies the loadings to be estimated
`irtmodel`	The item response model. Options are the noncompensatory model (`"noncomp"`), the compensatory model (`"comp"`) and the partially compensatory model (`"partcomp"`). See Details for more explanations.
`est.b`	An integer matrix (if `irtmodel="noncomp"`) or integer vector (if `irtmodel="comp"`) for `b` parameters to be estimated
`est.a`	An integer matrix for `a` parameters to be estimated. If `est.a="2PL"`, then all item loadings will be estimated and the variances are set to one (and therefore `est.corr=TRUE`).
`est.c`	An integer vector for `c` parameters to be estimated
`est.d`	An integer vector for `d` parameters to be estimated
`est.mu.i`	An integer vector for `\mu_i` parameters to be estimated
`b.init`	Initial `b` coefficients. For `irtmodel="noncomp"` it must be a matrix, for `irtmodel="comp"` it is a vector.
`a.init`	Initial `a` coefficients arranged in a matrix
`c.init`	Initial `c` coefficients
`d.init`	Initial `d` coefficients
`mu.i.init`	Initial `d` coefficients
`Sigma.init`	Initial covariance matrix `\Sigma`
`b.lower`	Lower bound for `b` parameter
`b.upper`	Upper bound for `b` parameter
`a.lower`	Lower bound for `a` parameter
`a.upper`	Upper bound for `a` parameter
`c.lower`	Lower bound for `c` parameter
`c.upper`	Upper bound for `c` parameter
`d.lower`	Lower bound for `d` parameter
`d.upper`	Upper bound for `d` parameter
`theta.k`	Vector of discretized trait distribution. This vector is expanded in all dimensions by using the `base::expand.grid` function. If a user specifies a design matrix `theta.kDES` of transformed `\bold{\theta}_p` values (see Details and Examples), then `theta.k` must be a matrix, too.
`theta.kDES`	An optional design matrix. This matrix will differ from the ordinary theta grid in case of nonlinear item response models.
`qmcnodes`	Number of integration nodes for quasi Monte Carlo integration (see Pan & Thompson, 2007; Gonzales et al., 2006). Integration points are obtained by using the function `qmc.nodes`. Note that when using quasi Monte Carlo nodes, no theta design matrix `theta.kDES` can be specified. See Example 1, Model 11.
`mu.fixed`	Matrix with fixed entries in the mean vector. By default, all means are set to zero.
`variance.fixed`	Matrix (with rows and three columns) with fixed entries in the covariance matrix (see Examples). The entry `c_{kd}` of the covariance between dimensions `k` and `d` is set to `c_0` iff `variance.fixed` has a row with a `k` in the first column, a `d` in the second column and the value `c_0` in the third column.
`est.corr`	Should only a correlation matrix instead of a covariance matrix be estimated?
`max.increment`	Maximum increment
`increment.factor`	A value (larger than one) which defines the extent of the decrease of the maximum increment of item parameters in every iteration. The maximum increment in iteration `iter` is defined as `max.increment*increment.factor^(-iter)` where `max.increment=1`. Using a value larger than 1 helps to reach convergence in some non-converging analyses (use values of 1.01, 1.02 or even 1.05). See also Example 1 Model 2a.
`numdiff.parm`	Numerical differentiation parameter
`maxdevchange`	Convergence criterion for change in relative deviance
`globconv`	Global convergence criterion for parameter change
`maxiter`	Maximum number of iterations
`msteps`	Number of iterations within a M step
`mstepconv`	Convergence criterion within a M step
`object`	Object of class `smirt`
`...`	Further arguments to be passed

Details

The noncompensatory item response model (irtmodel="noncomp"; e.g. Bolt & Lall, 2003) is defined as

P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) \prod_l invlogit( a_{il} q_{il} \theta_{pl} - b_{il} )

where i, p, l denote items, persons and dimensions respectively.

The compensatory item response model (irtmodel="comp") is defined by

P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} \theta_{pl} - b_{i} )

Using a design matrix theta.kDES the model can be made even more general in a model which is linear in item parameters

P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} t_l ( \bold{ \theta_{p} } ) - b_{i} )

with known functions t_l of the trait vector \bold{\theta}_p. Fixed values of the functions t_l are specified in the \bold{\theta}_p design matrix theta.kDES.

The partially compensatory item response model (irtmodel="partcomp") is defined by

P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) \frac{ \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) } { \mu_i \prod_l ( 1 + \exp ( a_{il} q_{il} \theta_{pl} - b_{il} ) ) + ( 1- \mu_i) ( 1 + \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) ) }

with item parameters \mu_i indicating the degree of compensatory. \mu_i=1 indicates a noncompensatory model while \mu_i=0 indicates a (fully) compensatory model.

The models are estimated by an EM algorithm employing marginal maximum likelihood.

Value

A list with following entries:

`deviance`	Deviance
`ic`	Information criteria
`item`	Data frame with item parameters
`person`	Data frame with person parameters. It includes the person mean of all item responses (`M`; percentage correct of all non-missing items), the EAP estimate and its corresponding standard error for all dimensions (`EAP` and `SE.EAP`) and the maximum likelihood estimate as well as the mode of the posterior distribution (`MLE` and `MAP`).
`EAP.rel`	EAP reliability
`mean.trait`	Means of trait
`sd.trait`	Standard deviations of trait
`Sigma`	Trait covariance matrix
`cor.trait`	Trait correlation matrix
`b`	Matrix (vector) of `b` parameters
`se.b`	Matrix (vector) of standard errors `b` parameters
`a`	Matrix of `a` parameters
`se.a`	Matrix of standard errors of `a` parameters
`c`	Vector of `c` parameters
`se.c`	Vector of standard errors of `c` parameters
`d`	Vector of `d` parameters
`se.d`	Vector of standard errors of `d` parameters
`mu.i`	Vector of `\mu_i` parameters
`se.mu.i`	Vector of standard errors of `\mu_i` parameters
`f.yi.qk`	Individual likelihood
`f.qk.yi`	Individual posterior
`probs`	Probabilities of item response functions evaluated at `theta.k`
`n.ik`	Expected counts
`iter`	Number of iterations
`dat2`	Processed data set
`dat2.resp`	Data set of response indicators
`I`	Number of items
`D`	Number of dimensions
`K`	Maximum item response score
`theta.k`	Used theta integration grid
`pi.k`	Distribution function evaluated at `theta.k`
`irtmodel`	Used IRT model
`Qmatrix`	Used Q-matrix

References

Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395-414.

Gonzalez, J., Tuerlinckx, F., De Boeck, P., & Cools, R. (2006). Numerical integration in logistic-normal models. Computational Statistics & Data Analysis, 51, 1535-1548.

Pan, J., & Thompson, R. (2007). Quasi-Monte Carlo estimation in generalized linear mixed models. Computational Statistics & Data Analysis, 51, 5765-5775.

Reckase, M. (2009). Multidimensional item response theory. New York: Springer. doi:10.1007/978-0-387-89976-3

Spray, J. A., Davey, T. C., Reckase, M. D., Ackerman, T. A., & Carlson, J. E. (1990). Comparison of two logistic multidimensional item response theory models. ACT Research Report No. ACT-RR-ONR-90-8.

Examples

#############################################################################
## EXAMPLE 1: Noncompensatory and compensatory IRT models
#############################################################################
set.seed(997)

# (1) simulate data from a two-dimensional noncompensatory
#     item response model
#   -> increase number of iterations in all models!

N <- 1000    # number of persons
I <- 10        # number of items
theta0 <- rnorm( N, sd=1 )
theta1 <- theta0 + rnorm(N, sd=.7 )
theta2 <- theta0 + rnorm(N, sd=.7 )
Q <- matrix( 1, nrow=I,ncol=2 )
Q[ 1:(I/2), 2 ] <- 0
Q[ I,1] <- 0
b <- matrix( rnorm( I*2 ), I, 2 )
a <- matrix( 1, I, 2 )

# simulate data
prob <- dat <- matrix(0, nrow=N, ncol=I )
for (ii in 1:I){
prob[,ii] <- ( stats::plogis( theta1 - b[ii,1]  ) )^Q[ii,1]
prob[,ii] <- prob[,ii] * ( stats::plogis( theta2 - b[ii,2]  ) )^Q[ii,2]
            }
dat[ prob > matrix( stats::runif( N*I),N,I) ] <- 1
colnames(dat) <- paste0("I",1:I)

#***
# Model 1: Noncompensatory 1PL model
mod1 <- sirt::smirt(dat, Qmatrix=Q, maxiter=10 ) # change number of iterations
summary(mod1)

## Not run: 
#***
# Model 2: Noncompensatory 2PL model
mod2 <- sirt::smirt(dat,Qmatrix=Q, est.a="2PL", maxiter=15 )
summary(mod2)

# Model 2a: avoid convergence problems with increment.factor
mod2a <- sirt::smirt(dat,Qmatrix=Q, est.a="2PL", maxiter=30, increment.factor=1.03)
summary(mod2a)

#***
# Model 3: some fixed c and d parameters different from zero or one
c.init <- rep(0,I)
c.init[ c(3,7)] <- .2
d.init <- rep(1,I)
d.init[c(4,8)] <- .95
mod3 <- sirt::smirt( dat, Qmatrix=Q, c.init=c.init, d.init=d.init )
summary(mod3)

#***
# Model 4: some estimated c and d parameters (in parameter groups)
est.c <- c.init <- rep(0,I)
c.estpars <- c(3,6,7)
c.init[ c.estpars ] <- .2
est.c[c.estpars] <- 1
est.d <- rep(0,I)
d.init <- rep(1,I)
d.estpars <- c(6,9)
d.init[ d.estpars ] <- .95
est.d[ d.estpars ] <- d.estpars   # different d parameters
mod4 <- sirt::smirt(dat,Qmatrix=Q, est.c=est.c, c.init=c.init,
            est.d=est.d, d.init=d.init  )
summary(mod4)

#***
# Model 5: Unidimensional 1PL model
Qmatrix <- matrix( 1, nrow=I, ncol=1 )
mod5 <- sirt::smirt( dat, Qmatrix=Qmatrix )
summary(mod5)

#***
# Model 6: Unidimensional 2PL model
mod6 <- sirt::smirt( dat, Qmatrix=Qmatrix, est.a="2PL" )
summary(mod6)

#***
# Model 7: Compensatory model with between item dimensionality
# Note that the data is simulated under the noncompensatory condition
# Therefore Model 7 should have a worse model fit than Model 1
Q1 <- Q
Q1[ 6:10, 1] <- 0
mod7 <- sirt::smirt(dat,Qmatrix=Q1, irtmodel="comp", maxiter=30)
summary(mod7)

#***
# Model 8: Compensatory model with within item dimensionality
#         assuming zero correlation between dimensions
variance.fixed <- as.matrix( cbind( 1,2,0) )
# set the covariance between the first and second dimension to zero
mod8 <- sirt::smirt(dat,Qmatrix=Q, irtmodel="comp", variance.fixed=variance.fixed,
            maxiter=30)
summary(mod8)

#***
# Model 8b: 2PL model with starting values for a and b parameters
b.init <- rep(0,10)  # set all item difficulties initially to zero
# b.init <- NULL
a.init <- Q       # initialize a.init with Q-matrix
# provide starting values for slopes of first three items on Dimension 1
a.init[1:3,1] <- c( .55, .32, 1.3)

mod8b <- sirt::smirt(dat,Qmatrix=Q, irtmodel="comp", variance.fixed=variance.fixed,
              b.init=b.init, a.init=a.init, maxiter=20, est.a="2PL" )
summary(mod8b)

#***
# Model 9: Unidimensional model with quadratic item response functions
# define theta
theta.k <- seq( - 6, 6, len=15 )
theta.k <- as.matrix( theta.k, ncol=1 )
# define design matrix
theta.kDES <- cbind( theta.k[,1], theta.k[,1]^2 )
# define Q-matrix
Qmatrix <- matrix( 0, I, 2 )
Qmatrix[,1] <- 1
Qmatrix[ c(3,6,7), 2 ] <- 1
colnames(Qmatrix) <- c("F1", "F1sq" )
# estimate model
mod9 <- sirt::smirt(dat,Qmatrix=Qmatrix, maxiter=50, irtmodel="comp",
           theta.k=theta.k, theta.kDES=theta.kDES, est.a="2PL" )
summary(mod9)

#***
# Model 10: Two-dimensional item response model with latent interaction
#           between dimensions
theta.k <- seq( - 6, 6, len=15 )
theta.k <- expand.grid( theta.k, theta.k )    # expand theta to 2 dimensions
# define design matrix
theta.kDES <- cbind( theta.k, theta.k[,1]*theta.k[,2] )
# define Q-matrix
Qmatrix <- matrix( 0, I, 3 )
Qmatrix[,1] <- 1
Qmatrix[ 6:10, c(2,3) ] <- 1
colnames(Qmatrix) <- c("F1", "F2", "F1iF2" )
# estimate model
mod10 <- sirt::smirt(dat,Qmatrix=Qmatrix,irtmodel="comp", theta.k=theta.k,
            theta.kDES=theta.kDES, est.a="2PL" )
summary(mod10)

#****
# Model 11: Example Quasi Monte Carlo integration
Qmatrix <- matrix( 1, I, 1 )
mod11 <- sirt::smirt( dat, irtmodel="comp", Qmatrix=Qmatrix, qmcnodes=1000 )
summary(mod11)

#############################################################################
## EXAMPLE 2: Dataset Reading data.read
##            Multidimensional models for dichotomous data
#############################################################################

data(data.read)
dat <- data.read
I <- ncol(dat)    # number of items

#***
# Model 1: 3-dimensional 2PL model

# define Q-matrix
Qmatrix <- matrix(0,nrow=I,ncol=3)
Qmatrix[1:4,1] <- 1
Qmatrix[5:8,2] <- 1
Qmatrix[9:12,3] <- 1

# estimate model
mod1 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp", est.a="2PL",
            qmcnodes=1000, maxiter=20)
summary(mod1)

#***
# Model 2: 3-dimensional Rasch model
mod2 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp",
              qmcnodes=1000, maxiter=20)
summary(mod2)

#***
# Model 3: 3-dimensional 2PL model with uncorrelated dimensions
# fix entries in variance matrix
variance.fixed <- cbind( c(1,1,2), c(2,3,3), 0 )
# set the following covariances to zero: cov[1,2]=cov[1,3]=cov[2,3]=0

# estimate model
mod3 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp", est.a="2PL",
             variance.fixed=variance.fixed, qmcnodes=1000, maxiter=20)
summary(mod3)

#***
# Model 4: Bifactor model with one general factor (g) and
#          uncorrelated specific factors

# define a new Q-matrix
Qmatrix1 <- cbind( 1, Qmatrix )
# uncorrelated factors
variance.fixed <- cbind( c(1,1,1,2,2,3), c(2,3,4,3,4,4), 0 )
# The first dimension refers to the general factors while the other
# dimensions refer to the specific factors.
# The specification means that:
# Cov[1,2]=Cov[1,3]=Cov[1,4]=Cov[2,3]=Cov[2,4]=Cov[3,4]=0

# estimate model
mod4 <- sirt::smirt( dat, Qmatrix=Qmatrix1, irtmodel="comp", est.a="2PL",
             variance.fixed=variance.fixed, qmcnodes=1000, maxiter=20)
summary(mod4)

#############################################################################
## EXAMPLE 3: Partially compensatory model
#############################################################################

#**** simulate data
set.seed(7656)
I <- 10         # number of items
N <- 2000        # number of subjects
Q <- matrix( 0, 3*I,2)  # Q-matrix
Q[1:I,1] <- 1
Q[1:I + I,2] <- 1
Q[1:I + 2*I,1:2] <- 1
b <- matrix( stats::runif( 3*I *2, -2, 2 ), nrow=3*I, 2 )
b <- b*Q
b <- round( b, 2 )
mui <- rep(0,3*I)
mui[ seq(2*I+1, 3*I) ] <- 0.65
# generate data
dat <- matrix( NA, N, 3*I )
colnames(dat) <- paste0("It", 1:(3*I) )
# simulate item responses
library(mvtnorm)
theta <- mvtnorm::rmvnorm(N, mean=c(0,0), sigma=matrix( c( 1.2, .6,.6,1.6),2, 2 ) )
for (ii in 1:(3*I)){
    # define probability
    tmp1 <- exp( theta[,1] * Q[ii,1] - b[ii,1] +  theta[,2] * Q[ii,2] - b[ii,2] )
    # non-compensatory model
    nco1 <- ( 1 + exp( theta[,1] * Q[ii,1] - b[ii,1] ) ) *
                  ( 1 + exp( theta[,2] * Q[ii,2] - b[ii,2] ) )
    co1 <- ( 1 + tmp1 )
    p1 <- tmp1 / ( mui[ii] * nco1 + ( 1 - mui[ii] )*co1 )
    dat[,ii] <- 1 * ( stats::runif(N) < p1 )
}

#*** Model 1: Joint mu.i parameter for all items
est.mu.i <- rep(0,3*I)
est.mu.i[ seq(2*I+1,3*I)] <- 1
mod1 <- sirt::smirt( dat, Qmatrix=Q, irtmodel="partcomp", est.mu.i=est.mu.i)
summary(mod1)

#*** Model 2: Separate mu.i parameter for all items
est.mu.i[ seq(2*I+1,3*I)] <- 1:I
mod2 <- sirt::smirt( dat, Qmatrix=Q, irtmodel="partcomp", est.mu.i=est.mu.i)
summary(mod2)

## End(Not run)

[Package sirt version 4.1-15 Index]