tam.mml.3pl {TAM} | R Documentation |
3PL Structured Item Response Model in TAM
Description
This estimates a 3PL model with design matrices for item slopes and item intercepts. Discrete distributions of the latent variables are allowed which can be log-linearly smoothed.
Usage
tam.mml.3pl(resp, Y=NULL, group=NULL, formulaY=NULL, dataY=NULL, ndim=1,
pid=NULL, xsi.fixed=NULL, xsi.inits=NULL, xsi.prior=NULL,
beta.fixed=NULL, beta.inits=NULL, variance.fixed=NULL, variance.inits=NULL,
est.variance=TRUE, A=NULL, notA=FALSE, Q=NULL, Q.fixed=NULL, E=NULL,
gammaslope.des="2PL", gammaslope=NULL, gammaslope.fixed=NULL,
est.some.slopes=TRUE, gammaslope.max=9.99, gammaslope.constr.V=NULL,
gammaslope.constr.c=NULL, gammaslope.center.index=NULL, gammaslope.center.value=NULL,
gammaslope.prior=NULL, userfct.gammaslope=NULL, gammaslope.constr.Npars=0,
est.guess=NULL, guess=rep(0, ncol(resp)),
guess.prior=NULL, max.guess=0.5, skillspace="normal", theta.k=NULL,
delta.designmatrix=NULL, delta.fixed=NULL, delta.inits=NULL, pweights=NULL,
item.elim=TRUE, verbose=TRUE, control=list(), Edes=NULL )
## S3 method for class 'tam.mml.3pl'
summary(object,file=NULL,...)
## S3 method for class 'tam.mml.3pl'
print(x,...)
Arguments
resp |
Data frame with polytomous item responses |
Y |
A matrix of covariates in latent regression. Note that the intercept is automatically included as the first predictor. |
group |
An optional vector of group identifiers |
formulaY |
An R formula for latent regression. Transformations of predictors
in |
dataY |
An optional data frame with possible covariates |
ndim |
Number of dimensions (is not needed to determined by the user) |
pid |
An optional vector of person identifiers |
xsi.fixed |
A matrix with two columns for fixing |
xsi.inits |
A matrix with two columns (in the same way defined as in
|
xsi.prior |
Item-specific prior distribution for |
beta.fixed |
A matrix with three columns for fixing regression coefficients.
1st column: Index of |
beta.inits |
A matrix (same format as in |
variance.fixed |
An optional matrix. In case of a single group it is a matrix with three columns for fixing
entries in covariance matrix:
1st column: dimension 1, 2nd column: dimension 2,
3rd column: fixed value of covariance/variance.
In case of multiple groups, it is a matrix with four columns
1st column: group index (from |
variance.inits |
Initial covariance matrix in estimation. All matrix entries have to be
specified and this matrix is NOT in the same format like
|
est.variance |
Should the covariance matrix be estimated? This argument
applies to estimated item slopes in |
A |
An optional array of dimension |
notA |
An optional logical indicating whether all entries in
the |
Q |
An optional |
Q.fixed |
Optional |
E |
Optional design array for item slopes |
gammaslope.des |
Optional string indicating type of item slope parameter to be estimated.
|
gammaslope |
Initial or fixed vector of |
gammaslope.fixed |
An optional matrix containing fixed values in the |
est.some.slopes |
An optional logical indicating whether some item slopes should be estimated. |
gammaslope.max |
Value for absolute entries in |
gammaslope.constr.V |
An optional constraint matrix |
gammaslope.constr.c |
An optional constraint vector |
gammaslope.center.index |
Indices of |
gammaslope.center.value |
Specified values of sum of
subset of |
gammaslope.prior |
Item-specific prior distribution for |
userfct.gammaslope |
A user specified function for
constraints or transformations of the |
gammaslope.constr.Npars |
Number of constrained
|
est.guess |
An optional vector of integers indicating for which items a guessing parameter should be estimated. Same integers correspond to same estimated guessing parameters. A value of 0 denotes an item for which no guessing parameter is estimated. |
guess |
Fixed or initial guessing parameters |
guess.prior |
Item-specific prior distribution for guessing parameters |
max.guess |
Upper bound for guessing parameters |
skillspace |
Skill space: normal distribution ( |
theta.k |
A matrix of the |
delta.designmatrix |
A design matrix of the log-linear model for the skill space in case of a discrete
distribution ( |
delta.fixed |
Fixed |
delta.inits |
Optional initial matrix of |
pweights |
Optional vector of person weights. |
item.elim |
Optional logical indicating whether an item with has
only zero entries should be removed from the analysis. The default
is |
verbose |
Logical indicating whether output should
be printed during iterations. This argument replaces |
control |
See |
Edes |
Compact form of design matrix. This argument is only defined for convenience for models with random starting values to avoid recalculations. |
object |
Object of class |
file |
A file name in which the summary output will be written |
x |
Object of class |
... |
Further arguments to be passed |
Details
The item response model for item and category
and no guessing
parameters can be written as
For item slopes, a linear decomposition is allowed
In case of a guessing parameter, the item response function is
For possibilities of definitions of the design matrix
see Formann (2007), for the strongly related linear logistic latent
class model see Formann (1992). Linear constraints on
can be specified by equations
and using the arguments
gammaslope.constr.V
and gammaslope.constr.c
.
Like in tam.mml
, a multivariate linear regression
assuming a multivariate normal distribution on the residuals
can be specified (
skillspace="normal"
).
Alternatively, a log-linear distribution of the skill classes
(
skillspace="discrete"
)
is performed
See Xu and
von Davier (2008). The design matrix can be specified in
delta.designmatrix
. The theta grid of the skill space
can be defined in
theta.k
.
Value
The same output as in tam.mml
and additional entries
delta |
Parameter vector |
gammaslope |
Estimated |
se.gammaslope |
Standard errors |
E |
Used design matrix |
Edes |
Used design matrix |
guess |
Estimated |
se.guess |
Standard errors |
Note
The implementation of the model builds on pieces work of Anton Formann. See http://www.antonformann.at/ for more information.
References
Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data. Journal of the American Statistical Association, 87, 476-486. doi:10.2307/2290280
Formann, A. K. (2007). (Almost) Equivalence between conditional and mixture maximum likelihood estimates for some models of the Rasch type. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and mixture distribution Rasch models (pp. 177-189). New York: Springer. doi:10.1007/978-0-387-49839-3_11
Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS. doi:10.1002/j.2333-8504.2008.tb02113.x
See Also
See also tam.mml
.
See the CDM::slca
function in the CDM package
for a similar method.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Dichotomous data | data.sim.rasch
#############################################################################
data(data.sim.rasch)
dat <- data.sim.rasch
# some control arguments
ctl.list <- list(maxiter=100) # increase the number of iterations in applications!
#*** Model 1: Rasch model, normal trait distribution
mod1 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", est.some.slopes=FALSE,
control=ctl.list)
summary(mod1)
#*** Model 2: Rasch model, discrete trait distribution
# choose theta grid
theta.k <- seq( -3, 3, len=7 ) # discrete theta grid distribution
# define symmetric trait distribution
delta.designmatrix <- matrix( 0, nrow=7, ncol=4 )
delta.designmatrix[4,1] <- 1
delta.designmatrix[c(3,5),2] <- 1
delta.designmatrix[c(2,6),3] <- 1
delta.designmatrix[c(1,7),4] <- 1
mod2 <- TAM::tam.mml.3pl(resp=dat, skillspace="discrete", est.some.slopes=FALSE,
theta.k=theta.k, delta.designmatrix=delta.designmatrix, control=ctl.list)
summary(mod2)
#*** Model 3: 2PL model
mod3 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
control=ctl.list, est.variance=FALSE )
summary(mod3)
#*** Model 4: 3PL model
# estimate guessing parameters for items 3,7,9 and 12
I <- ncol(dat)
est.guess <- rep(0, I )
# set parameters 9 and 12 equal -> same integers
est.guess[ c(3,7,9,12) ] <- c( 1, 3, 2, 2 )
# starting values guessing parameter
guess <- .2*(est.guess > 0)
# estimate model
mod4 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
control=ctl.list, est.guess=est.guess, guess=guess, est.variance=FALSE)
summary(mod4)
#--- specification in tamaan
tammodel <- "
LAVAAN MODEL:
F1=~ I1__I40
F1 ~~ 1*F1
I3 + I7 ?=g1
I9 + I12 ?=g912 * g1
"
mod4a <- TAM::tamaan( tammodel, resp=dat, control=list(maxiter=20))
summary(mod4a)
#*** Model 5: 3PL model, add some prior Beta distribution
guess.prior <- matrix( 0, nrow=I, ncol=2 )
guess.prior[ est.guess > 0, 1] <- 5
guess.prior[ est.guess > 0, 2] <- 17
mod5 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
control=ctl.list, est.guess=est.guess, guess=guess, guess.prior=guess.prior)
summary(mod5)
#--- specification in tamaan
tammodel <- "
LAVAAN MODEL:
F1=~ I1__I40
F1 ~~ 1*F1
I3 + I7 ?=g1
I9 + I12 ?=g912 * g1
MODEL PRIOR:
g912 ~ Beta(5,17)
I3_guess ~ Beta(5,17)
I7_guess ~ Beta(5,17)
"
mod5a <- TAM::tamaan( tammodel, resp=dat, control=list(maxiter=20))
#*** Model 6: 2PL model with design matrix for item slopes
I <- 40 # number of items
D <- 1 # dimensions
maxK <- 2 # maximum number of categories
Ngam <- 13 # number of different slope parameters
E <- array( 0, dim=c(I,maxK,D,Ngam) )
# joint slope parameters for items 1 to 10, 11 to 20, 21 to 30
E[ 1:10, 2, 1, 2 ] <- 1
E[ 11:20, 2, 1, 1 ] <- 1
E[ 21:30, 2, 1, 3 ] <- 1
for (ii in 31:40){ E[ii,2,1,ii - 27 ] <- 1 }
# estimate model
mod6 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list, E=E, est.variance=FALSE )
summary(mod6)
#*** Model 6b: Truncated normal prior distribution for slope parameters
gammaslope.prior <- matrix( 0, nrow=Ngam, ncol=4 )
gammaslope.prior[,1] <- 2 # mean
gammaslope.prior[,2] <- 10 # standard deviation
gammaslope.prior[,3] <- -Inf # lower bound
gammaslope.prior[ 4:13,3] <- 1.2
gammaslope.prior[,4] <- Inf # upper bound
# estimate model
mod6b <- TAM::tam.mml.3pl(resp=dat, E=E, est.variance=FALSE,
gammaslope.prior=gammaslope.prior, control=ctl.list )
summary(mod6b)
#*** Model 7: 2PL model with design matrix of slopes and slope constraints
Ngam <- dim(E)[4] # number of gamma parameters
# define two constraint equations
gammaslope.constr.V <- matrix( 0, nrow=Ngam, ncol=2 )
gammaslope.constr.c <- rep(0,2)
# set sum of first two xlambda entries to 1.8
gammaslope.constr.V[1:2,1] <- 1
gammaslope.constr.c[1] <- 1.8
# set sum of entries 4, 5 and 6 to 3
gammaslope.constr.V[4:6,2] <- 1
gammaslope.constr.c[2] <- 3
mod7 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list, E=E, est.variance=FALSE,
gammaslope.constr.V=gammaslope.constr.V, gammaslope.constr.c=gammaslope.constr.c)
summary(mod7)
#**** Model 8: Located latent class Rasch model with estimated three skill points
# three classes of theta's are estimated
TP <- 3
theta.k <- diag(TP)
# because item difficulties are unrestricted, we define the sum of the estimated
# theta points equal to zero
Ngam <- TP # estimate three gamma loading parameters which are discrete theta points
E <- array( 0, dim=c(I,2,TP,Ngam) )
E[,2,1,1] <- E[,2,2,2] <- E[,2,3,3] <- 1
gammaslope.constr.V <- matrix( 1, nrow=3, ncol=1 )
gammaslope.constr.c <- c(0)
# initial gamma values
gammaslope <- c( -2, 0, 2 )
# estimate model
mod8 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list, E=E, skillspace="discrete",
theta.k=theta.k, gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
gammaslope.constr.c=gammaslope.constr.c )
summary(mod8)
#*** Model 9: Multidimensional multiple group model
N <- nrow(dat)
I <- ncol(dat)
group <- c( rep(1,N/4), rep(2,N/4), rep(3,N/2) )
Q <- matrix(0,nrow=I,ncol=2)
Q[ 1:(I/2), 1] <- Q[ seq(I/2+1,I), 2] <- 1
# estimate model
mod9 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", est.some.slopes=FALSE,
group=group, Q=Q)
summary(mod9)
#############################################################################
# EXAMPLE 2: Polytomous data
#############################################################################
data( data.mg, package="CDM")
dat <- data.mg[1:1000, paste0("I",1:11)]
#*******************************************************
#*** Model 1: 1-dimensional 1PL estimation, normal skill distribution
mod1 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal",
gammaslope.des="2PL", est.some.slopes=FALSE, est.variance=TRUE )
summary(mod1)
#*******************************************************
#*** Model 2: 1-dimensional 2PL estimation, discrete skill distribution
# define skill space
theta.k <- matrix( seq(-5,5,len=21) )
# allow skew skill distribution
delta.designmatrix <- cbind( 1, theta.k, theta.k^2, theta.k^3 )
# fix 13th xsi item parameter to zero
xsi.fixed <- cbind( 13, 0 )
# fix 10th slope paremeter to one
gammaslope.fixed <- cbind( 10, 1 )
# estimate model
mod2 <- TAM::tam.mml.3pl(resp=dat, skillspace="discrete", theta.k=theta.k,
delta.designmatrix=delta.designmatrix, gammaslope.des="2PL", xsi.fixed=xsi.fixed,
gammaslope.fixed=gammaslope.fixed)
summary(mod2)
#*******************************************************
#*** Model 3: 2-dimensional 2PL estimation, normal skill distribution
# define loading matrix
Q <- matrix(0,11,2)
Q[1:6,1] <- 1 # items 1 to 6 load on dimension 1
Q[7:11,2] <- 1 # items 7 to 11 load on dimension 2
# estimate model
mod3 <- TAM::tam.mml.3pl(resp=dat, gammaslope.des="2PL", Q=Q )
summary(mod3)
#############################################################################
# EXAMPLE 3: Dichotomous data with guessing
#############################################################################
#*** simulate data
set.seed(9765)
N <- 4000 # number of persons
I <- 20 # number of items
b <- seq( -1.5, 1.5, len=I )
guess <- rep(0, I )
guess.items <- c(6,11,16)
guess[ guess.items ] <- .33
library(sirt)
dat <- sirt::sim.raschtype( stats::rnorm(N), b=b, fixed.c=guess )
#*******************************************************
#*** Model 1: Difficulty + guessing model, i.e. fix slopes to 1
est.guess <- rep(0,I)
est.guess[ guess.items ] <- seq(1, length(guess.items) )
# define prior distribution
guess.prior <- matrix( cbind( 5, 17 ), I, 2, byrow=TRUE )
guess.prior[ ! est.guess, ] <- 0
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, guess=guess, est.guess=est.guess,
guess.prior=guess.prior, control=ctl.list,est.variance=TRUE,
est.some.slopes=FALSE )
summary(mod1)
#*******************************************************
#*** Model 2: estimate a joint guessing parameter
est.guess <- rep(0,I)
est.guess[ guess.items ] <- 1
# estimate model
mod2 <- TAM::tam.mml.3pl(resp=dat, guess=guess, est.guess=est.guess,
guess.prior=guess.prior, control=ctl.list,est.variance=TRUE,
est.some.slopes=FALSE )
summary(mod2)
#############################################################################
# EXAMPLE 4: Latent class model with two classes
# See slca Simulated Example 2 in the CDM package
#############################################################################
#*******************************************************
#*** simulate data
set.seed(9876)
I <- 7 # number of items
# simulate response probabilities
a1 <- round( stats::runif(I,0, .4 ),4)
a2 <- round( stats::runif(I, .6, 1 ),4)
N <- 1000 # sample size
# simulate data in two classes of proportions .3 and .7
N1 <- round(.3*N)
dat1 <- 1 * ( matrix(a1,N1,I,byrow=TRUE) > matrix( stats::runif( N1 * I), N1, I ) )
N2 <- round(.7*N)
dat2 <- 1 * ( matrix(a2,N2,I,byrow=TRUE) > matrix( stats::runif( N2 * I), N2, I ) )
dat <- rbind( dat1, dat2 )
colnames(dat) <- paste0("I", 1:I)
#*******************************************************
# estimation using tam.mml.3pl function
# define design matrices
TP <- 2 # two classes
theta.k <- diag(TP) # there are theta dimensions -> two classes
# The idea is that latent classes refer to two different "dimensions".
# Items load on latent class indicators 1 and 2, see below.
E <- array(0, dim=c(I,2,2,2*I) )
items <- colnames(dat)
dimnames(E)[[4]] <- c(paste0( colnames(dat), "Class", 1),
paste0( colnames(dat), "Class", 2) )
# items, categories, classes, parameters
# probabilities for correct solution
for (ii in 1:I){
E[ ii, 2, 1, ii ] <- 1 # probabilities class 1
E[ ii, 2, 2, ii+I ] <- 1 # probabilities class 2
}
# estimation command
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxit=20), skillspace="discrete",
theta.k=theta.k, notA=TRUE)
summary(mod1)
# compare simulated and estimated data
cbind( mod1$rprobs[,2,1], a2 ) # Simulated class 2
cbind( mod1$rprobs[,2,2], a1 ) # Simulated class 1
#*******************************************************
#** specification with tamaan
tammodel <- "
ANALYSIS:
TYPE=LCA;
NCLASSES(2); # 2 classes
NSTARTS(5,20); # 5 random starts with 20 iterations
LAVAAN MODEL:
F=~ I1__I7
"
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)
# compare with mod1
logLik(mod1)
logLik(mod1b)
#############################################################################
# EXAMPLE 5: Located latent class model, Rasch model
# See slca Simulated Example 4 in the CDM package
#############################################################################
#*** simulate data
set.seed(487)
I <- 15 # I items
b1 <- seq( -2, 2, len=I) # item difficulties
N <- 2000 # number of persons
# simulate 4 theta classes
theta0 <- c( -2.5, -1, 0.3, 1.3 ) # skill classes
probs0 <- c( .1, .4, .2, .3 ) # skill class probabilities
TP <- length(theta0)
theta <- theta0[ rep(1:TP, round(probs0*N) ) ]
library(sirt)
dat <- sirt::sim.raschtype( theta, b1 )
colnames(dat) <- paste0("I",1:I)
#*******************************************************
#*** Model 1: Located latent class model with 4 classes
maxK <- 2
Ngam <- TP
E <- array( 0, dim=c(I, maxK, TP, Ngam ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- paste0("Cat", 1:(maxK) )
dimnames(E)[[3]] <- paste0("Class", 1:TP)
dimnames(E)[[4]] <- paste0("theta", 1:TP)
# theta design
for (tt in 1:TP){ E[1:I, 2, tt, tt] <- 1 }
theta.k <- diag(TP)
# set eighth item difficulty to zero
xsi.fixed <- cbind( 8, 0 )
# initial gamma parameter
gammaslope <- seq( -1.5, 1.5, len=TP)
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, xsi.fixed=xsi.fixed,
control=list(maxiter=100), skillspace="discrete",
theta.k=theta.k, gammaslope=gammaslope)
summary(mod1)
# compare estimated and simulated theta class locations
cbind( mod1$gammaslope, theta0 )
# compare estimated and simulated latent class proportions
cbind( mod1$pi.k, probs0 )
#----- specification using tamaan
tammodel <- "
ANALYSIS:
TYPE=LOCLCA;
NCLASSES(4)
LAVAAN MODEL:
F=~ I1__I15
I8 | 0*t1
"
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)
#############################################################################
# EXAMPLE 6: DINA model with two skills
# See slca Simulated Example 5 in the CDM package
#############################################################################
#*** simulate data
set.seed(487)
N <- 3000 # number of persons
# define Q-matrix
I <- 9 # 9 items
NS <- 2 # 2 skills
TP <- 4 # number of skill classes
Q <- scan(nlines=3, text=
"1 0 1 0 1 0
0 1 0 1 0 1
1 1 1 1 1 1",
)
Q <- matrix(Q, I, ncol=NS, byrow=TRUE)
# define skill distribution
alpha0 <- matrix( c(0,0,1,0,0,1,1,1), nrow=4,ncol=2,byrow=TRUE)
prob0 <- c( .2, .4, .1, .3 )
alpha <- alpha0[ rep( 1:TP, prob0*N),]
# define guessing and slipping parameters
guess <- round( stats::runif(I, 0, .4 ), 2 )
slip <- round( stats::runif(I, 0, .3 ), 2 )
# simulate data according to the DINA model
dat <- CDM::sim.din( q.matrix=Q, alpha=alpha, slip=slip, guess=guess )$dat
#*** Model 1: Estimate DINA model
# define E matrix which contains the anti-slipping parameters
maxK <- 2
Ngam <- I
E <- array( 0, dim=c(I, maxK, TP, Ngam ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- paste0("Cat", 1:(maxK) )
dimnames(E)[[3]] <- c("S00","S10","S01","S11")
dimnames(E)[[4]] <- paste0( "antislip", 1:I )
# define anti-slipping parameters in E
for (ii in 1:I){
# define latent responses
latresp <- 1*( alpha0 %*% Q[ii,]==sum(Q[ii,]) )[,1]
# model slipping parameters
E[ii, 2, latresp==1, ii ] <- 1
}
# skill space definition
theta.k <- diag(TP)
gammaslope <- rep( qlogis( .8 ), I )
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxiter=100), skillspace="discrete",
theta.k=theta.k, gammaslope=gammaslope)
summary(mod1)
# compare estimated and simulated latent class proportions
cbind( mod1$pi.k, probs0 )
# compare estimated and simulated guessing parameters
cbind( mod1$rprobs[,2,1], guess )
# compare estimated and simulated slipping parameters
cbind( 1 - mod1$rprobs[,2,4], slip )
#############################################################################
# EXAMPLE 7: Mixed Rasch model with two classes
# See slca Simulated Example 3 in the CDM package
#############################################################################
#*** simulate data
set.seed(987)
library(sirt)
# simulate two latent classes of Rasch populations
I <- 15 # 6 items
b1 <- seq( -1.5, 1.5, len=I) # difficulties latent class 1
b2 <- b1 # difficulties latent class 2
b2[ c(4,7, 9, 11, 12, 13) ] <- c(1, -.5, -.5, .33, .33, -.66 )
b2 <- b2 - mean(b2)
N <- 3000 # number of persons
wgt <- .25 # class probability for class 1
# class 1
dat1 <- sirt::sim.raschtype( stats::rnorm( wgt*N ), - b1 )
# class 2
dat2 <- sirt::sim.raschtype( stats::rnorm( (1-wgt)*N, mean=1, sd=1.7), - b2 )
dat <- rbind( dat1, dat2 )
# The idea is that each grid point class x theta is defined as new
# dimension. If we approximate the trait distribution by 7 theta points
# and are interested in estimating 2 latent classes, then we need
# 7*2=14 dimensions.
#*** Model 1: Rasch model
# theta grid
theta.k1 <- seq( -5, 5, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(TT)
#-- delta designmatrix
delta.designmatrix <- matrix( 0, TT, ncol=3 )
delta.designmatrix[, 1] <- 1
delta.designmatrix[, 2:3] <- cbind( theta.k1, theta.k1^2 )
#-- define loading matrix E
E <- array( 0, dim=c(I,2,TT,I + 1) ) # last parameter is constant 1
for (ii in 1:I){
E[ ii, 2, 1:TT, ii ] <- -1 # '-b' in '1*theta - b'
E[ ii, 2, 1:TT, I+1] <- theta.k1 # '1*theta' in '1*theta - b'
}
# initial gammaslope parameters
par1 <- stats::qlogis( colMeans( dat ) )
gammaslope <- c( par1, 1 )
# sum constraint of zero on item difficulties
gammaslope.constr.V <- matrix( 0, I+1, 1 )
gammaslope.constr.V[ 1:I, 1] <- 1 # Class 1
gammaslope.constr.c <- c(0)
# fixed gammaslope parameter
gammaslope.fixed <- cbind( I+1, 1 )
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat1, E=E, skillspace="discrete",
theta.k=theta.k, delta.designmatrix=delta.designmatrix,
gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
gammaslope.constr.c=gammaslope.constr.c, gammaslope.fixed=gammaslope.fixed,
notA=TRUE, est.variance=FALSE)
summary(mod1)
#*** Model 2: Mixed Rasch model with two latent classes
# theta grid
theta.k1 <- seq( -4, 4, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(2*TT) # 2*7=14 classes
#-- delta designmatrix
delta.designmatrix <- matrix( 0, 2*TT, ncol=6 )
# Class 1
delta.designmatrix[1:TT, 1] <- 1
delta.designmatrix[1:TT, 2:3] <- cbind( theta.k1, theta.k1^2 )
# Class 2
delta.designmatrix[TT+1:TT, 4] <- 1
delta.designmatrix[TT+1:TT, 5:6] <- cbind( theta.k1, theta.k1^2 )
#-- define loading matrix E
E <- array( 0, dim=c(I,2,2*TT,2*I + 1) ) # last parameter is constant 1
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- c("Cat0","Cat1")
dimnames(E)[[3]] <- c( paste0("Class1_theta", 1:TT), paste0("Class2_theta", 1:TT) )
dimnames(E)[[4]] <- c( paste0("b_Class1_", colnames(dat)),
paste0("b_Class2_", colnames(dat)), "One")
for (ii in 1:I){
# Class 1 item parameters
E[ ii, 2, 1:TT, ii ] <- -1 # '-b' in '1*theta - b'
E[ ii, 2, 1:TT, 2*I+1] <- theta.k1 # '1*theta' in '1*theta - b'
# Class 2 item parameters
E[ ii, 2, TT + 1:TT, I + ii ] <- -1
E[ ii, 2, TT + 1:TT, 2*I+1] <- theta.k1
}
# initial gammaslope parameters
par1 <- qlogis( colMeans( dat ) )
gammaslope <- c( par1, par1 + stats::runif(I, -2,2 ), 1 )
# sum constraint of zero on item difficulties within a class
gammaslope.center.index <- c( rep( 1, I ), rep(2,I), 0 )
gammaslope.center.value <- c(0,0)
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, skillspace="discrete",
theta.k=theta.k, delta.designmatrix=delta.designmatrix,
gammaslope=gammaslope, gammaslope.center.index=gammaslope.center.index,
gammaslope.center.value=gammaslope.center.value, gammaslope.fixed=gammaslope.fixed,
notA=TRUE)
summary(mod1)
# latent class proportions
stats::aggregate( mod1$pi.k, list( rep(1:2, each=TT)), sum )
# compare simulated and estimated item parameters
cbind( b1, b2, - mod1$gammaslope[1:I], - mod1$gammaslope[I + 1:I ] )
#--- specification in tamaan
tammodel <- "
ANALYSIS:
TYPE=MIXTURE;
NCLASSES(2)
NSTARTS(5,30)
LAVAAN MODEL:
F=~ I0001__I0015
ITEM TYPE:
ALL(Rasch);
"
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)
#############################################################################
# EXAMPLE 8: 2PL mixture distribution model
#############################################################################
#*** simulate data
set.seed(9187)
library(sirt)
# simulate two latent classes of Rasch populations
I <- 20
b1 <- seq( -1.5, 1.5, len=I) # difficulties latent class 1
b2 <- b1 # difficulties latent class 2
b2[ c(4,7, 9, 11, 12, 13, 16, 18) ] <- c(1, -.5, -.5, .33, .33, -.66, -1, .3)
# b2 <- scale( b2, scale=FALSE)
b2 <- b2 - mean(b2)
N <- 4000 # number of persons
wgt <- .75 # class probability for class 1
# item slopes
a1 <- rep( 1, I ) # first class
a2 <- rep( c(.5,1.5), I/2 )
# class 1
dat1 <- sirt::sim.raschtype( stats::rnorm( wgt*N ), - b1, fixed.a=a1)
# class 2
dat2 <- sirt::sim.raschtype( stats::rnorm( (1-wgt)*N, mean=1, sd=1.4), - b2, fixed.a=a2)
dat <- rbind( dat1, dat2 )
#*** Model 1: Mixed 2PL model with two latent classes
theta.k1 <- seq( -4, 4, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(2*TT) # 2*7=14 classes
#-- delta designmatrix
delta.designmatrix <- matrix( 0, 2*TT, ncol=6 )
# Class 1
delta.designmatrix[1:TT, 1] <- 1
delta.designmatrix[1:TT, 2:3] <- cbind( theta.k1, theta.k1^2 )
# Class 2
delta.designmatrix[TT+1:TT, 4] <- 1
delta.designmatrix[TT+1:TT, 5:6] <- cbind( theta.k1, theta.k1^2 )
#-- define loading matrix E
E <- array( 0, dim=c(I,2,2*TT,4*I ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- c("Cat0","Cat1")
dimnames(E)[[3]] <- c( paste0("Class1_theta", 1:TT), paste0("Class2_theta", 1:TT) )
dimnames(E)[[4]] <- c( paste0("b_Class1_", colnames(dat)),
paste0("a_Class1_", colnames(dat)),
paste0("b_Class2_", colnames(dat)),
paste0("a_Class2_", colnames(dat)) )
for (ii in 1:I){
# Class 1 item parameters
E[ ii, 2, 1:TT, ii ] <- -1 # '-b' in 'a*theta - b'
E[ ii, 2, 1:TT, I + ii] <- theta.k1 # 'a*theta' in 'a*theta - b'
# Class 2 item parameters
E[ ii, 2, TT + 1:TT, 2*I + ii ] <- -1
E[ ii, 2, TT + 1:TT, 3*I + ii ] <- theta.k1
}
# initial gammaslope parameters
par1 <- scale( - stats::qlogis( colMeans( dat ) ), scale=FALSE )
gammaslope <- c( par1, rep(1,I), scale( par1 + runif(I, - 1.4, 1.4 ),
scale=FALSE), stats::runif( I,.6,1.4) )
# constraint matrix
gammaslope.constr.V <- matrix( 0, 4*I, 4 )
# sum of item intercepts equals zero
gammaslope.constr.V[ 1:I, 1] <- 1 # Class 1 (b)
gammaslope.constr.V[ 2*I + 1:I, 2] <- 1 # Class 2 (b)
# sum of item slopes equals number of items -> mean slope of 1
gammaslope.constr.V[ I + 1:I, 3] <- 1 # Class 1 (a)
gammaslope.constr.V[ 3*I + 1:I, 4] <- 1 # Class 2 (a)
gammaslope.constr.c <- c(0,0,I,I)
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxiter=80), skillspace="discrete",
theta.k=theta.k, delta.designmatrix=delta.designmatrix,
gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
gammaslope.constr.c=gammaslope.constr.c, gammaslope.fixed=gammaslope.fixed,
notA=TRUE)
# estimated item parameters
mod1$gammaslope
# summary
summary(mod1)
# latent class proportions
round( stats::aggregate( mod1$pi.k, list( rep(1:2, each=TT)), sum ), 3 )
# compare simulated and estimated item intercepts
int <- cbind( b1*a1, b2 * a2, - mod1$gammaslope[1:I], - mod1$gammaslope[2*I + 1:I ] )
round( int, 3 )
# simulated and estimated item slopes
slo <- cbind( a1, a2, mod1$gammaslope[I+1:I], mod1$gammaslope[3*I + 1:I ] )
round(slo,3)
#--- specification in tamaan
tammodel <- "
ANALYSIS:
TYPE=MIXTURE;
NCLASSES(2)
NSTARTS(10,25)
LAVAAN MODEL:
F=~ I0001__I0020
"
mod1t <- TAM::tamaan( tammodel, resp=dat )
summary(mod1t)
#############################################################################
# EXAMPLE 9: Toy example: Exact representation of an item by a factor
#############################################################################
data(data.gpcm)
dat <- data.gpcm[,1,drop=FALSE ] # choose first item
# some descriptives
( t1 <- table(dat) )
# The idea is that we define an IRT model with one latent variable
# which extactly corresponds to the manifest item.
I <- 1 # 1 item
K <- 4 # 4 categories
TP <- 4 # 4 discrete theta points
# define skill space
theta.k <- diag(TP)
# define loading matrix E
E <- array( -99, dim=c(I,K,TP,1 ) )
for (vv in 1:K){
E[ 1, vv, vv, 1 ] <- 9
}
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, skillspace="discrete",
theta.k=theta.k, notA=TRUE)
summary(mod1)
# -> the latent distribution corresponds to the manifest distribution, because ...
round( mod1$pi.k, 3 )
round( t1 / sum(t1), 3 )
#############################################################################
# EXAMPLE 10: Some fixed item loadings
#############################################################################
data(data.Students,package="CDM")
dat <- data.Students
# select variables
vars <- scan( nlines=1, what="character")
act1 act2 act3 act4 act5 sc1 sc2 sc3 sc4
dat <- data.Students[, vars ]
# define loading matrix: two-dimensional model
Q <- matrix( 0, nrow=9, ncol=2 )
Q[1:5,1] <- 1
Q[6:9,2] <- 1
# define some fixed item loadings
Q.fixed <- NA*Q
Q.fixed[ c(1,4), 1] <- .5
Q.fixed[ 6:7, 2 ] <- 1
# estimate model
mod3 <- TAM::tam.mml.3pl( resp=dat, gammaslope.des="2PL", Q=Q, Q.fixed=Q.fixed,
control=list( maxiter=10, nodes=seq(-4,4,len=10) ) )
summary(mod3)
#############################################################################
# EXAMPLE 11: Mixed response formats - Multiple choice and partial credit items
#############################################################################
data(data.timssAusTwn.scored)
dat <- data.timssAusTwn.scored
# select columns with item responses
dat <- dat[, grep("M0", colnames(dat) ) ]
I <- ncol(dat) # number of items
# The idea is to start with partial credit modelling
# and then to include the guessing parameters
#*** Model 0: Partial Credit Model
mod0 <- TAM::tam.mml(dat)
summary(mod0)
#*** Model 1 and Model 2: include guessing parameters
# multiple choice items
guess_items <- which( apply( dat, 2, max, na.rm=TRUE )==1 )
# define guessing parameters
guess0 <- rep(0,I)
guess0[ guess_items ] <- .25 # set guessing probability to .25
# define which guessing parameters should be estimated
est.guess1 <- rep(0,I) # all parameters are fixed
est.guess2 <- 1 * ( guess0==.25 ) # joint guessing parameter
# use design matrix from partial credit model
A0 <- mod0$A
#--- Model 1: fixed guessing parameters of .25 and item slopes of 1
mod1 <- TAM::tam.mml.3pl( dat, guess=guess0, est.guess=est.guess1,
A=A0, est.some.slopes=FALSE, control=list(maxiter=50) )
summary(mod1)
#--- Model 2: estimate joint guessing parameters and item slopes of 1
mod2 <- TAM::tam.mml.3pl( dat, guess=guess0, est.guess=est.guess2,
A=A0, est.some.slopes=FALSE, control=list(maxiter=50) )
summary(mod2)
# model comparison
IRT.compareModels(mod0,mod1,mod2)
## End(Not run)