lsdm {sirt}R Documentation

Least Squares Distance Method of Cognitive Validation

Description

This function estimates the least squares distance method of cognitive validation (Dimitrov, 2007; Dimitrov & Atanasov, 2012) which assumes a multiplicative relationship of attribute response probabilities to explain item response probabilities. The argument distance allows the estimation of a squared loss function (distance="L2") and an absolute value loss function (distance="L1").

The function also estimates the classical linear logistic test model (LLTM; Fischer, 1973) which assumes a linear relationship for item difficulties in the Rasch model.

Usage

lsdm(data, Qmatrix, theta=seq(-3,3,by=.5), wgt_theta=rep(1, length(theta)), distance="L2",
   quant.list=c(0.5,0.65,0.8), b=NULL, a=rep(1,nrow(Qmatrix)), c=rep(0,nrow(Qmatrix)) )

## S3 method for class 'lsdm'
summary(object, file=NULL, digits=3, ...)

## S3 method for class 'lsdm'
plot(x, ...)

Arguments

data

An I \times L matrix of dichotomous item responses. The data consists of I item response functions (parametrically or nonparametrically estimated) which are evaluated at a discrete grid of L theta values (person parameters) and are specified in the argument theta.

Qmatrix

An I \times K matrix where the allocation of items to attributes is coded. Values of zero and one and all values between zero and one are permitted. There must not be any items with only zero Q-matrix entries in a row.

theta

The discrete grid points \theta where item response functions are evaluated for doing the LSDM method.

wgt_theta

Optional vector for weights of discrete \theta points

quant.list

A vector of quantiles where attribute response functions are evaluated.

distance

Type of distance function for minimizing the discrepancy between observed and expected item response functions. Options are "L2" which is the squared distance (proposed in the original LSDM formulation in Dimitrov, 2007) and the absolute value distance "L1" (see Details).

b

An optional vector of item difficulties. If it is specified, then no data input is necessary.

a

An optional vector of item discriminations.

c

An optional vector of guessing parameters.

object

Object of class lsdm

file

Optional file name for summary output

digits

Number of digits aftert decimal in summary

...

Further arguments to be passed

x

Object of class lsdm

Details

The least squares distance method (LSDM; Dimitrov 2007) is based on the assumption that estimated item response functions P(X_i=1 | \theta) can be decomposed in a multiplicative way (in the implemented conjunctive model):

P( X_i=1 | \theta ) \approx \prod_{k=1}^K [ P( A_k=1 | \theta ) ]^{q_{ik}}

where P( A_k=1 | \theta ) are attribute response functions and q_{ik} are entries of the Q-matrix. Note that the multiplicative form can be rewritten by taking the logarithm

\log P( X_i=1 | \theta ) \approx \sum_{k=1}^K q_{ik} \log [ P( A_k=1 | \theta ) ]

The item and attribute response functions are evaluated on a grid of \theta values. Using the definitions of matrices \bold{L}=\{ \log P( X_i=1 ) | \theta ) \} , \bold{Q}=\{ q_{ik} \} and \bold{X}=\{ \log P( A_k=1 | \theta ) \} , the estimation problem can be formulated as \bold{L} \approx \bold{Q} \bold{X}. Two different loss functions for minimizing the discrepancy between \bold{L} and \bold{Q} \bold{X} are implemented. First, the squared loss function computes the weighted difference || \bold{L} - \bold{Q} \bold{X}||_2=\sum_i ( l_i - \sum_t q_{it} x_{it})^2 (distance="L2") and has been originally proposed by Dimitrov (2007). Second, the absolute value loss function || \bold{L} - \bold{Q} \bold{X}||_1=\sum_i | l_i - \sum_t q_{it} x_{it} | (distance="L1") is more robust to outliers (i.e., items which show misfit to the assumed multiplicative LSDM formulation).

After fitting the attribute response functions, empirical item-attribute discriminations w_{ik} are calculated as the approximation of the following equation

\log P( X_i=1 | \theta )= \sum_{k=1}^K w_{ik} q_{ik} \log [ P( A_k=1 | \theta ) ]

Value

A list with following entries

mean.mad.lsdm0

Mean of MAD statistics for LSDM

mean.mad.lltm

Mean of MAD statistics for LLTM

attr.curves

Estimated attribute response curves evaluated at theta

attr.pars

Estimated attribute parameters for LSDM and LLTM

data.fitted

LSDM-fitted item response functions evaluated at theta

theta

Grid of ability distributions at which functions are evaluated

item

Item statistics (p value, MAD, ...)

data

Estimated or fixed item response functions evaluated at theta

Qmatrix

Used Q-matrix

lltm

Model output of LLTM (lm values)

W

Matrix with empirical item-attribute discriminations

References

Al-Shamrani, A., & Dimitrov, D. M. (2016). Cognitive diagnostic analysis of reading comprehension items: The case of English proficiency assessment in Saudi Arabia. International Journal of School and Cognitive Psychology, 4(3). 1000196. http://dx.doi.org/10.4172/2469-9837.1000196

DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26 (pp. 979-1030). Amsterdam: Elsevier.

Dimitrov, D. M. (2007). Least squares distance method of cognitive validation and analysis for binary items using their item response theory parameters. Applied Psychological Measurement, 31, 367-387. http://dx.doi.org/10.1177/0146621606295199

Dimitrov, D. M., & Atanasov, D. V. (2012). Conjunctive and disjunctive extensions of the least squares distance model of cognitive diagnosis. Educational and Psychological Measurement, 72, 120-138. http://dx.doi.org/10.1177/0013164411402324

Dimitrov, D. M., Gerganov, E. N., Greenberg, M., & Atanasov, D. V. (2008). Analysis of cognitive attributes for mathematics items in the framework of Rasch measurement. AERA 2008, New York.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374. http://dx.doi.org/10.1016/0001-6918(73)90003-6

Sonnleitner, P. (2008). Using the LLTM to evaluate an item-generating system for reading comprehension. Psychology Science, 50, 345-362.

See Also

Get a summary of the LSDM analysis with summary.lsdm.

See the CDM package for the estimation of related cognitive diagnostic models (DiBello, Roussos & Stout, 2007).

Examples

#############################################################################
# EXAMPLE 1: Dataset Fischer (see Dimitrov, 2007)
#############################################################################

# item difficulties
b <- c( 0.171,-1.626,-0.729,0.137,0.037,-0.787,-1.322,-0.216,1.802,
    0.476,1.19,-0.768,0.275,-0.846,0.213,0.306,0.796,0.089,
    0.398,-0.887,0.888,0.953,-1.496,0.905,-0.332,-0.435,0.346,
    -0.182,0.906)
# read Q-matrix
Qmatrix <- c( 1,1,0,1,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,
    1,0,1,1,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,1,1,0,0,1,0,1,0,1,0,0,0,
    1,0,1,0,1,1,0,0,1,0,1,1,0,1,0,0,1,0,0,1,0,1,0,0,1,0,1,1,1,0,0,0,
    1,0,0,1,0,0,1,0,1,0,0,1,0,0,1,0,1,0,1,0,0,0,1,0,1,1,0,1,0,1,1,0,
    1,0,1,1,0,0,1,0,1,0,0,1,0,0,0,1,1,0,1,1,0,0,0,1,1,0,0,1,0,0,0,1,
    0,1,0,0,0,1,0,1,1,1,0,1,0,1,0,1,1,0,0,1,0,1,0,0,1,1,0,0,1,0,0,0,
    1,0,0,1,1,0,0,0,1,1,0,1,0,0,0,0,1,0,1,1,0,0,0,0,1,0,1,1,0,1,0,0,
    1,1,0,1,0,0,0,0,1,0,1,1,1,1,0,0 )
Qmatrix <- matrix( Qmatrix, nrow=29, byrow=TRUE )
colnames(Qmatrix) <- paste("A",1:8,sep="")
rownames(Qmatrix) <- paste("Item",1:29,sep="")

#* Model 1: perform a LSDM analysis with defaults
mod1 <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod1)
plot(mod1)

#* Model 2: different theta values and weights
theta <- seq(-4,4,len=31)
wgt_theta <- stats::dnorm(theta)
mod2 <- sirt::lsdm( b=b, Qmatrix=Qmatrix, theta=theta, wgt_theta=wgt_theta )
summary(mod2)

#* Model 3: absolute value distance function
mod3 <- sirt::lsdm( b=b, Qmatrix=Qmatrix, distance="L1" )
summary(mod3)

#############################################################################
# EXAMPLE 2: Dataset Henning (see Dimitrov, 2007)
#############################################################################

# item difficulties
b <- c(-2.03,-1.29,-1.03,-1.58,0.59,-1.65,2.22,-1.46,2.58,-0.66)
# item slopes
a <- c(0.6,0.81,0.75,0.81,0.62,0.75,0.54,0.65,0.75,0.54)
# define Q-matrix
Qmatrix <- c(1,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,1,0,0,
    0,0,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,1,1,0,1,0,0 )
Qmatrix <- matrix( Qmatrix, nrow=10, byrow=TRUE )
colnames(Qmatrix) <- paste("A",1:5,sep="")
rownames(Qmatrix) <- paste("Item",1:10,sep="")

# LSDM analysis
mod <- sirt::lsdm( b=b, a=a, Qmatrix=Qmatrix )
summary(mod)

## Not run: 
#############################################################################
# EXAMPLE 3: PISA reading (data.pisaRead)
#    using nonparametrically estimated item response functions
#############################################################################

data(data.pisaRead)
# response data
dat <- data.pisaRead$data
dat <- dat[, substring( colnames(dat),1,1)=="R" ]
# define Q-matrix
pars <- data.pisaRead$item
Qmatrix <- data.frame(  "A0"=1*(pars$ItemFormat=="MC" ),
                  "A1"=1*(pars$ItemFormat=="CR" ) )

# start with estimating the 1PL in order to get person parameters
mod <- sirt::rasch.mml2( dat )
theta <- sirt::wle.rasch( dat=dat,b=mod$item$b )$theta
# Nonparametric estimation of item response functions
mod2 <- sirt::np.dich( dat=dat, theta=theta, thetagrid=seq(-3,3,len=100) )

# LSDM analysis
lmod <- sirt::lsdm( data=mod2$estimate, Qmatrix=Qmatrix, theta=mod2$thetagrid)
summary(lmod)
plot(lmod)

#############################################################################
# EXAMPLE 4: Fraction subtraction dataset
#############################################################################

data( data.fraction1, package="CDM")
data <- data.fraction1$data
q.matrix <- data.fraction1$q.matrix

#****
# Model 1: 2PL estimation
mod1 <- sirt::rasch.mml2( data, est.a=1:nrow(q.matrix) )

# LSDM analysis
lmod1 <- sirt::lsdm( b=mod1$item$b, a=mod1$item$a, Qmatrix=q.matrix )
summary(lmod1)

#****
# Model 2: 1PL estimation
mod2 <- sirt::rasch.mml2(data)

# LSDM analysis
lmod2 <- sirt::lsdm( b=mod1$item$b, Qmatrix=q.matrix )
summary(lmod2)

#############################################################################
# EXAMPLE 5: Dataset LLTM Sonnleitner Reading Comprehension (Sonnleitner, 2008)
#############################################################################

# item difficulties Table 7, p. 355 (Sonnleitner, 2008)
b <- c(-1.0189,1.6754,-1.0842,-.4457,-1.9419,-1.1513,2.0871,2.4874,-1.659,-1.197,-1.2437,
    2.1537,.3301,-.5181,-1.3024,-.8248,-.0278,1.3279,2.1454,-1.55,1.4277,.3301)
b <- b[-21] # remove Item 21

# Q-matrix Table 9, p. 357 (Sonnleitner, 2008)
Qmatrix <- scan()
   1 0 0 0 0 0 0 7 4 0 0 0   0 1 0 0 0 0 0 5 1 0 0 0   1 1 0 1 0 0 0 9 1 0 1 0
   1 1 1 0 0 0 0 5 2 0 1 0   1 1 0 0 1 0 0 7 5 1 1 0   1 1 0 0 0 0 0 7 3 0 0 0
   0 1 0 0 0 0 2 6 1 0 0 0   0 0 0 0 0 0 2 6 1 0 0 0   1 0 0 0 0 0 1 7 4 1 0 0
   0 1 0 0 0 0 0 6 2 1 1 0   0 1 0 0 0 1 0 7 3 1 0 0   0 1 0 0 0 0 0 5 1 0 0 0
   0 0 0 0 0 1 0 4 1 0 0 1   0 0 0 0 0 0 0 6 1 0 1 1   0 0 1 0 0 0 0 6 3 0 1 1
   0 0 0 1 0 0 1 7 5 0 0 1   0 1 0 0 0 0 1 2 2 0 0 1   0 1 1 0 0 0 1 4 1 0 0 1
   0 1 0 0 1 0 0 5 1 0 0 1   0 1 0 0 0 0 1 7 2 0 0 1   0 0 0 0 0 1 0 5 1 0 0 1

Qmatrix <- matrix( as.numeric(Qmatrix), nrow=21, ncol=12, byrow=TRUE )
colnames(Qmatrix) <- scan( what="character", nlines=1)
   pc ic ier inc iui igc ch nro ncro td a t

# divide Q-matrix entries by maximum in each column
Qmatrix <- round(Qmatrix / matrix(apply(Qmatrix,2,max),21,12,byrow=TRUE),3)
# LSDM analysis
mod <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod)

#############################################################################
# EXAMPLE 6: Dataset Dimitrov et al. (2008)
#############################################################################

Qmatrix <- scan()
1 0 0 0 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0

Qmatrix <- matrix(Qmatrix, ncol=4, byrow=TRUE)
colnames(Qmatrix) <- paste0("A",1:4)
rownames(Qmatrix) <- paste0("I",1:9)

b <- scan()
0.068 1.095 -0.641 -1.129 -0.061 1.218 1.244 -0.648 -1.146

# estimate model
mod <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod)
plot(mod)

#############################################################################
# EXAMPLE 7: Dataset Al-Shamrani & Dimitrov et al. (2017)
#############################################################################

I <- 39  # number of items

Qmatrix <- scan()
0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0
0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 1 0

Qmatrix <- matrix(Qmatrix, nrow=I, byrow=TRUE)
colnames(Qmatrix) <- paste0("A",1:7)
rownames(Qmatrix) <- paste0("I",1:I)

pars <- scan()
1.952 0.9833 0.1816 1.1053 0.9631 0.1653 1.3904 1.3208 0.2545 0.7391 1.9367 0.2083 2.0833
1.8627 0.1873 1.4139 1.0107 0.2454 0.8274 0.9913 0.2137 1.0338 -0.0068 0.2368 2.4803
0.7939 0.1997 1.4867 1.1705 0.2541 1.4482 1.4176 0.2889 1.0789 0.8062 0.269 1.6258 1.1739
0.1723 1.5995 1.0936 0.2054 1.1814 1.0909 0.2623 2.0389 1.5023 0.2466 1.3636 1.1485 0.2059
1.8468 1.2755 0.192 1.9461 1.4947 0.2001 1.194 0.0889 0.2275 1.2114 0.8925 0.2367 2.0912
0.5961 0.2036 2.5769 1.3014 0.186 1.4554 1.2529 0.2423 1.4919 0.4763 0.2482 2.6787 1.7069
0.1796 1.5611 1.3991 0.2312 1.4353 0.678 0.1851 0.9127 1.3523 0.2525 0.6886 -0.3652 0.207
0.7039 -0.2494 0.2315 1.3683 0.8953 0.2326 1.4992 0.1025 0.2403 1.0727 0.2591 0.2152
1.3854 1.3802 0.2448 0.7748 0.4304 0.184 1.0218 1.8964 0.1949 1.5773 1.8934 0.2231 0.8631
1.4145 0.2132

pars <- matrix(pars, nrow=I, byrow=TRUE)
colnames(pars) <- c("a","b","c")
rownames(pars) <- paste0("I",1:I)
pars <- as.data.frame(pars)

#* Model 1: fit LSDM to 3PL curves (as in Al-Shamrani)
mod1 <- sirt::lsdm(b=pars$b, a=pars$a, c=pars$c, Qmatrix=Qmatrix)
summary(mod1)
plot(mod1)

#* Model 2: fit LSDM to 2PL curves
mod2 <- sirt::lsdm(b=pars$b, a=pars$a, Qmatrix=Qmatrix)
summary(mod2)
plot(mod2)

## End(Not run)

[Package sirt version 4.1-15 Index]