rm.sdt {sirt}R Documentation

Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)

Description

This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).

Usage

rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
    est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE, est.sigma=TRUE,
    skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
    d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
    d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
    link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
    globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )

## S3 method for class 'rm.sdt'
summary(object, file=NULL, ...)

## S3 method for class 'rm.sdt'
plot(x, ask=TRUE, ...)

## S3 method for class 'rm.sdt'
anova(object,...)

## S3 method for class 'rm.sdt'
logLik(object,...)

## S3 method for class 'rm.sdt'
IRT.factor.scores(object, type="EAP", ...)

## S3 method for class 'rm.sdt'
IRT.irfprob(object,...)

## S3 method for class 'rm.sdt'
IRT.likelihood(object,...)

## S3 method for class 'rm.sdt'
IRT.posterior(object,...)

## S3 method for class 'rm.sdt'
IRT.modelfit(object,...)

## S3 method for class 'IRT.modelfit.rm.sdt'
summary(object,...)

Arguments

dat

Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination.

pid

Person identifier.

rater

Rater identifier.

Qmatrix

An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of KK) is used.

theta.k

A grid of theta values for the ability distribution.

est.a.item

Should item parameters aia_i be estimated?

est.c.rater

Type of estimation for item-rater parameters circ_{ir} in the signal detection model. Options are 'n' (no estimation), 'e' (set all parameters equal to each other), 'i' (itemwise estimation), 'r' (rater wise estimation) and 'a' (all parameters are estimated independently from each other).

est.d.rater

Type of estimation of dd parameters. Options are the same as in est.c.rater.

est.mean

Optional logical indicating whether the mean of the trait distribution should be estimated.

est.sigma

Optional logical indicating whether the standard deviation of the trait distribution should be estimated.

skillspace

Specified θ\theta distribution type. It can be "normal" or "discrete". In the latter case, all probabilities of the distribution are separately estimated.

tau.item.fixed

Optional matrix with three columns specifying fixed τ\tau parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3.

a.item.fixed

Optional matrix with two columns specifying fixed aa parameters. First column: Item index. Second column: Fixed aa parameter.

d.min

Minimal dd parameter to be estimated

d.max

Maximal dd parameter to be estimated

d.start

Starting value(s) of dd parameters

c.start

Starting values of cc parameters

tau.start

Starting values of τ\tau parameters

sd.start

Starting value for trait standard deviation

d.prior

Normal prior N(M,S2)N(M,S^2) for dd parameters

c.prior

Normal prior for cc parameters. The prior for parameter cirkc_{irk} is defined as M(k0.5)M \cdot ( k - 0.5) where MM is c.prior[1].

tau.prior

Normal prior for τ\tau parameters

a.prior

Normal prior for aa parameters

link_item

Type of item response function for latent responses. Can be "GPCM" for the generalized partial credit model or "GRM" for the graded response model.

max.increment

Maximum increment of item parameters during estimation

numdiff.parm

Numerical differentiation step width

maxdevchange

Maximum relative deviance change as a convergence criterion

globconv

Maximum parameter change

maxiter

Maximum number of iterations

msteps

Maximum number of iterations during an M step

mstepconv

Convergence criterion in an M step

optimizer

Choice of optimization function in M-step for item parameters. Options are "nlminb" for stats::nlminb and "optim" for stats::optim.

object

Object of class rm.sdt

file

Optional file name in which summary should be written.

x

Object of class rm.sdt

ask

Optional logical indicating whether a new plot should be asked for.

type

Factor score estimation method. Up to now, only type="EAP" is supported.

...

Further arguments to be passed

Details

The specification of the model follows DeCarlo et al. (2011). The second level models the ideal rating (latent response) η=0,...,K\eta=0, ...,K of person pp on item ii. The option link_item='GPCM' follows the generalized partial credit model

P(ηpi=ηθp)exp(aiqiηθpτiη) P( \eta_{pi}=\eta | \theta_p ) \propto exp( a_{i} q_{i \eta } \theta_p - \tau_{i \eta } )

. The option link_item='GRM' employs the graded response model

P(ηpi=ηθp)=Ψ(τi,η+1aiθp)Ψ(τi,ηaiθp) P( \eta_{pi}=\eta | \theta_p )= \Psi( \tau_{i,\eta + 1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p )

At the first level, the ratings XpirX_{pir} for person pp on item ii and rater rr are modeled as a signal detection model

P(Xpirkηpi)=G(cirkdirηpi) P( X_{pir} \le k | \eta_{pi} )= G( c_{irk} - d_{ir} \eta_{pi} )

where GG is the logistic distribution function and the categories are k=1,,K+1k=1,\ldots, K+1. Note that the item response model can be equivalently written as

P(Xpirkηpi)=G(dirηpicirk) P( X_{pir} \ge k | \eta_{pi} )= G( d_{ir} \eta_{pi} - c_{irk})

The thresholds cirkc_{irk} can be further restricted to cirk=ckc_{irk}=c_{k} (est.c.rater='e'), cirk=cikc_{irk}=c_{ik} (est.c.rater='i') or cirk=circ_{irk}=c_{ir} (est.c.rater='r'). The same holds for rater precision parameters dird_{ir}.

Value

A list with following entries:

deviance

Deviance

ic

Information criteria and number of parameters

item

Data frame with item parameters. The columns N and M denote the number of observed ratings and the observed mean of all ratings, respectively.
In addition to item parameters τik\tau_{ik} and aia_i, the mean for the latent response (latM) is computed as E(ηi)=pP(θp)qikP(ηi=kθp)E( \eta_i )=\sum_p P( \theta_p ) q_{ik} P( \eta_i=k | \theta_p ) which provides an item parameter at the original metric of ratings. The latent standard deviation (latSD) is computed in the same manner.

rater

Data frame with rater parameters. Transformed cc parameters (c_x.trans) are computed as cirk/(dir)c_{irk} / ( d_{ir} ).

person

Data frame with person parameters: EAP and corresponding standard errors

EAP.rel

EAP reliability

EAP.rel

EAP reliability

mu

Mean of the trait distribution

sigma

Standard deviation of the trait distribution

tau.item

Item parameters τik\tau_{ik}

se.tau.item

Standard error of item parameters τik\tau_{ik}

a.item

Item slopes aia_i

se.a.item

Standard error of item slopes aia_i

c.rater

Rater parameters cirkc_{irk}

se.c.rater

Standard error of rater severity parameter cirkc_{irk}

d.rater

Rater slope parameter dird_{ir}

se.d.rater

Standard error of rater slope parameter dird_{ir}

f.yi.qk

Individual likelihood

f.qk.yi

Individual posterior distribution

probs

Item probabilities at grid theta.k. Note that these probabilities are calculated on the pseudo items i×ri \times r, i.e. the interaction of item and rater.

prob.item

Probabilities P(ηi=ηθ)P( \eta_i=\eta | \theta ) of latent item responses evaluated at theta grid θp\theta_p.

n.ik

Expected counts

pi.k

Estimated trait distribution P(θp)P(\theta_p).

maxK

Maximum number of categories

procdata

Processed data

iter

Number of iterations

...

Further values

References

DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.

DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.

DeCarlo, T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.

Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.

Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.

See Also

The facets rater model can be estimated with rm.facets.

Examples

#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################
data(data.ratings1)
dat <- data.ratings1

## Not run: 
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", d.start=100,  est.d.rater="n" )
summary(mod1)

# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", est.d.rater="n",
            est.a.item=TRUE, d.start=100)
summary(mod2)

# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3,  100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )
# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
            tau.item.fixed=tau.item.fixed,a.item.fixed=a.item.fixed, pid=dat$idstud,
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)

## End(Not run)

[Package sirt version 4.1-15 Index]