rm.sdt {sirt}

R Documentation

Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)

Description

This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).

Usage

rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
    est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE, est.sigma=TRUE,
    skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
    d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
    d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
    link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
    globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )

## S3 method for class 'rm.sdt'
summary(object, file=NULL, ...)

## S3 method for class 'rm.sdt'
plot(x, ask=TRUE, ...)

## S3 method for class 'rm.sdt'
anova(object,...)

## S3 method for class 'rm.sdt'
logLik(object,...)

## S3 method for class 'rm.sdt'
IRT.factor.scores(object, type="EAP", ...)

## S3 method for class 'rm.sdt'
IRT.irfprob(object,...)

## S3 method for class 'rm.sdt'
IRT.likelihood(object,...)

## S3 method for class 'rm.sdt'
IRT.posterior(object,...)

## S3 method for class 'rm.sdt'
IRT.modelfit(object,...)

## S3 method for class 'IRT.modelfit.rm.sdt'
summary(object,...)

Arguments

`dat`	Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination.
`pid`	Person identifier.
`rater`	Rater identifier.
`Qmatrix`	An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of `K`) is used.
`theta.k`	A grid of theta values for the ability distribution.
`est.a.item`	Should item parameters `a_i` be estimated?
`est.c.rater`	Type of estimation for item-rater parameters `c_{ir}` in the signal detection model. Options are `'n'` (no estimation), `'e'` (set all parameters equal to each other), `'i'` (itemwise estimation), `'r'` (raterwise estimation) and `'a'` (all parameters are estimated independently of each other).
`est.d.rater`	Type of estimation of `d` parameters. Options are the same as in `est.c.rater`.
`est.mean`	Optional logical indicating whether the mean of the trait distribution should be estimated.
`est.sigma`	Optional logical indicating whether the standard deviation of the trait distribution should be estimated.
`skillspace`	Specified `\theta` distribution type. It can be `"normal"` or `"discrete"`. In the latter case, all probabilities of the distribution are separately estimated.
`tau.item.fixed`	Optional matrix with three columns specifying fixed `\tau` parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3.
`a.item.fixed`	Optional matrix with two columns specifying fixed `a` parameters. First column: Item index. Second column: Fixed `a` parameter.
`d.min`	Minimal `d` parameter to be estimated
`d.max`	Maximal `d` parameter to be estimated
`d.start`	Starting value(s) of `d` parameters
`c.start`	Starting values of `c` parameters
`tau.start`	Starting values of `\tau` parameters
`sd.start`	Starting value for trait standard deviation
`d.prior`	Normal prior `N(M,S^2)` for `d` parameters
`c.prior`	Normal prior for `c` parameters. The prior mean for parameter `c_{irk}` is defined as `M \cdot ( k - 0.5)` where `M` is `c.prior[1]`.
`tau.prior`	Normal prior for `\tau` parameters
`a.prior`	Normal prior for `a` parameters
`link_item`	Type of item response function for latent responses. Can be `"GPCM"` for the generalized partial credit model or `"GRM"` for the graded response model.
`max.increment`	Maximum increment of item parameters during estimation
`numdiff.parm`	Numerical differentiation step width
`maxdevchange`	Maximum relative deviance change as a convergence criterion
`globconv`	Maximum parameter change
`maxiter`	Maximum number of iterations
`msteps`	Maximum number of iterations during an M step
`mstepconv`	Convergence criterion in an M step
`optimizer`	Choice of optimization function in M-step for item parameters. Options are `"nlminb"` for `stats::nlminb` and `"optim"` for `stats::optim`.
`object`	Object of class `rm.sdt`
`file`	Optional file name in which summary should be written.
`x`	Object of class `rm.sdt`
`ask`	Optional logical indicating whether a new plot should be asked for.
`type`	Factor score estimation method. Currently, only `type="EAP"` is supported.
`...`	Further arguments to be passed
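
The long data layout expected by `dat`, `pid` and `rater` can be sketched with a small toy data set. The column names (`idstud`, `rater`, `k1`, `k2`) and all values below are illustrative assumptions, not part of the package; the final call is shown only as a comment and is not run.

```r
# Toy data in the long layout: one row per person-rater combination.
# Column names and ratings are made up for illustration.
set.seed(1)
design <- expand.grid(idstud = 1:4, rater = c("R1", "R2"),
                      stringsAsFactors = FALSE)
toy <- data.frame(design,
                  k1 = sample(0:3, nrow(design), replace = TRUE),
                  k2 = sample(0:3, nrow(design), replace = TRUE))
toy <- toy[order(toy$idstud, toy$rater), ]

# Each person-rater pair occurs exactly once
stopifnot(!any(duplicated(toy[, c("idstud", "rater")])))

# The arguments would then be passed as (not run here):
# sirt::rm.sdt(dat = toy[, c("k1", "k2")], pid = toy$idstud, rater = toy$rater)
```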

Details

The specification of the model follows DeCarlo et al. (2011). The second level models the ideal rating (latent response) \eta=0, \ldots, K of person p on item i. The option link_item='GPCM' follows the generalized partial credit model

P( \eta_{pi}=\eta | \theta_p ) \propto exp( a_{i} q_{i \eta } \theta_p - \tau_{i \eta } ).

The option link_item='GRM' employs the graded response model

P( \eta_{pi}=\eta | \theta_p )= \Psi( \tau_{i,\eta + 1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p ).
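
As a minimal sketch of the two link functions for the latent response, the following base R code evaluates both formulas for a single item at one value of theta. It is not the package's internal code. The function names `gpcm_probs` and `grm_probs` are hypothetical, and the sketch assumes that \Psi is the logistic distribution function and that the GRM uses boundary thresholds of -Inf and +Inf; all parameter values are made up.

```r
# Latent-response category probabilities for one item at one theta value.
# Assumptions (not taken from the package): Psi is the logistic cdf and
# the GRM boundary thresholds are -Inf and +Inf.

gpcm_probs <- function(theta, a, q, tau) {
  # q, tau: vectors over categories eta = 0, ..., K
  z <- a * q * theta - tau
  p <- exp(z - max(z))      # subtract the maximum for numerical stability
  p / sum(p)                # normalize the proportionality
}

grm_probs <- function(theta, a, tau) {
  # tau: K increasing thresholds tau_1 < ... < tau_K
  cum <- plogis(c(-Inf, tau, Inf) - a * theta)  # Psi(tau_eta - a * theta)
  diff(cum)                                     # P(eta = 0), ..., P(eta = K)
}

theta <- 0.5
p_gpcm <- gpcm_probs(theta, a = 1, q = 0:3, tau = c(0, -0.5, 0.2, 1.0))
p_grm  <- grm_probs(theta, a = 1, tau = c(-1, 0, 1.5))
```

Both vectors contain K+1 probabilities that sum to one.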

At the first level, the ratings X_{pir} for person p on item i and rater r are modeled as a signal detection model

P( X_{pir} \le k | \eta_{pi} )= G( c_{irk} - d_{ir} \eta_{pi} )

where G is the logistic distribution function and the categories are k=1,\ldots, K+1. Note that the model can be equivalently written as

P( X_{pir} > k | \eta_{pi} )= G( d_{ir} \eta_{pi} - c_{irk})

The thresholds c_{irk} can be further restricted to c_{irk}=c_{k} (est.c.rater='e'), c_{irk}=c_{ik} (est.c.rater='i') or c_{irk}=c_{rk} (est.c.rater='r'). The same holds for rater precision parameters d_{ir}.
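
The first-level signal detection model can be illustrated by differencing the cumulative probabilities P( X_{pir} \le k | \eta_{pi} ). The following is a sketch in base R, not the package's implementation: the function name `sdt_probs` is hypothetical, and it assumes K finite thresholds per item-rater pair so that ratings fall into K+1 ordered categories. The threshold values follow the pattern 3 \cdot (k - 0.5) only as an illustration.

```r
# Rating probabilities P(X = k | eta) from the cumulative logistic SDT model.
# Assumption: c_thresh holds the K finite thresholds c_1 < ... < c_K of one
# item-rater pair, so ratings fall into K + 1 ordered categories.
sdt_probs <- function(eta, c_thresh, d) {
  cum <- plogis(c(c_thresh, Inf) - d * eta)  # P(X <= k | eta), last entry is 1
  diff(c(0, cum))                            # P(X = k | eta)
}

c_thresh <- c(1.5, 4.5, 7.5)  # illustrative values, 3 * (k - 0.5)
probs <- t(sapply(0:3, sdt_probs, c_thresh = c_thresh, d = 3))
rownames(probs) <- paste0("eta=", 0:3)
round(probs, 3)
```

Each row of `probs` is the rating distribution for one latent response. With these illustrative values, the most likely rating category is the one matching the latent response, and a larger `d` would concentrate the probabilities further.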

Value

A list with the following entries:

`deviance`	Deviance
`ic`	Information criteria and number of parameters
`item`	Data frame with item parameters. The columns `N` and `M` denote the number of observed ratings and the observed mean of all ratings, respectively. In addition to item parameters `\tau_{ik}` and `a_i`, the mean for the latent response (`latM`) is computed as `E( \eta_i )=\sum_p P( \theta_p ) \sum_k q_{ik} P( \eta_i=k | \theta_p )` which provides an item parameter at the original metric of ratings. The latent standard deviation (`latSD`) is computed in the same manner.
`rater`	Data frame with rater parameters. Transformed `c` parameters (`c_x.trans`) are computed as `c_{irk} / ( d_{ir} )`.
`person`	Data frame with person parameters: EAP and corresponding standard errors
`EAP.rel`	EAP reliability
`mu`	Mean of the trait distribution
`sigma`	Standard deviation of the trait distribution
`tau.item`	Item parameters `\tau_{ik}`
`se.tau.item`	Standard error of item parameters `\tau_{ik}`
`a.item`	Item slopes `a_i`
`se.a.item`	Standard error of item slopes `a_i`
`c.rater`	Rater parameters `c_{irk}`
`se.c.rater`	Standard error of rater severity parameter `c_{irk}`
`d.rater`	Rater slope parameter `d_{ir}`
`se.d.rater`	Standard error of rater slope parameter `d_{ir}`
`f.yi.qk`	Individual likelihood
`f.qk.yi`	Individual posterior distribution
`probs`	Item probabilities at grid `theta.k`. Note that these probabilities are calculated on the pseudo items `i \times r`, i.e. the interaction of item and rater.
`prob.item`	Probabilities `P( \eta_i=\eta | \theta )` of latent item responses evaluated at the theta grid `theta.k`.
`n.ik`	Expected counts
`pi.k`	Estimated trait distribution `P(\theta_p)`.
`maxK`	Maximum number of categories
`procdata`	Processed data
`iter`	Number of iterations
`...`	Further values
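
The derived quantities `latM`, `latSD` and the transformed `c` parameters can be illustrated with made-up numbers. This is a sketch under stated assumptions, not the package's code: the trait distribution is a discretized standard normal, the latent response follows a GPCM with slope 1 and hypothetical intercepts, and `latSD` is taken to be the square root of the marginal variance of the latent response. In a fitted model the inputs would come from the output entries, whose exact array layout is not assumed here.

```r
# Toy illustration of
#   latM = E(eta_i) = sum_p P(theta_p) sum_k q_ik P(eta_i = k | theta_p)
# All numbers are made up for illustration.
theta_k <- seq(-4, 4, length.out = 21)
pi_k <- dnorm(theta_k)
pi_k <- pi_k / sum(pi_k)               # discretized N(0,1) trait distribution

q   <- 0:2                             # category scores q_ik
tau <- c(0, -0.3, 0.4)                 # hypothetical GPCM intercepts
# rows: theta grid points, columns: categories
prob_item <- t(sapply(theta_k, function(th) {
  z <- q * th - tau
  exp(z) / sum(exp(z))
}))

latM  <- sum(pi_k * (prob_item %*% q))
# Assumption: latSD is the square root of the marginal variance
latSD <- sqrt(sum(pi_k * (prob_item %*% q^2)) - latM^2)

# Transformed thresholds as described for the rater data frame: c / d
c_trans <- c(1.5, 4.5) / 3
```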

References

DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.

DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.

DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.

Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.

Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.

Examples

#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################
data(data.ratings1)
dat <- data.ratings1

## Not run: 
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", d.start=100,  est.d.rater="n" )
summary(mod1)

# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", est.d.rater="n",
            est.a.item=TRUE, d.start=100)
summary(mod2)

# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3,  100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )
# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
            tau.item.fixed=tau.item.fixed,a.item.fixed=a.item.fixed, pid=dat$idstud,
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)

## End(Not run)

[Package sirt version 4.1-15 Index]