R: Regularized Latent Class Analysis

reglca {CDM}

R Documentation

Regularized Latent Class Analysis

Description

Estimates a regularized latent class model for dichotomous responses (Chen, Liu, Xu, & Ying, 2015; Chen, Li, Liu, & Ying, 2017). The SCAD and MCP penalty functions are available.

Usage

reglca(dat, nclasses, weights=NULL, group=NULL, regular_type="scad",
   regular_lam=0, sd_noise_init=1, item_probs_init=NULL, class_probs_init=NULL,
   random_starts=1, random_iter=20, conv=1e-05, h=1e-04, mstep_iter=10,
   maxit=1000, verbose=TRUE, prob_min=.0001)

## S3 method for class 'reglca'
summary(object, digits=4, file=NULL,  ...)

Arguments

`dat`	Matrix with dichotomous item responses. `NA`s are allowed.
`nclasses`	Number of classes
`weights`	Optional vector of sampling weights
`group`	Optional vector for grouping variable
`regular_type`	Regularization type. Can be `scad` or `mcp`. See `gdina` for more information.
`regular_lam`	Regularization parameter `\lambda`. The default of 0 fits the model without regularization.
`sd_noise_init`	Standard deviation for amount of noise in generating random starting values
`item_probs_init`	Optional matrix of initial item response probabilities
`class_probs_init`	Optional vector of initial class probabilities
`random_starts`	Number of random starts
`random_iter`	Number of initial iterations for random starts
`conv`	Convergence criterion
`h`	Numerical differentiation parameter
`mstep_iter`	Number of iterations in the M-step
`maxit`	Maximum number of iterations
`verbose`	Logical indicating whether convergence progress should be displayed
`prob_min`	Lower bound for probabilities in estimation
`object`	A required object of class `reglca`, obtained from a call to the function `reglca`.
`digits`	Number of digits after decimal separator to display.
`file`	Optional name of a file to which the `summary` output is written.
`...`	Further arguments to be passed.

Details

The regularized latent class model for dichotomous item responses assumes C latent classes. The item response probabilities P(X_i=1|c)=p_{ic} are estimated such that the number of distinct p_{ic} values per item is minimized. This eases interpretation and makes it possible to recover the structure of a true (but unknown) cognitive diagnostic model.
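
To illustrate why the SCAD and MCP penalties favor equal probabilities within an item, the following base R sketch evaluates both penalties in their usual textbook form. It is not taken from the CDM source: the function names `scad_penalty` and `mcp_penalty`, the constants `a=3.7` and `a=3`, and the assumption that the penalty acts on differences between an item's class-specific probabilities are all illustrative choices.

```r
# Illustrative sketch only: textbook SCAD and MCP penalties in base R.
# Function names and the constants a=3.7 / a=3 are assumptions made for
# illustration; they are not taken from the CDM source code.
scad_penalty <- function(x, lambda, a=3.7){
    ax <- abs(x)
    ifelse( ax <= lambda, lambda*ax,
        ifelse( ax <= a*lambda,
            -( ax^2 - 2*a*lambda*ax + lambda^2 ) / ( 2*(a-1) ),
            (a+1)*lambda^2 / 2 ) )
}

mcp_penalty <- function(x, lambda, a=3){
    ax <- abs(x)
    ifelse( ax <= a*lambda, lambda*ax - ax^2/(2*a), a*lambda^2/2 )
}

# hypothetical differences between two class-specific probabilities of one item
d <- c(0, .02, .05, .10, .30, .60)
lambda <- 0.08
round( cbind( d=d, scad=scad_penalty(d, lambda), mcp=mcp_penalty(d, lambda) ), 4 )
```

Both penalties grow for small differences and become constant for large ones, so small differences are pushed toward zero while clearly separated probabilities are left largely unshrunk. With `lambda=0` both functions return zero everywhere, which corresponds to an unpenalized fit.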

Value

A list containing the following elements (selection):

`item_probs`	Item response probabilities
`class_probs`	Latent class probabilities
`p.aj.xi`	Individual posterior
`p.xi.aj`	Individual likelihood
`loglike`	Log-likelihood value
`Npars`	Number of estimated parameters
`Nskillpar`	Number of skill class parameters
`G`	Number of groups
`n.ik`	Expected counts
`Nipar`	Number of item parameters
`n_reg`	Number of regularized parameters
`n_reg_item`	Number of regularized parameters per item
`item`	Data frame with item parameters
`pjk`	Item response probabilities (in an array)
`N`	Number of persons
`I`	Number of items
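
As a rough check of how much structure the regularization has imposed, one can count the distinct response probabilities per item. The sketch below uses a hypothetical helper `count_distinct_probs` (not part of CDM) and made-up probabilities; it assumes that the matrix has items in rows and classes in columns, which should be verified for `item_probs` before applying it to a fitted model.

```r
# Hypothetical helper (not part of CDM): count distinct response probabilities
# per item after rounding. Assumes rows are items and columns are classes.
count_distinct_probs <- function(item_probs, digits=3){
    apply( round(item_probs, digits), 1, function(p){ length(unique(p)) } )
}

# made-up probabilities for 3 items and 4 classes (illustration only)
probs <- matrix( c( .10, .10, .90, .90,
                    .20, .90, .20, .90,
                    .15, .15, .15, .88 ), nrow=3, byrow=TRUE,
                 dimnames=list( paste0("I",1:3), paste0("Class",1:4) ) )
count_distinct_probs(probs)

# with a fitted model 'mod' (assumed name), one might call:
# count_distinct_probs(mod$item_probs)
```

Rounding before counting is a design choice: estimated probabilities that the penalty has fused may still differ in late decimal places, so `digits` controls how close two values must be to count as equal.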

References

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110, 850-866.

Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysis with application in cognitive diagnosis. Psychometrika, 82, 660-692.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: Estimating a regularized LCA for DINA data
#############################################################################

#---- simulate data
I <- 12  # number of items
# define Q-matrix
q.matrix <- matrix(0,I,2)
q.matrix[ 1:(I/3), 1 ] <- 1
q.matrix[ I/3 + 1:(I/3), 2 ] <- 1
q.matrix[ 2*I/3 + 1:(I/3), c(1,2) ] <- 1
N <- 1000  # number of persons
guess <- rep(seq(.1,.3,length.out=I/3), 3)
slip <- .1
rho <- 0.3  # skill correlation
set.seed(987)
dat <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip,
           mean=0*c( .2, -.2 ), Sigma=matrix( c( 1, rho,rho,1), 2, 2 ) )
dat <- dat$dat

#--- Model 1: Four latent classes without regularization
mod1 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0, random_starts=3,
               random_iter=10, conv=1E-4)
summary(mod1)

#--- Model 2: Four latent classes with regularization and lambda=.08
mod2 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.08, regular_type="scad",
               random_starts=3, random_iter=10, conv=1E-4)
summary(mod2)

#--- Model 3: Four latent classes with regularization and lambda=.05 with warm start

# "warm start" -> use initial parameters from fitted model with higher lambda value
item_probs_init <- mod2$item_probs
class_probs_init <- mod2$class_probs
mod3 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.05, regular_type="scad",
               item_probs_init=item_probs_init, class_probs_init=class_probs_init,
               random_starts=3, random_iter=10, conv=1E-4)
summary(mod3)

## End(Not run)

[Package CDM version 8.2-6 Index]