genPolyMatrix {catR} | R Documentation |
Item bank generation (polytomous models)
Description
This command generates an item bank from prespecified parent distributions for use with polytomous IRT models. Subgroups of items can also be specified for content balancing purposes.
Usage
genPolyMatrix(items = 100, nrCat = 3, model = "GRM", seed = 1, same.nrCat = FALSE,
cbControl = NULL)
Arguments
items | integer: the number of items to generate (default is 100). |
nrCat | integer: the (maximum) number of response categories to generate (default is 3). |
model | character: the type of polytomous IRT model. Possible values are "GRM" (default), "MGRM", "PCM", "GPCM", "RSM" and "NRM". See Details. |
seed | numeric: the random seed for item parameter generation (default is 1). |
same.nrCat | logical: should all items have the same number of response categories? (default is FALSE). See Details. |
cbControl | either a list of appropriate format to control for content balancing, or NULL (default). See Details. |
Details
The genPolyMatrix function makes it possible to quickly generate a polytomous item bank in a suitable format for further use, e.g. for computing item response probabilities with the Pi function.
The six polytomous IRT models that are supported are:
the Graded Response Model (GRM; Samejima, 1969);
the Modified Graded Response Model (MGRM; Muraki, 1990);
the Partial Credit Model (PCM; Masters, 1982);
the Generalized Partial Credit Model (GPCM; Muraki, 1992);
the Rating Scale Model (RSM; Andrich, 1978);
the Nominal Response Model (NRM; Bock, 1972).
Each model is specified through the model argument, with its acronym surrounded by double quotes (e.g. "GRM" for GRM, "PCM" for PCM, etc.). The default value is "GRM".
For any item j, set (0, ..., g_j) as the g_j+1 possible response categories. The number of response categories can differ across items under the GRM, PCM, GPCM and NRM, but it is necessarily equal across items under the MGRM and RSM. In the latter case, set g+1 as the (same) number of response categories for all items. For the other four models, it is also possible to require all items to have the same number of response categories, by setting the same.nrCat argument to TRUE.
In case of GRM, PCM, GPCM or NRM with same.nrCat being FALSE, the number of response categories g_j+1 per item is drawn from a Poisson distribution with parameter nrCat, and this number is restricted to the interval [2, nrCat]. This ensures at least two and at most nrCat response categories per item. In all other cases, each g_j+1 is fixed to g+1 = nrCat.
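The draw described above can be sketched in base R. This is only an illustration: the helper name draw_n_categories is hypothetical, and the restriction to [2, nrCat] is implemented here by clamping, which is an assumption about how the restriction is applied and may differ from the package internals.

```r
# Hypothetical helper (not part of catR): number of response categories per item.
draw_n_categories <- function(items, nrCat, seed = 1) {
  set.seed(seed)
  # Poisson draw with parameter nrCat, one value per item
  raw <- rpois(items, lambda = nrCat)
  # Assumption: values outside [2, nrCat] are clamped to the nearest bound
  pmin(pmax(raw, 2), nrCat)
}

n_cat <- draw_n_categories(items = 10, nrCat = 4)
table(n_cat)
```

Every value returned lies between 2 and nrCat, so each item has at least two response categories.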
Denote further P_{jk}(\theta) as the probability of answering response category k \in \{0, ..., g_j\} of item j. For GRM and MGRM, response probabilities P_{jk}(\theta) are defined through cumulative probabilities, while for PCM, GPCM, RSM and NRM they are directly computed.
For GRM and MGRM, set P_{jk}^*(\theta) as the (cumulative) probability of answering response category k or "above", that is P_{jk}^*(\theta) = Pr(X_j \geq k | \theta), where X_j is the item response. It follows that for any \theta, P_{j0}^*(\theta) = 1 and P_{jk}^*(\theta) = 0 when k>g_j. Furthermore, response category probabilities are recovered through the relationship P_{jk}(\theta)= P_{jk}^*(\theta)-P_{j,k+1}^*(\theta). Then, the GRM is defined by (Samejima, 1969)
P_{jk}^*(\theta)=\frac{\exp\,[\alpha_j\,(\theta-\beta_{jk})]}{1+\exp\,[\alpha_j\,(\theta-\beta_{jk})]}
and the MGRM by (Muraki, 1990)
P_{jk}^*(\theta)=\frac{\exp\,[\alpha_j\,(\theta-b_j+c_k)]}{1+\exp\,[\alpha_j\,(\theta-b_j+c_k)]}.
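As a minimal sketch of how category probabilities follow from the cumulative ones, the base R function below evaluates the GRM for a single item. The name grm_probs and the parameter values are hypothetical and chosen only for illustration; this is not the implementation used by the Pi function.

```r
# Hypothetical helper: GRM category probabilities for one item.
# alpha: discrimination; beta: increasing thresholds beta_{j1}, ..., beta_{j,g_j}.
grm_probs <- function(theta, alpha, beta) {
  # Cumulative probabilities P*_{jk} for k = 1, ..., g_j (logistic function)
  p_star <- plogis(alpha * (theta - beta))
  # Add the boundary values P*_{j0} = 1 and P*_{j,g_j+1} = 0
  p_star <- c(1, p_star, 0)
  # P_{jk} = P*_{jk} - P*_{j,k+1}
  probs <- -diff(p_star)
  names(probs) <- seq_along(probs) - 1
  probs
}

p <- grm_probs(theta = 0, alpha = 1.2, beta = c(-1, 0, 1.5))
p
```

The same function covers the MGRM by passing beta = b_j - c_k, since \alpha_j (\theta - b_j + c_k) = \alpha_j [\theta - (b_j - c_k)]; with decreasing c_k these thresholds are increasing, as required.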
The PCM, GPCM, RSM and NRM are defined as "divide-by-total" models (Embretson and Reise, 2000). The PCM has the following response category probability (Masters, 1982):
P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k (\theta-\delta_{jt})}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r (\theta-\delta_{jt})}\quad \mbox{with} \quad \sum_{t=0}^0 (\theta-\delta_{jt})=0.
The GPCM has the following response category probability (Muraki, 1992):
P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k \alpha_j\,(\theta-\delta_{jt})}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r \alpha_j\,(\theta-\delta_{jt})}\quad \mbox{with} \quad \sum_{t=0}^0 \alpha_j\,(\theta-\delta_{jt})=0.
The RSM has the following response category probability (Andrich, 1978):
P_{jk}(\theta)=\frac{\exp\,\sum_{t=0}^k [\theta-(\lambda_j+\delta_t)]}{\sum_{r=0}^{g_j}\,\exp\, \sum_{t=0}^r [\theta-(\lambda_j+\delta_t)]}\quad \mbox{with} \quad \sum_{t=0}^0 [\theta-(\lambda_j+\delta_t)]=0.
Finally, the NRM has the following response category probability (Bock, 1972):
P_{jk}(\theta)=\frac{\exp (\alpha_{jk}\,\theta+c_{jk})}{\sum_{r=0}^{g_j} \exp (\alpha_{jr}\,\theta+c_{jr})}\quad \mbox{with} \quad \alpha_{j0}\,\theta+c_{j0}=0.
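The "divide-by-total" structure can be illustrated with two short base R functions, one for the GPCM (which reduces to the PCM when alpha = 1) and one for the NRM. The function names gpcm_probs and nrm_probs, the argument names and the parameter values are hypothetical; this is a sketch of the formulas above, not the code used by the package.

```r
# Hypothetical helper: GPCM category probabilities for one item.
# delta holds delta_{j1}, ..., delta_{j,g_j}; alpha = 1 gives the PCM.
gpcm_probs <- function(theta, delta, alpha = 1) {
  # Exponent is 0 for category 0, then cumulative sums of alpha * (theta - delta_t)
  z <- c(0, cumsum(alpha * (theta - delta)))
  ez <- exp(z - max(z))  # subtracting the maximum avoids numerical overflow
  ez / sum(ez)
}

# Hypothetical helper: NRM category probabilities for one item.
# slope and intercept hold alpha_{jk} and c_{jk} for k = 1, ..., g_j;
# the exponent of category 0 is fixed to 0.
nrm_probs <- function(theta, slope, intercept) {
  z <- c(0, slope * theta + intercept)
  ez <- exp(z - max(z))
  ez / sum(ez)
}

p_pcm  <- gpcm_probs(theta = 0.5, delta = c(-0.5, 0.3, 1))
p_gpcm <- gpcm_probs(theta = 0.5, delta = c(-0.5, 0.3, 1), alpha = 1.1)
p_nrm  <- nrm_probs(theta = 0.5, slope = c(0.9, 1.1), intercept = c(0.2, -0.4))
```

The RSM is obtained from the first function by passing delta = lambda_j + (delta_1, ..., delta_g), since its exponents are sums of \theta - (\lambda_j + \delta_t).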
The following parent distributions are used to generate the different item parameters. The \alpha_j parameters of GRM, MGRM and GPCM, as well as the \alpha_{jk} parameters of the NRM, are drawn from a log-normal distribution with mean 0 and standard deviation 0.1225. All other parameters are drawn from a standard normal distribution. Moreover, the \beta_{jk} parameters of the GRM and the c_k parameters of the MGRM are sorted respectively in increasing and decreasing order of k, to ensure that the cumulative probabilities P_{jk}^*(\theta) decrease with k.
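The parent distributions above can be mimicked in base R for a small GRM bank. This is a sketch under stated assumptions: the mean 0 and standard deviation 0.1225 are taken to be on the log scale (meanlog and sdlog in rlnorm), all items have the same number of categories, and the object and column names are illustrative rather than those produced by genPolyMatrix.

```r
set.seed(1)
n_items <- 5  # illustrative bank size
g <- 3        # thresholds per item, i.e. g + 1 = 4 response categories

# Assumption: log-normal parameters are given on the log scale
alpha <- rlnorm(n_items, meanlog = 0, sdlog = 0.1225)

# Standard normal thresholds, sorted in increasing order within each item (row)
beta <- t(apply(matrix(rnorm(n_items * g), nrow = n_items), 1, sort))

bank <- cbind(alpha, beta)
colnames(bank) <- c("alpha", paste0("beta", seq_len(g)))  # illustrative names
bank
```

Sorting each row of thresholds is what guarantees that the cumulative probabilities of an item decrease from one category to the next.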
The output is a matrix with one row per item and as many columns as required to hold all item parameters. When an item has fewer response categories than the maximum, the parameters of its missing categories are replaced by NA values. Column names refer to the corresponding model parameters. See Value for the column layout under each model and Examples for illustrations.
Finally, the output matrix can contain an additional column with the names of the subgroups to be used for content balancing purposes. To do so, the argument cbControl (whose default value is NULL) must contain a list of two elements: (a) the names element, with the names of the subgroups, and (b) the props element, with the proportions of items per subgroup (of the same length as the names element, with only positive numbers but not necessarily summing to one). The cbControl argument is similar to the one in the nextItem and randomCAT functions to control for content balancing. The output matrix then contains an additional column, with the names of the subgroups randomly allocated to each item by using random multinomial draws with the probabilities given by cbControl$props.
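The random allocation of items to subgroups can be sketched with base R's sample function, which rescales its prob argument internally, so the weights need not sum to one. The weights below are illustrative, and drawing one subgroup per item in this way is an assumption about the mechanism rather than a copy of the package code.

```r
set.seed(1)
cbList <- list(names = c("Audio1", "Audio2", "Written1", "Written2", "Written3"),
               props = c(1, 2, 2, 2, 3))  # positive weights, not summing to one

# One categorical draw per item, with probabilities proportional to props
group <- sample(cbList$names, size = 100, replace = TRUE, prob = cbList$props)
table(group)
```

Because the allocation is random, the observed subgroup proportions only approximate the requested ones, especially in small item banks.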
Value
A matrix with items rows and as many columns as required for the considered IRT model:
\max_j \,g_j+1 columns, holding parameters (\alpha_j, \beta_{j1}, ..., \beta_{j,g_j}) if model is "GRM";
g+2 columns, holding parameters (\alpha_j, b_j, c_1, ..., c_g) if model is "MGRM";
\max_j \,g_j columns, holding parameters (\delta_{j1}, ..., \delta_{j,g_j}) if model is "PCM";
\max_j \,g_j+1 columns, holding parameters (\alpha_j, \delta_{j1}, ..., \delta_{j,g_j}) if model is "GPCM";
g+1 columns, holding parameters (\lambda_j, \delta_1, ..., \delta_g) if model is "RSM";
2\,\max_j\, g_j columns, holding parameters (\alpha_{j1}, c_{j1}, \alpha_{j2}, c_{j2}, ..., \alpha_{j,g_j}, c_{j, g_j}) if model is "NRM".
If cbControl is not NULL, the output matrix contains an additional column holding the subgroup membership of each item.
Author(s)
David Magis
Department of Psychology, University of Liege, Belgium
david.magis@uliege.be
References
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573. doi: 10.1007/BF02293814
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51. doi: 10.1007/BF02291411
Embretson, S. E., and Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Magis, D., and Barrada, J. R. (2017). Computerized Adaptive Testing with R: Recent Updates of the Package catR. Journal of Statistical Software, Code Snippets, 76(1), 1-18. doi: 10.18637/jss.v076.c01
Magis, D., and Raiche, G. (2012). Random Generation of Response Patterns under Computerized Adaptive Testing with the R Package catR. Journal of Statistical Software, 48(8), 1-31. doi: 10.18637/jss.v048.i08
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174. doi: 10.1007/BF02296272
Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71. doi: 10.1177/014662169001400106
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176. doi: 10.1177/014662169201600206
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph (vol. 17).
See Also
Examples
# All generated item banks have 10 items and at most four response categories
# GRM
genPolyMatrix(10, 4, model = "GRM")
# GRM with same number of response categories
genPolyMatrix(10, 4, model = "GRM", same.nrCat = TRUE)
# MGRM
genPolyMatrix(10, 4, model = "MGRM")
# MGRM with same number of response categories
genPolyMatrix(10, 4, model = "MGRM", same.nrCat = TRUE) # same result
# PCM
genPolyMatrix(10, 4, model = "PCM")
# PCM with same number of response categories
genPolyMatrix(10, 4, model = "PCM", same.nrCat = TRUE)
# GPCM
genPolyMatrix(10, 4, model = "GPCM")
# GPCM with same number of response categories
genPolyMatrix(10, 4, model = "GPCM", same.nrCat = TRUE)
# RSM
genPolyMatrix(10, 4, model = "RSM")
# RSM with same number of response categories
genPolyMatrix(10, 4, model = "RSM", same.nrCat = TRUE) # same result
# NRM
genPolyMatrix(10, 4, model = "NRM")
# NRM with same number of response categories
genPolyMatrix(10, 4, model = "NRM", same.nrCat = TRUE)
## Content balancing
# Creation of the 'cbList' list with arbitrary proportions
cbList <- list(names = c("Audio1", "Audio2", "Written1", "Written2", "Written3"),
props = c(0.1, 0.2, 0.2, 0.2, 0.3))
# NRM with 100 items
genPolyMatrix(100, 4, model = "NRM", cbControl = cbList)