est_irt {irtQ} | R Documentation |
Item parameter estimation using MMLE-EM algorithm
Description
This function fits unidimensional item response theory (IRT) models to a mixture of dichotomous and polytomous data using marginal maximum likelihood estimation via the expectation-maximization (MMLE-EM) algorithm (Bock & Aitkin, 1981). This function also implements fixed item parameter calibration (FIPC; Kim, 2006). Like Method A (Stocking, 1988), FIPC is one of the useful online item calibration methods for computerized adaptive testing (CAT): it places the parameter estimates of pretest items on the same scale as the operational item parameter estimates (Ban, Hanson, Wang, Yi, & Harris, 2001). For dichotomous items, the one-, two-, and three-parameter logistic IRT models are available. For polytomous items, the graded response model (GRM) and the (generalized) partial credit model (GPCM) are available.
Usage
est_irt(
x = NULL,
data,
D = 1,
model = NULL,
cats = NULL,
item.id = NULL,
fix.a.1pl = FALSE,
fix.a.gpcm = FALSE,
fix.g = FALSE,
a.val.1pl = 1,
a.val.gpcm = 1,
g.val = 0.2,
use.aprior = FALSE,
use.bprior = FALSE,
use.gprior = TRUE,
aprior = list(dist = "lnorm", params = c(0, 0.5)),
bprior = list(dist = "norm", params = c(0, 1)),
gprior = list(dist = "beta", params = c(5, 16)),
missing = NA,
Quadrature = c(49, 6),
weights = NULL,
group.mean = 0,
group.var = 1,
EmpHist = FALSE,
use.startval = FALSE,
Etol = 1e-04,
MaxE = 500,
control = list(iter.max = 200),
fipc = FALSE,
fipc.method = "MEM",
fix.loc = NULL,
fix.id = NULL,
se = TRUE,
verbose = TRUE
)
Arguments
x |
A data frame containing the item metadata. This metadata is necessary to obtain the information of
each item (i.e., number of score categories and IRT model) to be calibrated. You can easily create an empty
item metadata using the function |
data |
A matrix containing examinees' response data for the items in the argument |
D |
A scaling factor in IRT models to make the logistic function as close as possible to the normal ogive function (if set to 1.7). Default is 1. |
model |
A vector of character strings indicating what IRT model is used to calibrate each item. Available IRT models are
"1PLM", "2PLM", "3PLM", and "DRM" for dichotomous items, and "GRM" and "GPCM" for polytomous items. "GRM" and "GPCM" represent the graded
response model and (generalized) partial credit model, respectively. Note that "DRM" is considered as "3PLM" in this function.
If a single character of the IRT model is specified, that model will be recycled across all items. The provided information in the |
cats |
A numeric vector specifying the number of score categories for each item. For example, a dichotomous
item has two score categories. If a single numeric value is specified, that value will be recycled across all items. If |
item.id |
A character vector of item IDs. If NULL, the item IDs are generated automatically. When |
fix.a.1pl |
A logical value. If TRUE, the slope parameters of the 1PLM items are fixed to a specific value specified in the argument
|
fix.a.gpcm |
A logical value. If TRUE, the GPCM items are calibrated with the partial credit model and the slope parameters of
the GPCM items are fixed to a specific value specified in the argument |
fix.g |
A logical value. If TRUE, the guessing parameters of the 3PLM items are fixed to a specific value specified in the argument
|
a.val.1pl |
A numeric value. This value is used to fix the slope parameters of the 1PLM items. |
a.val.gpcm |
A numeric value. This value is used to fix the slope parameters of the GPCM items. |
g.val |
A numeric value. This value is used to fix the guessing parameters of the 3PLM items. |
use.aprior |
A logical value. If TRUE, a prior distribution for the slope parameters is used for the parameter calibration across all items. Default is FALSE. |
use.bprior |
A logical value. If TRUE, a prior distribution for the difficulty (or threshold) parameters is used for the parameter calibration across all items. Default is FALSE. |
use.gprior |
A logical value. If TRUE, a prior distribution for the guessing parameters is used for the parameter calibration across all 3PLM items. Default is TRUE. |
aprior |
A list containing the information of the prior distribution for item slope parameters. Three probability distributions
of Beta, Log-normal, and Normal distributions are available. In the list, a character string of the distribution name must be specified
in the first internal argument and a vector of two numeric values for the two parameters of the distribution must be specified in the
second internal argument. Specifically, when Beta distribution is used, "beta" should be specified in the first argument. When Log-normal
distribution is used, "lnorm" should be specified in the first argument. When Normal distribution is used, "norm" should be specified
in the first argument. In terms of the two parameters of the three distributions, see |
bprior |
A list containing the information of the prior distribution for item difficulty (or threshold) parameters. Three probability distributions
of Beta, Log-normal, and Normal distributions are available. In the list, a character string of the distribution name must be specified
in the first internal argument and a vector of two numeric values for the two parameters of the distribution must be specified in the
second internal argument. Specifically, when Beta distribution is used, "beta" should be specified in the first argument. When Log-normal
distribution is used, "lnorm" should be specified in the first argument. When Normal distribution is used, "norm" should be specified
in the first argument. In terms of the two parameters of the three distributions, see |
gprior |
A list containing the information of the prior distribution for item guessing parameters. Three probability distributions
of Beta, Log-normal, and Normal distributions are available. In the list, a character string of the distribution name must be specified
in the first internal argument and a vector of two numeric values for the two parameters of the distribution must be specified in the
second internal argument. Specifically, when Beta distribution is used, "beta" should be specified in the first argument. When Log-normal
distribution is used, "lnorm" should be specified in the first argument. When Normal distribution is used, "norm" should be specified
in the first argument. In terms of the two parameters of the three distributions, see |
missing |
A value indicating missing values in the response data set. Default is NA. |
Quadrature |
A numeric vector of two components specifying the number of quadrature points (in the first component) and the symmetric minimum and maximum values of these points (in the second component). For example, a vector of c(49, 6) indicates 49 rectangular quadrature points from -6 to 6. The quadrature points are used in the E step of the EM algorithm. Default is c(49, 6). |
weights |
A two-column matrix or data frame containing the quadrature points (in the first column) and the corresponding weights
(in the second column) of the latent variable prior distribution. If not NULL, the scale of the latent ability distribution will be fixed
to the scale of the provided quadrature points and weights. The weights and quadrature points can be easily obtained
using the function |
group.mean |
A numeric value to set the mean of latent variable prior distribution when |
group.var |
A positive numeric value to set the variance of latent variable prior distribution when |
EmpHist |
A logical value. If TRUE, the empirical histogram of the latent variable prior distribution is simultaneously estimated with the item parameters using Woods's (2007) approach. The items are calibrated against the estimated empirical prior distributions. See below for details. |
use.startval |
A logical value. If TRUE, the item parameters provided in the item metadata (i.e., the argument |
Etol |
A positive numeric value. This value sets the convergence criterion for E steps of the EM algorithm. Default is 1e-4. |
MaxE |
A positive integer value. This value determines the maximum number of the E steps in the EM algorithm. Default is 500. |
control |
A list of control parameters to be passed to the optimization function of |
fipc |
A logical value. If TRUE, FIPC is implemented for item parameter estimation. When |
fipc.method |
A character string specifying the FIPC method. Available methods include "OEM" for "No Prior Weights Updating and One EM Cycle
(NWU-OEM; Wainer & Mislevy, 1990)" and "MEM" for "Multiple Prior Weights Updating and Multiple EM Cycles (MWU-MEM; Kim, 2006)."
When |
fix.loc |
A vector of positive integer values specifying the locations of the items to be fixed in the item metadata (i.e., |
fix.id |
A vector of character strings specifying IDs of the items to be fixed when the FIPC is implemented (i.e., |
se |
A logical value. If FALSE, the standard errors of the item parameter estimates are not computed. Default is TRUE. |
verbose |
A logical value. If FALSE, all progress messages including the process information on the EM algorithm are suppressed. Default is TRUE. |
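The effect of the scaling factor D described in the arguments above can be checked numerically. The following base R sketch (it does not call any irtQ function) compares the logistic function with D = 1 and D = 1.7 against the normal ogive on a grid of theta values; the grid range and the variable names are choices made only for this illustration.

```r
# Compare the logistic function with D = 1 and D = 1.7 against the
# normal ogive (standard normal CDF) on a grid of theta values.
theta <- seq(-4, 4, by = 0.01)
ogive <- pnorm(theta)
logistic_d1  <- plogis(1.0 * theta)  # D = 1, the default of est_irt
logistic_d17 <- plogis(1.7 * theta)  # D = 1.7

# Largest absolute gap between each logistic curve and the normal ogive
max_gap_d1  <- max(abs(logistic_d1 - ogive))
max_gap_d17 <- max(abs(logistic_d17 - ogive))
print(c(D1 = max_gap_d1, D1.7 = max_gap_d17))
```

With D = 1.7 the largest gap stays below 0.01 over the whole grid, which is why this value is commonly used when estimates should be comparable to normal ogive results.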
Details
A specific form of a data frame should be used for the argument x. The first column should have item IDs,
the second column should contain the number of unique score categories for each item, and the third column should include the IRT models being fit to the items.
The available IRT models are "1PLM", "2PLM", "3PLM", and "DRM" for dichotomous item data, and "GRM" and "GPCM" for polytomous item data.
Note that "DRM" covers all dichotomous IRT models (i.e., "1PLM", "2PLM", and "3PLM") and "GRM" and "GPCM" represent the graded
response model and (generalized) partial credit model, respectively. The next columns should include the item parameters of the fitted IRT models.
For dichotomous items, the fourth, fifth, and sixth columns represent the item discrimination (or slope), item difficulty, and
item guessing parameters, respectively. When "1PLM" or "2PLM" is specified in the third column, NAs should be inserted in the sixth column
for the item guessing parameters. For polytomous items, the item discrimination (or slope) parameters should be included in the
fourth column and the item difficulty (or threshold) parameters of category boundaries should be contained in the fifth through the last columns.
When the number of unique score categories differs between items, the empty cells of item parameters should be filled with NAs.
In the irtQ package, the item difficulty (or threshold) parameters of category boundaries for GPCM are expressed as
the item location (or overall difficulty) parameter minus the threshold parameter for each unique score category of the item.
Note that when a GPCM item has K unique score categories, K-1 item difficulty parameters are necessary because
the item difficulty parameter for the first category boundary is always 0. For example, if a GPCM item has five score categories,
four item difficulty parameters should be specified. An example of a data frame with a single-format test is as follows:
ITEM1 | 2 | 1PLM | 1.000 | 1.461 | NA |
ITEM2 | 2 | 2PLM | 1.921 | -1.049 | NA |
ITEM3 | 2 | 3PLM | 1.736 | 1.501 | 0.203 |
ITEM4 | 2 | 3PLM | 0.835 | -1.049 | 0.182 |
ITEM5 | 2 | DRM | 0.926 | 0.394 | 0.099 |
And an example of a data frame for a mixed-format test is as follows:
ITEM1 | 2 | 1PLM | 1.000 | 1.461 | NA | NA | NA |
ITEM2 | 2 | 2PLM | 1.921 | -1.049 | NA | NA | NA |
ITEM3 | 2 | 3PLM | 0.926 | 0.394 | 0.099 | NA | NA |
ITEM4 | 2 | DRM | 1.052 | -0.407 | 0.201 | NA | NA |
ITEM5 | 4 | GRM | 1.913 | -1.869 | -1.238 | -0.714 | NA |
ITEM6 | 5 | GRM | 1.278 | -0.724 | -0.068 | 0.568 | 1.072 |
ITEM7 | 4 | GPCM | 1.137 | -0.374 | 0.215 | 0.848 | NA |
ITEM8 | 5 | GPCM | 1.233 | -2.078 | -1.347 | -0.705 | -0.116 |
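The layout described above can also be built by hand in base R. The sketch below reproduces four rows of the mixed-format example (ITEM1, ITEM3, ITEM5, and ITEM7). The column names id, cats, and model follow the usage in the Examples section; the parameter column names par.1 to par.4 are an assumption made for this illustration, since only the column positions are specified above.

```r
# Item metadata for a small mixed-format test, built by hand.
# Columns 1-3: item ID, number of score categories, IRT model.
# Columns 4+: item parameters, padded with NA where a model has fewer parameters.
x_meta <- data.frame(
  id    = c("ITEM1", "ITEM3", "ITEM5", "ITEM7"),
  cats  = c(2, 2, 4, 4),
  model = c("1PLM", "3PLM", "GRM", "GPCM"),
  par.1 = c(1.000, 0.926, 1.913, 1.137),    # slope (assumed column name)
  par.2 = c(1.461, 0.394, -1.869, -0.374),  # difficulty or first threshold
  par.3 = c(NA, 0.099, -1.238, 0.215),      # guessing or second threshold
  par.4 = c(NA, NA, -0.714, 0.848),         # third threshold (polytomous only)
  stringsAsFactors = FALSE
)
print(x_meta)
```

Because the largest item here has four score categories, three threshold columns are enough; a five-category item would require one more parameter column, as in the full table above.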
See the IRT Models section on the irtQ-package page for more details about the IRT models used in the irtQ package.
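To make the GPCM parameterization described above concrete, here is a minimal base R sketch of the category probabilities for a single GPCM item. It assumes the common form in which the cumulative term for the first category is fixed at 0, so that K-1 difficulty parameters define K categories. The function name gpcm_prob is hypothetical and is not part of irtQ.

```r
# Category probabilities for one GPCM item at a single theta value.
# a: slope; b: the K - 1 step difficulty parameters (location minus threshold)
gpcm_prob <- function(theta, a, b, D = 1) {
  # The first cumulative term is fixed at 0, so only K - 1 parameters are needed
  z <- c(0, cumsum(D * a * (theta - b)))
  ez <- exp(z - max(z))  # subtract the maximum for numerical stability
  ez / sum(ez)
}

# ITEM7 from the mixed-format table above: four score categories
p <- gpcm_prob(theta = 0, a = 1.137, b = c(-0.374, 0.215, 0.848))
print(round(p, 3))
```

The log-odds of adjacent categories equal D * a * (theta - b) for the corresponding step, which is a quick way to check that a set of parameters is being interpreted as intended.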
An easier way to create a data frame for the argument x is to use the function shape_df.
To fit the IRT models to data, the IRT model and the number of score categories for the estimated items must be provided as well as
the item response data. There are two ways to provide the IRT model and score category information. The first way is to provide the item metadata
to the argument x. As explained above, the item metadata can be easily created with the function shape_df. The second way is
to specify the IRT models and the score category information in the arguments model and cats. Thus, if x = NULL, the
information specified in model and cats is used.
To implement FIPC, however, the item metadata must be provided in the argument x. This is because the item parameters of the fixed items
in the item metadata are used to estimate the characteristics of the underlying latent variable prior distribution when calibrating the remaining, freely estimated items.
More specifically, the underlying latent variable prior distribution of the fixed items is estimated during the calibration of the freely estimated items
to put the item parameters of the freely estimated items on the scale of the fixed item parameters (Kim, 2006).
In terms of approaches for FIPC, Kim (2006) described five different methods. Among them, two methods are available in the
function est_irt. The first method is "NWU-OEM", which uses just one E step in the EM algorithm, involving data from only the fixed items, and
just one M step, involving data from only the non-fixed items. This method was suggested by Wainer and Mislevy (1990) in the context of online calibration. It
can be implemented by setting fipc.method = "OEM". The second method is "MWU-MEM", which iteratively updates the latent variable prior distribution and
finds the parameter estimates of the non-fixed items. In this method, the same procedure as in the NWU-OEM method is applied to the first EM cycle. From the second
EM cycle on, both the parameters of the non-fixed items and the weights of the prior distribution are concurrently updated. This method can be implemented by
setting fipc.method = "MEM". See Kim (2006) for more details.
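The E step on the fixed items can be illustrated with a simplified base R sketch. It is not the implementation used by est_irt: it assumes three hypothetical fixed 2PL items with D = 1, simulated responses, and one common way of updating the prior weights (averaging the examinees' posterior densities at each quadrature point). All object names are made up for the illustration.

```r
# Simplified E step using only fixed 2PL items (hypothetical parameters).
set.seed(1)
a_fix <- c(1.2, 0.8, 1.5)   # fixed slopes
b_fix <- c(-0.5, 0.0, 0.7)  # fixed difficulties
theta_true <- rnorm(500, mean = 0.5, sd = 1)

# Simulate 0/1 responses of 500 examinees to the fixed items
p_true <- plogis(sweep(outer(theta_true, b_fix, "-"), 2, a_fix, "*"))
resp <- (matrix(runif(length(p_true)), nrow = 500) < p_true) * 1

# Quadrature points and initial N(0, 1) prior weights
nodes <- seq(-6, 6, length.out = 49)
w0 <- dnorm(nodes)
w0 <- w0 / sum(w0)

# Likelihood of each response pattern at each quadrature point
p_node <- plogis(sweep(outer(nodes, b_fix, "-"), 2, a_fix, "*"))       # 49 x 3
loglik <- resp %*% t(log(p_node)) + (1 - resp) %*% t(log(1 - p_node))  # 500 x 49

# Normalized posterior density of each examinee over the quadrature points
post <- sweep(exp(loglik), 2, w0, "*")
post <- post / rowSums(post)

# Prior weights update: average posterior mass at each quadrature point
w1 <- colMeans(post)
prior_mean <- sum(nodes * w1)
print(prior_mean)
```

In the NWU-OEM method the posterior densities from this single E step would feed one M step for the non-fixed items without any weights update, whereas the MWU-MEM method repeats both the weights update and the M step over multiple EM cycles.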
When fipc = TRUE, the information on which items are fixed needs to be provided via either fix.loc or fix.id. For example, suppose that
five items whose IDs are CMC1, CMC2, CMC3, CMC4, and CMC5 should be fixed and that all item IDs are provided in x or item.id. Also, the five items are
located in the 1st through 5th rows of the item metadata (i.e., x). Then the item parameters of the five items can be fixed by setting
fix.loc = c(1, 2, 3, 4, 5) or fix.id = c("CMC1", "CMC2", "CMC3", "CMC4", "CMC5"). Note that if both arguments are not NULL, the information
provided in the fix.loc argument is ignored.
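The precedence rule stated above can be sketched in a few lines of base R. The helper resolve_fixed and the extra item IDs FR1 to FR3 are hypothetical; the sketch only mirrors the documented behavior (fix.id takes priority over fix.loc) and is not taken from the package source.

```r
# Resolve which metadata rows are fixed, following the rule stated above:
# when both arguments are supplied, fix.id wins and fix.loc is ignored.
resolve_fixed <- function(item_ids, fix.loc = NULL, fix.id = NULL) {
  if (!is.null(fix.id)) {
    loc <- match(fix.id, item_ids)
    if (anyNA(loc)) stop("some fix.id values are not among the item IDs")
    return(loc)
  }
  fix.loc
}

ids <- c(paste0("CMC", 1:5), paste0("FR", 1:3))  # hypothetical item IDs

# Both calls select the first five rows of the metadata
loc_by_position <- resolve_fixed(ids, fix.loc = 1:5)
loc_by_id <- resolve_fixed(ids, fix.id = c("CMC1", "CMC2", "CMC3", "CMC4", "CMC5"))

# Both arguments supplied: the IDs decide and fix.loc = 6:8 is ignored
loc_both <- resolve_fixed(ids, fix.loc = 6:8, fix.id = c("CMC2", "CMC4"))
print(list(loc_by_position, loc_by_id, loc_both))
```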
When EmpHist = TRUE, the empirical histogram (i.e., densities at the quadrature points) of the latent variable prior distribution is estimated simultaneously
with the item parameters. If fipc = TRUE given EmpHist = TRUE, the scale parameters (e.g., mean and variance) of the empirical prior distribution
are estimated as well. If fipc = FALSE given EmpHist = TRUE, the scale parameters of the empirical prior distribution are fixed to the values specified
in the arguments group.mean and group.var. When EmpHist = FALSE, the normal prior distribution is used during the item parameter estimation.
If fipc = TRUE given EmpHist = FALSE, the scale parameters of the normal prior distribution are estimated as well as the item parameters.
If fipc = FALSE given EmpHist = FALSE, the scale parameters of the normal prior distribution are fixed to the values specified in the arguments
group.mean and group.var.
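As a rough illustration of the normal prior case, the following base R sketch builds normalized weights on the default quadrature grid (49 points from -6 to 6) for a prior with the default group.mean = 0 and group.var = 1, and then recovers the mean and variance from the weights. It is an assumption-laden sketch of the idea rather than the package's internal code; the column names theta and weight follow those used in the Examples section.

```r
# Normal prior weights on the default quadrature grid
quad <- c(49, 6)   # number of points and symmetric bound, as in Quadrature
group.mean <- 0
group.var <- 1

nodes <- seq(-quad[2], quad[2], length.out = quad[1])
w <- dnorm(nodes, mean = group.mean, sd = sqrt(group.var))
w <- w / sum(w)    # normalize so that the weights sum to 1
prior <- data.frame(theta = nodes, weight = w)

# Mean and variance implied by the discrete prior
m <- sum(prior$theta * prior$weight)
v <- sum((prior$theta - m)^2 * prior$weight)
print(c(mean = m, var = v))

# Same display as used for the estimated histograms in the Examples
plot(prior$weight ~ prior$theta, type = "h")
```

When the prior is estimated instead (EmpHist = TRUE or fipc = TRUE), the same two summaries computed from the updated weights describe the scale of the estimated latent distribution.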
Value
This function returns an object of class est_irt. Within this object, several internal objects are contained such as:
estimates |
A data frame containing both the item parameter estimates and the corresponding standard errors of estimates. |
par.est |
A data frame containing the item parameter estimates. |
se.est |
A data frame containing the standard errors of the item parameter estimates. Note that the standard errors are estimated using the cross-product approximation method (Meilijson, 1989). |
pos.par |
A data frame containing the position number of each item parameter being estimated. The position information is useful when interpreting the variance-covariance matrix of item parameter estimates. |
covariance |
The variance-covariance matrix of the item parameter estimates. |
loglikelihood |
A sum of the log-likelihood values of the observed data set (marginal log-likelihood) across all items in the data set |
aic |
A model fit statistic of Akaike information criterion based on the loglikelihood. |
bic |
A model fit statistic of Bayesian information criterion based on the loglikelihood. |
group.par |
A data frame containing the mean, variance, and standard deviation of latent variable prior distribution. |
weights |
A two-column data frame containing the quadrature points (in the first column) and the corresponding weights (in the second column) of the (updated) latent variable prior distribution. |
posterior.dist |
A matrix of normalized posterior densities for all the response patterns at each of the quadrature points. The row and column indicate each individual's response pattern and the quadrature point, respectively. |
data |
A data.frame of the examinees' response data set. |
scale.D |
A scaling factor in IRT models. |
ncase |
A total number of response patterns. |
nitem |
A total number of items included in the response data. |
Etol |
The convergence criterion for E steps of the EM algorithm. |
MaxE |
The maximum number of E steps in the EM algorithm. |
aprior |
A list containing the information of the prior distribution for item slope parameters. |
gprior |
A list containing the information of the prior distribution for item guessing parameters. |
npar.est |
A total number of the estimated parameters. |
niter |
The number of EM cycles completed. |
maxpar.diff |
A maximum item parameter change when the EM cycles were completed. |
EMtime |
Time (in seconds) spent for the EM cycles. |
SEtime |
Time (in seconds) spent for computing the standard errors of the item parameter estimates. |
TotalTime |
Time (in seconds) spent for total computation. |
test.1 |
Status of the first-order test to report if the gradients have vanished sufficiently for the solution to be stable. |
test.2 |
Status of the second-order test to report if the information matrix is positive definite, which is a prerequisite for the solution to be a possible maximum. |
var.note |
A note to report if the variance-covariance matrix of item parameter estimates is obtainable from the information matrix. |
fipc |
A logical value to indicate if FIPC was used. |
fipc.method |
A method used for the FIPC. |
fix.loc |
A vector of integer values specifying the locations of the fixed items when the FIPC was implemented. |
The internal objects can be easily extracted using the function getirt.
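The aic and bic components are described above only as being based on the loglikelihood. As a reminder of the standard definitions, the base R sketch below computes both criteria from hypothetical values. Whether est_irt uses exactly these inputs (in particular ncase as the sample size in BIC) is an assumption of this sketch, not something stated on this page.

```r
# Standard AIC and BIC definitions applied to hypothetical values
loglik <- -2466.65  # hypothetical marginal log-likelihood
npar   <- 10        # hypothetical number of estimated parameters (cf. npar.est)
ncase  <- 1000      # hypothetical sample size (cf. ncase)

aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + npar * log(ncase)
print(c(AIC = aic, BIC = bic))
```

Lower values indicate a better trade-off between fit and complexity, so these statistics are mainly useful for comparing models fitted to the same response data.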
Author(s)
Hwanggyu Lim hglim83@gmail.com
References
Ban, J. C., Hanson, B. A., Wang, T., Yi, Q., & Harris, D. J. (2001). A comparative study of on-line pretest item calibration/scaling methods in computerized adaptive testing. Journal of Educational Measurement, 38(3), 191-212.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.
Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355-381.
Meilijson, I. (1989). A fast improvement to the EM algorithm on its own terms. Journal of the Royal Statistical Society: Series B (Methodological), 51, 127-138.
Stocking, M. L. (1988). Scale drift in on-line calibration (Research Rep. 88-28). Princeton, NJ: ETS.
Wainer, H., & Mislevy, R. J. (1990). Item response theory, item calibration, and proficiency estimation. In H. Wainer (Ed.), Computer adaptive testing: A primer (Chap. 4, pp. 65-102). Hillsdale, NJ: Lawrence Erlbaum.
Woods, C. M. (2007). Empirical histograms in item response theory with ordinal data. Educational and Psychological Measurement, 67(1), 73-87.
See Also
est_item, irtfit, info, simdat, shape_df, sx2_fit, traceline.est_irt, getirt
Examples
##------------------------------------------------------------------------------
# 1. item parameter estimation for the dichotomous item data (LSAT6)
##------------------------------------------------------------------------------
# fit the 1PL model to LSAT6 data and constrain the slope parameters to be equal
(mod.1pl.c <- est_irt(data=LSAT6, D=1, model="1PLM", cats=2, fix.a.1pl=FALSE))
# summary of the estimation
summary(mod.1pl.c)
# extract the item parameter estimates
getirt(mod.1pl.c, what="par.est")
# extract the standard error estimates
getirt(mod.1pl.c, what="se.est")
# fit the 1PL model to LSAT6 data and fix the slope parameters to 1.0
(mod.1pl.f <- est_irt(data=LSAT6, D=1, model="1PLM", cats=2, fix.a.1pl=TRUE, a.val.1pl=1))
# summary of the estimation
summary(mod.1pl.f)
# fit the 2PL model to LSAT6 data
(mod.2pl <- est_irt(data=LSAT6, D=1, model="2PLM", cats=2))
# summary of the estimation
summary(mod.2pl)
# assess the fit of the 2PL model to the LSAT6 data using S-X2 fit statistic
(sx2fit.2pl <- sx2_fit(x=mod.2pl))
# compute the item and test information at several theta points
theta <- seq(-4, 4, 0.1)
(info.2pl <- info(x=mod.2pl, theta=theta))
# draw the test characteristic curve plot
(trace.2pl <- traceline(x=mod.2pl, theta=theta))
plot(trace.2pl)
# draw the item characteristic curve for the 1st item
plot(trace.2pl, item.loc=1)
# fit the 2PL model to LSAT6 data and
# estimate the empirical histogram of latent variable prior distribution
# also use a less stringent convergence criterion for E-step
(mod.2pl.hist <- est_irt(data=LSAT6, D=1, model="2PLM", cats=2, EmpHist=TRUE, Etol=0.001))
(emphist <- getirt(mod.2pl.hist, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# fit the 3PL model to LSAT6 data and use the Beta prior distribution for
# the guessing parameters
(mod.3pl <- est_irt(data=LSAT6, D=1, model="3PLM", cats=2, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16))))
# summary of the estimation
summary(mod.3pl)
# fit the 3PL model to LSAT6 data, but fix the guessing parameters to be 0.2
(mod.3pl.f <- est_irt(data=LSAT6, D=1, model="3PLM", cats=2, fix.g=TRUE, g.val=0.2))
# summary of the estimation
summary(mod.3pl.f)
# fit the different dichotomous models to each item of LSAT6 data
# fit the constrained 1PL model to the 1st, 2nd, and 3rd items, fit the 2PL model to
# the 4th item, and fit the 3PL model to the 5th item with the Beta prior of
# the guessing parameter
(mod.drm.mix <- est_irt(data=LSAT6, D=1, model=c("1PLM", "1PLM", "1PLM", "2PLM", "3PLM"),
cats=2, fix.a.1pl=FALSE, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16))))
# summary of the estimation
summary(mod.drm.mix)
##------------------------------------------------------------------------------
# 2. item parameter estimation for the mixed-item format data (simulation data)
##------------------------------------------------------------------------------
## import the "-prm.txt" output file from flexMIRT
flex_sam <- system.file("extdata", "flexmirt_sample-prm.txt", package = "irtQ")
# select the item metadata
x <- bring.flexmirt(file=flex_sam, "par")$Group1$full_df
# modify the item metadata so that the 39th and 40th items follow GPCM
x[39:40, 3] <- "GPCM"
# generate 1,000 examinees' latent abilities from N(0, 1)
set.seed(37)
score1 <- rnorm(1000, mean=0, sd=1)
# simulate the response data
sim.dat1 <- simdat(x=x, theta=score1, D=1)
# fit the 3PL model to all dichotomous items, fit the GPCM model to 39th and 40th items,
# and fit the GRM model to the 53rd, 54th, 55th items.
# use the beta prior distribution for the guessing parameters, use the log-normal
# prior distribution for the slope parameters, and use the normal prior distribution
# for the difficulty (or threshold) parameters.
# also, specify the argument 'x' to provide the IRT model and score category information
# for items
item.meta <- shape_df(item.id=x$id, cats=x$cats, model=x$model, default.par=TRUE)
(mod.mix1 <- est_irt(x=item.meta, data=sim.dat1, D=1, use.aprior=TRUE, use.bprior=TRUE,
use.gprior=TRUE,
aprior=list(dist="lnorm", params=c(0.0, 0.5)),
bprior=list(dist="norm", params=c(0.0, 2.0)),
gprior=list(dist="beta", params=c(5, 16))))
# summary of the estimation
summary(mod.mix1)
# estimate examinees' latent scores given the item parameter estimates using the MLE
(score.mle <- est_score(x=mod.mix1, method = "ML", range = c(-4, 4), ncore=2))
# compute the traditional fit statistics
(fit.mix1 <- irtfit(x=mod.mix1, score=score.mle$est.theta, group.method="equal.width",
n.width=10, loc.theta="middle"))
# residual plots for the first item (dichotomous item)
plot(x=fit.mix1, item.loc=1, type = "both", ci.method = "wald",
show.table=TRUE, ylim.sr.adjust=TRUE)
# residual plots for the last item (polytomous item)
plot(x=fit.mix1, item.loc=55, type = "both", ci.method = "wald",
show.table=FALSE, ylim.sr.adjust=TRUE)
# fit the 2PL model to all dichotomous items, fit the GPCM model to 39th and 40th items,
# and fit the GRM model to the 53rd, 54th, 55th items.
# also, specify the arguments of 'model' and 'cats' to provide the IRT model and
# score category information for items
(mod.mix2 <- est_irt(data=sim.dat1, D=1,
model=c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
cats=c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3))))
# summary of the estimation
summary(mod.mix2)
# fit the 2PL model to all dichotomous items, fit the GPCM model to 39th and 40th items,
# fit the GRM model to the 53rd, 54th, 55th items, and estimate the empirical histogram
# of latent variable prior distribution.
# also, specify the arguments of 'model' and 'cats' to provide the IRT model and
# score category information for items
(mod.mix3 <- est_irt(data=sim.dat1, D=1,
model=c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
cats=c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3)), EmpHist=TRUE))
(emphist <- getirt(mod.mix3, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# fit the 2PL model to all dichotomous items,
# fit the PCM model to 39th and 40th items by fixing the slope parameters to 1,
# and fit the GRM model to the 53rd, 54th, 55th items.
# also, specify the arguments of 'model' and 'cats' to provide the IRT model and
# score category information for items
(mod.mix4 <- est_irt(data=sim.dat1, D=1,
model=c(rep("2PLM", 38), rep("GPCM", 2), rep("2PLM", 12), rep("GRM", 3)),
cats=c(rep(2, 38), rep(5, 2), rep(2, 12), rep(5, 3)),
fix.a.gpcm=TRUE, a.val.gpcm=1))
# summary of the estimation
summary(mod.mix4)
##------------------------------------------------------------------------------
# 3. fixed item parameter calibration (FIPC) for the mixed-item format data
# (simulation data)
##------------------------------------------------------------------------------
## import the "-prm.txt" output file from flexMIRT
flex_sam <- system.file("extdata", "flexmirt_sample-prm.txt", package = "irtQ")
# select the item metadata
x <- bring.flexmirt(file=flex_sam, "par")$Group1$full_df
# generate 1,000 examinees' latent abilities from a normal distribution with mean 0.4 and SD 1.3
set.seed(20)
score2 <- rnorm(1000, mean=0.4, sd=1.3)
# simulate the response data
sim.dat2 <- simdat(x=x, theta=score2, D=1)
# fit the 3PL model to all dichotomous items, fit the GRM model to all polytomous data,
# fix the five 3PL items (1st - 5th items) and three GRM items (53rd to 55th items)
# also, estimate the empirical histogram of latent variable
# use the MEM method.
fix.loc <- c(1:5, 53:55)
(mod.fix1 <- est_irt(x=x, data=sim.dat2, D=1, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16)), EmpHist=TRUE,
Etol=1e-3, fipc=TRUE, fipc.method="MEM", fix.loc=fix.loc))
(prior.par <- mod.fix1$group.par)
(emphist <- getirt(mod.fix1, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# summary of the estimation
summary(mod.fix1)
# or the same eight items can be fixed by providing their item IDs to the 'fix.id' argument
# in this case, set fix.loc = NULL
fix.id <- c(x$id[1:5], x$id[53:55])
(mod.fix1 <- est_irt(x=x, data=sim.dat2, D=1, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16)), EmpHist=TRUE,
Etol=1e-3, fipc=TRUE, fipc.method="MEM", fix.loc=NULL,
fix.id=fix.id))
# summary of the estimation
summary(mod.fix1)
# fit the 3PL model to all dichotomous items, fit the GRM model to all polytomous data,
# fix the five 3PL items (1st - 5th items) and three GRM items (53rd to 55th items)
# at this moment, do not estimate the empirical histogram of latent variable.
# instead, estimate the scale of normal prior distribution of latent variable
# use the MEM method.
fix.loc <- c(1:5, 53:55)
(mod.fix2 <- est_irt(x=x, data=sim.dat2, D=1, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16)), EmpHist=FALSE,
Etol=1e-3, fipc=TRUE, fipc.method="MEM", fix.loc=fix.loc))
(prior.par <- mod.fix2$group.par)
(emphist <- getirt(mod.fix2, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# fit the 3PL model to all dichotomous items, fit the GRM model to all polytomous data,
# at this moment fix only the five 3PL items (1st - 5th items)
# and estimate the empirical histogram of latent variable.
# use the OEM method. Thus, only 1 EM cycle is used.
fix.loc <- c(1:5)
(mod.fix3 <- est_irt(x=x, data=sim.dat2, D=1, use.gprior=TRUE,
gprior=list(dist="beta", params=c(5, 16)), EmpHist=TRUE,
Etol=1e-3, fipc=TRUE, fipc.method="OEM", fix.loc=fix.loc))
(prior.par <- mod.fix3$group.par)
(emphist <- getirt(mod.fix3, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# summary of the estimation
summary(mod.fix3)
# fit the 3PL model to all dichotomous items, fit the GRM model to all polytomous data,
# at this moment fix all 55 items and estimate only the latent ability distribution
# using the MEM method.
fix.loc <- c(1:55)
(mod.fix4 <- est_irt(x=x, data=sim.dat2, D=1, EmpHist=TRUE,
Etol=1e-3, fipc=TRUE, fipc.method="MEM", fix.loc=fix.loc))
(prior.par <- mod.fix4$group.par)
(emphist <- getirt(mod.fix4, what="weights"))
plot(emphist$weight ~ emphist$theta, type="h")
# summary of the estimation
summary(mod.fix4)
# or all 55 items can be fixed by providing their item IDs to the 'fix.id' argument
# in this case, set fix.loc = NULL
fix.id <- x$id
(mod.fix4 <- est_irt(x=x, data=sim.dat2, D=1, EmpHist=TRUE,
Etol=1e-3, fipc=TRUE, fipc.method="MEM", fix.loc=NULL,
fix.id=fix.id))
# summary of the estimation
summary(mod.fix4)