DataGeneration {IRTest} | R Documentation |
Generating an artificial item response dataset
Description
This function generates an artificial item response dataset allowing various options.
Usage
DataGeneration(
  seed = 1,
  N = 2000,
  nitem_D = 0,
  nitem_P = 0,
  nitem_C = 0,
  model_D = "2PL",
  model_P = "GPCM",
  latent_dist = "Normal",
  item_D = NULL,
  item_P = NULL,
  item_C = NULL,
  theta = NULL,
  prob = 0.5,
  d = 1.7,
  sd_ratio = 1,
  m = 0,
  s = 1,
  a_l = 0.8,
  a_u = 2.5,
  b_m = NULL,
  b_sd = NULL,
  c_l = 0,
  c_u = 0.2,
  categ = 5,
  possible_ans = seq(0.1, 0.9, length = 5)
)
Arguments
seed |
A numeric value used as the seed for random sampling. Setting the seed makes the result reproducible. |
N |
A numeric value of the number of examinees. |
nitem_D |
A numeric value of the number of dichotomous items. |
nitem_P |
A numeric value of the number of polytomous items. |
nitem_C |
A numeric value of the number of continuous response items. |
model_D |
A vector or a character string that represents the probability model for the dichotomous items. |
model_P |
A character string that represents the probability model for the polytomous items. |
latent_dist |
A character string that determines the type of latent distribution.
Currently available options are |
item_D |
An item parameter matrix for using fixed parameter values. The number of columns should be 3: |
item_P |
An item parameter matrix for using fixed parameter values. The number of columns should be 7: |
item_C |
An item parameter matrix for using fixed parameter values. The number of columns should be 3: |
theta |
An ability parameter vector for using fixed parameter values. Default is |
prob |
A numeric value for using |
d |
A numeric value for using |
sd_ratio |
A numeric value for using |
m |
A numeric value of the overall mean of the latent distribution. The default is 0. |
s |
A numeric value of the overall standard deviation of the latent distribution. The default is 1. |
a_l |
A numeric value. The lower bound of item discrimination parameters (a). |
a_u |
A numeric value. The upper bound of item discrimination parameters (a). |
b_m |
A numeric value. The mean of item difficulty parameters (b).
If unspecified, |
b_sd |
A numeric value. The standard deviation of item difficulty parameters (b).
If unspecified, |
c_l |
A numeric value. The lower bound of item guessing parameters (c). |
c_u |
A numeric value. The upper bound of item guessing parameters (c). |
categ |
A scalar or a numeric vector of length |
possible_ans |
Possible response values for continuous items (e.g., 0.1, 0.3, 0.5, 0.7, 0.9). |
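As a rough illustration of how the sampling bounds above could drive a simulation, the base R sketch below draws discrimination parameters between a_l and a_u, draws difficulties with mean b_m and standard deviation b_sd, and then generates 2PL responses. The uniform and normal draws, the helper name sketch_2pl_data, and the fallback values b_m = 0 and b_sd = 1 are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical helper (not part of IRTest): simulate 2PL responses in base R.
sketch_2pl_data <- function(seed = 1, N = 2000, nitem = 10,
                            a_l = 0.8, a_u = 2.5, b_m = 0, b_sd = 1) {
  set.seed(seed)
  theta <- rnorm(N)                         # standard normal abilities
  a <- runif(nitem, min = a_l, max = a_u)   # assumed: uniform discriminations
  b <- rnorm(nitem, mean = b_m, sd = b_sd)  # assumed: normal difficulties
  # P(X = 1) = 1 / (1 + exp(-a * (theta - b))), one column per item
  p <- plogis(sweep(outer(theta, b, "-"), 2, a, "*"))
  data <- matrix(rbinom(N * nitem, size = 1, prob = p), nrow = N, ncol = nitem)
  list(theta = theta, item = cbind(a = a, b = b), data = data)
}

sim <- sketch_2pl_data(N = 500, nitem = 10)
dim(sim$data)  # examinees in rows, items in columns
```

The returned list mirrors the layout described under Value (abilities, item parameters, and a response matrix), which makes it easy to compare against the output of DataGeneration.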
Value
This function returns a list of several objects:
theta |
A vector of ability parameters ( |
item_D |
A matrix of dichotomous item parameters. |
initialitem_D |
A matrix that contains initial item parameter values for dichotomous items. |
data_D |
A matrix of dichotomous item responses where rows indicate examinees and columns indicate items. |
item_P |
A matrix of polytomous item parameters. |
initialitem_P |
A matrix that contains initial item parameter values for polytomous items. |
data_P |
A matrix of polytomous item responses where rows indicate examinees and columns indicate items. |
item_C |
A matrix of continuous response item parameters. |
initialitem_C |
A matrix that contains initial item parameter values for continuous response items. |
data_C |
A matrix of continuous item responses where rows indicate examinees and columns indicate items. |
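A short sketch of how the returned list might be inspected is given below. It assumes the IRTest package is installed and only touches the theta, data_D, and item_D elements described above; the guard simply skips the example otherwise.

```r
# Runs only if the IRTest package is installed.
if (requireNamespace("IRTest", quietly = TRUE)) {
  Alldata <- IRTest::DataGeneration(N = 500, nitem_D = 10)
  str(Alldata$theta)         # ability parameters, one per examinee
  print(dim(Alldata$data_D)) # rows are examinees, columns are items
  print(Alldata$item_D)      # item parameters used to generate the responses
}
```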
Author(s)
Seewoo Li cu@yonsei.ac.kr
References
Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.
Examples
# Dichotomous item responses
Alldata <- DataGeneration(N = 500,
                          nitem_D = 10)
# Polytomous item responses
Alldata <- DataGeneration(N = 1000,
                          nitem_P = 10)
# Mixed-format items
Alldata <- DataGeneration(N = 1000,
                          nitem_D = 20,
                          nitem_P = 10)
# Continuous items
Alldata <- DataGeneration(N = 1000,
                          nitem_C = 10)
# Dataset from a non-normal latent density using a two-component Gaussian mixture distribution
Alldata <- DataGeneration(N = 1000,
                          nitem_P = 10,
                          latent_dist = "2NM",
                          d = 1.664,
                          sd_ratio = 2,
                          prob = 0.3)
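The two-component Gaussian mixture used in the last example can be sketched in base R. This page does not spell out the definitions of prob, d, and sd_ratio, so the sketch assumes that prob is the mixing proportion of the first component, d is the distance between the component means in units of the overall standard deviation, and sd_ratio is the second component's standard deviation divided by the first's. The helper name sketch_2nm is hypothetical, and the code is an illustration under those assumptions rather than the package's implementation.

```r
# Hypothetical helper (not part of IRTest): sample from a two-component
# normal mixture with a prescribed overall mean m and standard deviation s.
sketch_2nm <- function(N, prob = 0.5, d = 1.7, sd_ratio = 1, m = 0, s = 1) {
  # Component means: separated by d * s and weighted so they average to m
  m1 <- m - (1 - prob) * d * s
  m2 <- m + prob * d * s
  # Component SDs: chosen so that the total variance equals s^2
  s1 <- s * sqrt((1 - prob * (1 - prob) * d^2) /
                   (prob + (1 - prob) * sd_ratio^2))
  s2 <- s1 * sd_ratio
  first <- rbinom(N, size = 1, prob = prob) == 1  # component membership
  ifelse(first, rnorm(N, m1, s1), rnorm(N, m2, s2))
}

set.seed(1)
theta <- sketch_2nm(N = 1000, prob = 0.3, d = 1.664, sd_ratio = 2)
c(mean = mean(theta), sd = sd(theta))  # should be close to m = 0 and s = 1
```

Under these assumptions the mixture keeps the overall mean and standard deviation fixed while prob, d, and sd_ratio control skewness and bimodality. The square root is only defined when prob * (1 - prob) * d^2 is below 1.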