rdif {irtQ}    R Documentation

IRT residual-based differential item functioning (RDIF) detection framework

Description

This function computes three RDIF statistics (Lim & Choe, 2023; Lim, Choe, & Han, 2022), RDIF_{R}, RDIF_{S}, and RDIF_{RS}, for each item. RDIF_{R} primarily captures the typical contrast in the raw residual pattern between two groups caused by uniform DIF, whereas RDIF_{S} primarily captures the typical contrast in the squared residual pattern between two groups caused by nonuniform DIF. RDIF_{RS} can reasonably capture both types of DIF.

Usage

rdif(x, ...)

## Default S3 method:
rdif(
  x,
  data,
  score = NULL,
  group,
  focal.name,
  D = 1,
  alpha = 0.05,
  missing = NA,
  purify = FALSE,
  purify.by = c("rdifrs", "rdifr", "rdifs"),
  max.iter = 10,
  min.resp = NULL,
  method = "ML",
  range = c(-5, 5),
  norm.prior = c(0, 1),
  nquad = 41,
  weights = NULL,
  ncore = 1,
  verbose = TRUE,
  ...
)

## S3 method for class 'est_irt'
rdif(
  x,
  score = NULL,
  group,
  focal.name,
  alpha = 0.05,
  missing = NA,
  purify = FALSE,
  purify.by = c("rdifrs", "rdifr", "rdifs"),
  max.iter = 10,
  min.resp = NULL,
  method = "ML",
  range = c(-5, 5),
  norm.prior = c(0, 1),
  nquad = 41,
  weights = NULL,
  ncore = 1,
  verbose = TRUE,
  ...
)

## S3 method for class 'est_item'
rdif(
  x,
  group,
  focal.name,
  alpha = 0.05,
  missing = NA,
  purify = FALSE,
  purify.by = c("rdifrs", "rdifr", "rdifs"),
  max.iter = 10,
  min.resp = NULL,
  method = "ML",
  range = c(-5, 5),
  norm.prior = c(0, 1),
  nquad = 41,
  weights = NULL,
  ncore = 1,
  verbose = TRUE,
  ...
)

Arguments

x

A data frame containing the item metadata (e.g., item parameters, number of categories, models), an object of class est_item obtained from the function est_item, or an object of class est_irt obtained from the function est_irt. The item metadata can be created using the function shape_df. See est_irt, irtfit, info, or simdat for more details about the item metadata.

...

Additional arguments that will be forwarded to the est_score function.

data

A matrix containing examinees' response data for the items in the argument x. A row and column indicate the examinees and items, respectively.

score

A vector of examinees' ability estimates. If the abilities are not provided, rdif estimates the abilities before computing the RDIF statistics. See est_score for more details about scoring methods. Default is NULL.

group

A numeric or character vector indicating the group membership of examinees. The length of the vector should equal the number of rows in the response data matrix.

focal.name

A single numeric or character scalar representing the level associated with the focal group. For instance, given group = c(0, 1, 0, 1, 1) and '1' indicating the focal group, set focal.name = 1.

D

A scaling factor in IRT models to make the logistic function as close as possible to the normal ogive function (if set to 1.7). Default is 1.

alpha

A numeric value specifying the significance level \alpha of the hypothesis tests based on the RDIF statistics. Default is 0.05.

missing

A value indicating missing values in the response data set. Default is NA.

purify

A logical value indicating whether a purification process will be implemented or not. Default is FALSE.

purify.by

A character string specifying the RDIF statistic on which the purification is based. Available statistics are "rdifrs" for RDIF_{RS}, "rdifr" for RDIF_{R}, and "rdifs" for RDIF_{S}.

max.iter

A positive integer value specifying the maximum number of iterations for the purification process. Default is 10.

min.resp

A positive integer value specifying the minimum number of item responses for an examinee required to compute the ability estimate. Default is NULL. See details below for more information.

method

A character string indicating a scoring method. Available methods are "ML" for the maximum likelihood estimation, "WL" for the weighted likelihood estimation, "MAP" for the maximum a posteriori estimation, and "EAP" for the expected a posteriori estimation. Default method is "ML".

range

A numeric vector of two components to restrict the range of ability scale for the ML, WL, EAP, and MAP scoring methods. Default is c(-5, 5).

norm.prior

A numeric vector of two components specifying the mean and standard deviation of the normal prior distribution. These two parameters are used to obtain the Gaussian quadrature points and the corresponding weights from the normal distribution. Default is c(0, 1). Ignored if method is "ML" or "WL".

nquad

An integer value specifying the number of Gaussian quadrature points from the normal prior distribution. Default is 41. Ignored if method is "ML", "WL", or "MAP".

weights

A two-column matrix or data frame containing the quadrature points (in the first column) and the corresponding weights (in the second column) of the latent variable prior distribution. The weights and quadrature points can be easily obtained using the function gen.weight. If NULL and method is "EAP", default values are used (see the arguments of norm.prior and nquad). Ignored if method is "ML", "WL" or "MAP".

ncore

The number of logical CPU cores to use. Default is 1. See est_score for details.

verbose

A logical value. If FALSE, the progress messages of the purification procedure are suppressed. Default is TRUE.

Details

The RDIF framework (Lim et al., 2022) consists of three IRT residual-based statistics: RDIF_{R}, RDIF_{S}, and RDIF_{RS}. Under the null hypothesis that a test contains no DIF items, RDIF_{R} and RDIF_{S} follow normal distributions asymptotically. RDIF_{RS} is based on the bivariate normal distribution of the RDIF_{R} and RDIF_{S} statistics. Under the null hypothesis of no DIF items, it asymptotically follows a \chi^{2} distribution with 2 degrees of freedom. See Lim et al. (2022) for more details about the RDIF framework.
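As a rough illustration of what the raw and squared residual contrasts look like, the base R sketch below simulates one dichotomous item with uniform DIF and computes unstandardized group contrasts. It is a simplified sketch, not the rdif implementation: the 2PL helper p2pl, the stand-in pooled difficulty of 0.35, the use of true abilities in place of estimates, and the focal-minus-reference sign convention are all assumptions made for illustration, and the standardization by the null moments is omitted.

```r
# Illustrative only: simplified residual contrasts for one item (not rdif() itself).
set.seed(1)
n     <- 1000
theta <- rnorm(n)                      # true abilities (assumed known here)
group <- rep(c(0, 1), each = n / 2)    # 1 = focal group, 0 = reference group

# Hypothetical 2PL response function
p2pl <- function(theta, a, b, D = 1) 1 / (1 + exp(-D * a * (theta - b)))

# Uniform DIF: the item is harder (b shifted by 0.7) for the focal group
b_true <- ifelse(group == 1, 0.7, 0.0)
x <- rbinom(n, size = 1, prob = p2pl(theta, a = 1.2, b = b_true))

# Residuals from a single pooled model
# (b = 0.35 is an assumed stand-in for a pooled calibration)
e <- x - p2pl(theta, a = 1.2, b = 0.35)

# Group contrasts (assumed convention: focal minus reference)
rdif_r <- mean(e[group == 1])   - mean(e[group == 0])    # raw residuals
rdif_s <- mean(e[group == 1]^2) - mean(e[group == 0]^2)  # squared residuals

# A chi-square statistic with 2 df is referred to its upper tail;
# the alpha = 0.05 critical value is qchisq(0.95, df = 2).
crit    <- qchisq(0.95, df = 2)
p_value <- pchisq(crit, df = 2, lower.tail = FALSE)
```

Because the focal group finds the simulated item harder than the pooled model predicts, the raw residual contrast rdif_r comes out negative under the assumed sign convention.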

The rdif function computes all three RDIF statistics: RDIF_{R}, RDIF_{S}, and RDIF_{RS}. The current version of the rdif function supports both dichotomous and polytomous item response data. To compute the three statistics, the rdif function requires (1) item parameter estimates obtained from aggregate data regardless of group membership, (2) examinees' ability estimates (e.g., ML), and (3) examinees' item response data. Note that the ability estimates need to be computed using the aggregate data-based item parameter estimates. The item parameter estimates should be provided in the x argument, the ability estimates in the score argument, and the response data in the data argument. When the abilities are not given in the score argument (i.e., score = NULL), the rdif function estimates examinees' abilities automatically using the scoring method specified in the method argument (e.g., method = "ML").

The group argument accepts a numeric or character vector containing exactly two distinct values. One value represents the reference group and the other represents the focal group. The length of the vector should equal the number of rows in the response data, and each element should give the group membership of the corresponding examinee. Once the group is specified, a single numeric or character value needs to be provided in the focal.name argument to define which value in the group argument represents the focal group.
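The relationship between group and focal.name can be sketched in base R as follows. The values reuse the small example given for the focal.name argument; the helper names is_focal and n_by_group are introduced here for illustration only and are not part of the package.

```r
# Illustrative only: a group vector and the focal-group indicator it implies.
group      <- c(0, 1, 0, 1, 1)
focal.name <- 1

# Exactly two distinct values are expected, one of which is the focal value
stopifnot(length(unique(group)) == 2, focal.name %in% group)

is_focal   <- group == focal.name    # TRUE for focal-group examinees
n_by_group <- c(reference = sum(!is_focal), focal = sum(is_focal))
```

In a real analysis, length(group) should also match the number of rows of the response matrix.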

As with other DIF detection approaches, an iterative purification process can be implemented for the RDIF framework. When purify = TRUE, the purification process is based on the RDIF statistic specified in the purify.by argument (e.g., purify.by="rdifrs"). At each iteration, examinees' latent abilities are computed using the purified items and the scoring method specified in the method argument. The iterative purification process stops when no further DIF items are found or the process reaches a predetermined limit of iterations, which can be specified in the max.iter argument. See Lim et al. (2022) for more details about the purification procedure.
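The stopping logic described above can be sketched with a toy example in base R. This is a generic purification loop under stated assumptions, not the procedure implemented in rdif: detect_dif is a hypothetical stand-in rule rather than an RDIF statistic, the effect sizes are made up, and the package may differ in details such as how many items are removed per iteration.

```r
# Illustrative only: stopping logic of an iterative purification loop.
# detect_dif() is a hypothetical toy rule, not an RDIF statistic.
effect <- c(0.90, 0.80, 0.40, 0.38, rep(0, 6))  # toy DIF sizes for 10 items
items  <- seq_along(effect)

detect_dif <- function(anchor) {
  # DIF items left in the anchor set blur detection:
  # the flagging threshold rises with anchor contamination
  threshold <- 0.30 + mean(effect[anchor])
  which(effect > threshold)
}

max.iter <- 10
flagged  <- detect_dif(items)   # analysis without purification
complete <- FALSE
n_iter   <- 0

for (i in seq_len(max.iter)) {
  n_iter  <- i
  anchor  <- setdiff(items, flagged)   # rescore on currently unflagged items
  updated <- detect_dif(anchor)
  if (identical(updated, flagged)) {   # no further DIF items found
    complete <- TRUE
    break
  }
  flagged <- updated
}
```

In this toy run the unpurified pass flags only the two largest effects, and the loop picks up the remaining two once the contaminated items leave the anchor set. The variables complete and n_iter play roles analogous to the complete and n.iter elements of the returned object.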

Scoring with a limited number of items can result in large standard errors, which may reduce the effectiveness of DIF detection within the RDIF framework. The min.resp argument can be used to avoid scores with large standard errors when calculating the RDIF statistics, particularly during the purification process. For instance, if min.resp is not NULL (e.g., min.resp=5), item responses from examinees whose total number of item responses falls below the specified minimum are treated as missing values (i.e., NA). Consequently, their ability estimates become missing values and are not used in computing the RDIF statistics. If min.resp=NULL, an examinee's score is computed as long as the examinee has at least one item response.
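The masking rule described for min.resp can be sketched in base R as follows. This is a minimal illustration of the described behavior, not the package's internal code; the toy response matrix and the names n_resp, too_few, and resp_masked are assumptions introduced for the example.

```r
# Illustrative only: mask examinees with too few responses, as described for min.resp.
resp <- matrix(c( 1,  0,  1,  1,  0,
                 NA, NA,  1, NA, NA,
                  0, NA,  1,  1, NA,
                 NA,  1, NA, NA,  0),
               nrow = 4, byrow = TRUE)   # rows = examinees, columns = items
min.resp <- 3

n_resp  <- rowSums(!is.na(resp))   # number of observed responses per examinee
too_few <- n_resp < min.resp       # examinees below the minimum

resp_masked <- resp
resp_masked[too_few, ] <- NA       # all responses of these examinees become NA
```

Examinees whose rows are fully NA after masking would receive no ability estimate and therefore would not contribute to the statistics.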

Value

This function returns a list of four internal objects. The four objects are:

no_purify

A list of several sub-objects containing the results of DIF analysis without a purification procedure. The sub-objects are:

dif_stat

A data frame containing the results of three RDIF statistics across all evaluated items. From the first column, each column indicates item's ID, RDIF_{R} statistic, standardized RDIF_{R}, RDIF_{S} statistic, standardized RDIF_{S}, RDIF_{RS} statistic, p-value of the RDIF_{R}, p-value of the RDIF_{S}, p-value of the RDIF_{RS}, sample size of the reference group, sample size of the focal group, and total sample size, respectively. Note that RDIF_{RS} does not have its standardized value because it is a \chi^{2} statistic.

moments

A data frame containing the moments of three RDIF statistics. From the first column, each column indicates item's ID, mean of RDIF_{R}, standard deviation of RDIF_{R}, mean of RDIF_{S}, standard deviation of RDIF_{S}, and covariance of RDIF_{R} and RDIF_{S}, respectively.

dif_item

A list of three numeric vectors showing potential DIF items flagged by each of the RDIF statistics. The vectors contain the items flagged by RDIF_{R}, RDIF_{S}, and RDIF_{RS}, respectively.

score

A vector of ability estimates used to compute the RDIF statistics.

purify

A logical value indicating whether the purification process was used.

with_purify

A list of several sub-objects containing the results of DIF analysis with a purification procedure. The sub-objects are:

purify.by

A character string indicating which RDIF statistic is used for the purification. "rdifr", "rdifs", and "rdifrs" refer to RDIF_{R}, RDIF_{S}, and RDIF_{RS}, respectively.

dif_stat

A data frame containing the results of three RDIF statistics across all evaluated items. From the first column, each column indicates item's ID, RDIF_{R} statistic, standardized RDIF_{R}, RDIF_{S} statistic, standardized RDIF_{S}, RDIF_{RS} statistic, p-value of the RDIF_{R}, p-value of the RDIF_{S}, p-value of the RDIF_{RS}, sample size of the reference group, sample size of the focal group, total sample size, and nth iteration where the RDIF statistics were computed, respectively.

moments

A data frame containing the moments of three RDIF statistics. From the first column, each column indicates item's ID, mean of RDIF_{R}, standard deviation of RDIF_{R}, mean of RDIF_{S}, standard deviation of RDIF_{S}, covariance of RDIF_{R} and RDIF_{S}, and nth iteration where the RDIF statistics were computed, respectively.

dif_item

A list of three numeric vectors showing potential DIF items flagged by each of the RDIF statistics. The vectors contain the items flagged by RDIF_{R}, RDIF_{S}, and RDIF_{RS}, respectively.

n.iter

The total number of iterations implemented for the purification.

score

A vector of final purified ability estimates used to compute the RDIF statistics.

complete

A logical value indicating whether the purification process was completed. If FALSE, the purification process reached the maximum number of iterations before it was complete.

alpha

A significance \alpha-level used to compute the p-values of RDIF statistics.

Author(s)

Hwanggyu Lim hglim83@gmail.com

References

Lim, H., & Choe, E. M. (2023). Detecting differential item functioning in CAT using IRT residual DIF approach. Journal of Educational Measurement. doi:10.1111/jedm.12366.

Lim, H., Choe, E. M., & Han, K. T. (2022). A residual-based differential item functioning detection framework in item response theory. Journal of Educational Measurement, 59(1), 80-104. doi:10.1111/jedm.12313.

See Also

est_item, info, simdat, shape_df, gen.weight, est_score

Examples


# load the required packages
library("irtQ")
library("dplyr")

## Uniform DIF detection
###############################################
# (1) manipulate true uniform DIF data
###############################################
# import the "-prm.txt" output file from flexMIRT
flex_sam <- system.file("extdata", "flexmirt_sample-prm.txt", package = "irtQ")

# select 36 of 3PLM items which are non-DIF items
par_nstd <-
  bring.flexmirt(file=flex_sam, "par")$Group1$full_df %>%
  dplyr::filter(.data$model == "3PLM") %>%
  dplyr::filter(dplyr::row_number() %in% 1:36) %>%
  dplyr::select(1:6)
par_nstd$id <- paste0("nondif", 1:36)

# generate four new items to inject uniform DIF
difpar_ref <-
  shape_df(par.drm=list(a=c(0.8, 1.5, 0.8, 1.5), b=c(0.0, 0.0, -0.5, -0.5), g=0.15),
           item.id=paste0("dif", 1:4), cats=2, model="3PLM")

# manipulate uniform DIF on the four new items by adding constants to b-parameters
# for the focal group
difpar_foc <-
  difpar_ref %>%
  dplyr::mutate_at(.vars="par.2", .funs=function(x) x + rep(0.7, 4))

# combine the 4 DIF and 36 non-DIF items for both reference and focal groups
# thus, the first four items have uniform DIF
par_ref <- rbind(difpar_ref, par_nstd)
par_foc <- rbind(difpar_foc, par_nstd)

# generate the true thetas
set.seed(123)
theta_ref <- rnorm(500, 0.0, 1.0)
theta_foc <- rnorm(500, 0.0, 1.0)

# generate the response data
resp_ref <- simdat(par_ref, theta=theta_ref, D=1)
resp_foc <- simdat(par_foc, theta=theta_foc, D=1)
data <- rbind(resp_ref, resp_foc)

###############################################
# (2) estimate the item and ability parameters
#     using the aggregate data
###############################################
# estimate the item parameters
est_mod <- est_irt(data=data, D=1, model="3PLM")
est_par <- est_mod$par.est

# estimate the ability parameters using ML
score <- est_score(x=est_par, data=data, method="ML")$est.theta

###############################################
# (3) conduct DIF analysis
###############################################
# create a vector of group membership indicators
# where '1' indicates the focal group
group <- c(rep(0, 500), rep(1, 500))

# (a)-1 compute RDIF statistics by providing scores,
#       and without a purification
dif_nopuri_1 <- rdif(x=est_par, data=data, score=score,
                     group=group, focal.name=1, D=1, alpha=0.05)
print(dif_nopuri_1)

# (a)-2 compute RDIF statistics by not providing scores
#       and without a purification
dif_nopuri_2 <- rdif(x=est_par, data=data, score=NULL,
                     group=group, focal.name=1, D=1, alpha=0.05,
                     method="ML")
print(dif_nopuri_2)

# (b)-1 compute RDIF statistics with a purification
#       based on RDIF(R)
dif_puri_r <- rdif(x=est_par, data=data, score=score,
                   group=group, focal.name=1, D=1, alpha=0.05,
                   purify=TRUE, purify.by="rdifr")
print(dif_puri_r)

# (b)-2 compute RDIF statistics with a purification
#       based on RDIF(S)
dif_puri_s <- rdif(x=est_par, data=data, score=score,
                   group=group, focal.name=1, D=1, alpha=0.05,
                   purify=TRUE, purify.by="rdifs")
print(dif_puri_s)

# (b)-3 compute RDIF statistics with a purification
#       based on RDIF(RS)
dif_puri_rs <- rdif(x=est_par, data=data, score=score,
                    group=group, focal.name=1, D=1, alpha=0.05,
                    purify=TRUE, purify.by="rdifrs")
print(dif_puri_rs)



[Package irtQ version 0.2.0 Index]