detect_nm {aberrance}

R Documentation

Detect nonparametric misfit

Description

Detect nonparametric misfit using person-fit statistics.

Usage

detect_nm(method, x = NULL, y = NULL)

Arguments

method

The person-fit statistic(s) to compute. Options for score-based statistics are:

"G_S" for the number of Guttman errors (Guttman, 1944; see also Molenaar, 1991).
"NC_S" for the norm conformity index (Tatsuoka & Tatsuoka, 1983). Note: This statistic cannot be computed for polytomous item scores.
"U1_S" for the U1 statistic, also known as the G^* statistic (van der Flier, 1977; see also Emons, 2008).
"U3_S" for the U3 statistic (van der Flier, 1982; see also Emons, 2008).
"ZU3_S" for the ZU3 statistic (van der Flier, 1982). Note: This statistic cannot be computed for polytomous item scores.
"A_S" for the agreement index (Kane & Brennan, 1980). Note: This statistic cannot be computed for polytomous item scores.
"D_S" for the disagreement index (Kane & Brennan, 1980). Note: This statistic cannot be computed for polytomous item scores.
"E_S" for the dependability index (Kane & Brennan, 1980). Note: This statistic cannot be computed for polytomous item scores.
"C_S" for the caution index (Sato, 1975). Note: This statistic cannot be computed for polytomous item scores.
"MC_S" for the modified caution index, also known as the C^* statistic (Harnisch & Linn, 1981). Note: This statistic cannot be computed for polytomous item scores.
"PC_S" for the personal point-biserial correlation (Donlon & Fischer, 1968). Note: This statistic cannot be computed for polytomous item scores.
"HT_S" for the H^T statistic (Sijtsma, 1986). Note: This statistic cannot be computed for polytomous item scores.

Options for response time-based statistics are:

"KL_T" for the Kullback-Leibler divergence (Man et al., 2018).
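
As a rough illustration of what a score-based statistic measures, the sketch below counts Guttman errors for dichotomous item scores in base R. The helper name guttman_errors and the toy matrix are hypothetical, and this is not necessarily how detect_nm computes "G_S" internally (for instance, the handling of equally difficult items may differ).

```r
# Hypothetical helper, not part of aberrance: count Guttman errors for
# dichotomous (0/1) item scores.
guttman_errors <- function(x) {
  x <- as.matrix(x)
  # Order items from easiest to hardest by proportion correct
  ord <- order(colMeans(x), decreasing = TRUE)
  x <- x[, ord, drop = FALSE]
  apply(x, 1, function(r) {
    # For each correct response, count the easier items answered incorrectly
    zeros_before <- cumsum(r == 0)
    sum(zeros_before[r == 1])
  })
}

# Toy data: 6 persons by 4 items, with items already ordered by difficulty
x_toy <- rbind(
  c(1, 1, 1, 1),
  c(1, 1, 1, 0),
  c(1, 1, 0, 0),
  c(1, 0, 0, 0),
  c(1, 1, 1, 0),
  c(0, 0, 0, 1)   # misses the easy items but answers the hardest correctly
)
g <- guttman_errors(x_toy)
```

Only the last person, who misses the three easiest items but answers the hardest one correctly, receives a nonzero count.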

x, y

Matrices of raw data. x contains the item scores and y contains the item log response times.
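
A minimal sketch of preparing both inputs, assuming persons are in rows and items are in columns (as in the Examples) and that response times were recorded in seconds. The object names and values are hypothetical, and the use of natural logarithms is an assumption.

```r
# Hypothetical item scores for 3 persons and 4 items (0 = incorrect, 1 = correct)
x_toy <- matrix(
  c(1, 0, 1, 1,
    0, 0, 1, 0,
    1, 1, 1, 1),
  nrow = 3, byrow = TRUE
)

# Hypothetical raw response times in seconds for the same persons and items
rt_seconds <- matrix(
  c(35.2, 60.1, 22.8, 41.0,
    12.4, 15.9, 18.3, 10.7,
    48.6, 55.0, 30.2, 39.9),
  nrow = 3, byrow = TRUE
)

# y holds log response times, so transform the raw times first
y_toy <- log(rt_seconds)

# Both matrices should share the same persons-by-items layout
stopifnot(identical(dim(x_toy), dim(y_toy)))
```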

Value

A list is returned with the following elements:

stat

A matrix of nonparametric person-fit statistics.
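
The returned matrix can be post-processed with ordinary matrix indexing. The sketch below uses a hand-made stand-in for out$stat; the column name "G_S", the direction of the statistic (larger values indicating more misfit), and the 95th-percentile cutoff are assumptions made for illustration, not documented behavior.

```r
# Stand-in for out$stat: one row per person, one column per statistic.
# The column name "G_S" is an assumption made for this illustration.
stat_toy <- cbind(G_S = c(3, 5, 2, 40, 4, 6, 1, 3, 5, 4))

# Empirical cutoff: flag persons above the 95th percentile of the statistic
cutoff <- quantile(stat_toy[, "G_S"], probs = 0.95, names = FALSE)
flagged <- which(stat_toy[, "G_S"] > cutoff)
```

Here flagged contains the row indices of the persons whose statistic exceeds the cutoff. An empirical percentile is only one possible flagging rule; the appropriate cutoff depends on the statistic and the application.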

References

Donlon, T. F., & Fischer, F. E. (1968). An index of an individual's agreement with group-determined item difficulties. Educational and Psychological Measurement, 28(1), 105–113.

Emons, W. H. M. (2008). Nonparametric person-fit analysis of polytomous item scores. Applied Psychological Measurement, 32(3), 224–247.

Guttman, L. (1944). A basis for scaling qualitative data. American Sociological Review, 9(2), 139–150.

Harnisch, D. L., & Linn, R. L. (1981). Analysis of item response patterns: Questionable test data and dissimilar curriculum practices. Journal of Educational Measurement, 18(3), 133–146.

Kane, M. T., & Brennan, R. L. (1980). Agreement coefficients as indices of dependability for domain referenced tests. Applied Psychological Measurement, 4(1), 105–126.

Man, K., Harring, J. R., Ouyang, Y., & Thomas, S. L. (2018). Response time based nonparametric Kullback-Leibler divergence measure for detecting aberrant test-taking behavior. International Journal of Testing, 18(2), 155–177.

Molenaar, I. W. (1991). A weighted Loevinger H-coefficient extending Mokken scaling to multicategory items. Kwantitatieve Methoden, 12(37), 97–117.

Sato, T. (1975). The construction and interpretation of S-P tables.

Sijtsma, K. (1986). A coefficient of deviance of response patterns. Kwantitatieve Methoden, 7(22), 131–145.

Tatsuoka, K. K., & Tatsuoka, M. M. (1983). Spotting erroneous rules of operation by the individual consistency index. Journal of Educational Measurement, 20(3), 221–230.

van der Flier, H. (1977). Environmental factors and deviant response patterns. In Y. H. Poortinga (Ed.), Basic problems in cross-cultural psychology. Swets & Zeitlinger Publishers.

van der Flier, H. (1982). Deviant response patterns and comparability of test scores. Journal of Cross-Cultural Psychology, 13(3), 267–298.

Examples

# Setup for Examples 1 to 3 -------------------------------------------------

# Settings
set.seed(0)     # seed for reproducibility
N <- 500        # number of persons
n <- 40         # number of items

# Randomly select 10% of examinees to have preknowledge and 40% of items to be compromised
cv <- sample(1:N, size = N * 0.10)
ci <- sample(1:n, size = n * 0.40)

# Create vector of indicators (1 = misfitting, 0 = fitting)
ind <- ifelse(1:N %in% cv, 1, 0)

# Example 1: Dichotomous Item Scores ----------------------------------------

# Generate person parameters for the 3PL model
xi <- cbind(theta = rnorm(N, mean = 0.00, sd = 1.00))

# Generate item parameters for the 3PL model
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  b = rnorm(n, mean = 0.00, sd = 1.00),
  c = runif(n, min = 0.05, max = 0.30)
)

# Simulate uncontaminated data
x <- sim(psi, xi)$x

# Contaminate the data: examinees with preknowledge answer compromised items
# correctly with probability 0.90
x[cv, ci] <- rbinom(length(cv) * length(ci), size = 1, prob = 0.90)

# Detect nonparametric misfit
out <- detect_nm(
  method = c("G_S", "NC_S", "U1_S", "U3_S", "ZU3_S", "A_S", "D_S", "E_S",
             "C_S", "MC_S", "PC_S", "HT_S"),
  x = x
)

# Example 2: Polytomous Item Scores -----------------------------------------

# Generate person parameters for the generalized partial credit model
xi <- cbind(theta = rnorm(N, mean = 0.00, sd = 1.00))

# Generate item parameters for the generalized partial credit model
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  c0 = 0,
  c1 = rnorm(n, mean = -1.00, sd = 0.50),
  c2 = rnorm(n, mean = 0.00, sd = 0.50),
  c3 = rnorm(n, mean = 1.00, sd = 0.50)
)

# Simulate uncontaminated data
x <- sim(psi, xi)$x

# Contaminate the data by setting the affected item scores to the maximum score
x[cv, ci] <- 3

# Detect nonparametric misfit
out <- detect_nm(
  method = c("G_S", "U1_S", "U3_S"),
  x = x
)

# Example 3: Item Response Times --------------------------------------------

# Generate person parameters for the lognormal model
xi <- cbind(tau = rnorm(N, mean = 0.00, sd = sqrt(0.25)))

# Generate item parameters for the lognormal model
psi <- cbind(
  alpha = runif(n, min = 1.50, max = 2.50),
  beta = rnorm(n, mean = 3.50, sd = sqrt(0.15))
)

# Simulate uncontaminated data
y <- sim(psi, xi)$y

# Contaminate the data by scaling the affected log response times by 0.75
y[cv, ci] <- y[cv, ci] * 0.75

# Detect nonparametric misfit
out <- detect_nm(
  method = "KL_T",
  y = y
)

[Package aberrance version 0.1.1 Index]