detect_pk {aberrance} | R Documentation |
Detect preknowledge
Description
Detect preknowledge under the assumption that the set of compromised items is known.
Usage
detect_pk(
  method,
  ci,
  psi,
  xi = NULL,
  xi_c = NULL,
  xi_s = NULL,
  x = NULL,
  y = NULL,
  interval = c(-4, 4),
  alpha = 0.05,
  cutoff = 0.05
)
Arguments
method |
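The preknowledge detection statistic(s) to compute. The option codes listed here are those used in the Examples.
Options for score-based statistics include "L_S", "ML_S", "LR_S", "S_S", and "W_S".
Options for response time-based statistics include "L_T".
Options for score and response time-based statistics include "L_ST".
|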
ci |
A vector of compromised item positions. All other items are presumed secure. |
psi |
A matrix of item parameters. |
xi, xi_c, xi_s |
Matrices of person parameters. |
x, y |
Matrices of raw data. |
interval |
The interval to search for the person parameters. Default is c(-4, 4). |
alpha |
Value(s) between 0 and 1 indicating the significance level(s)
used for flagging. Default is 0.05. |
cutoff |
Use with the modified signed likelihood ratio test statistic
and the Lugannani-Rice approximation. The cutoff (default is 0.05) applies
when the absolute value of the signed likelihood ratio test statistic is
less than it. |
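The ci argument splits the items into a compromised set and a secure set. A minimal base R sketch of that split follows; the toy score matrix, the name si, and the descriptive difference at the end are assumptions for illustration only and are not part of detect_pk().

```r
# Hypothetical toy data: 5 persons, 8 dichotomously scored items
set.seed(2)
x  <- matrix(rbinom(5 * 8, size = 1, prob = 0.5), nrow = 5)
ci <- c(2, 5, 7)                      # compromised item positions
si <- setdiff(seq_len(ncol(x)), ci)   # all other items, presumed secure

x_c <- x[, ci, drop = FALSE]  # scores on the compromised items
x_s <- x[, si, drop = FALSE]  # scores on the secure items

# Per-person difference in proportion correct between the two subsets.
# A crude descriptive check only, not one of the detection statistics.
rowMeans(x_c) - rowMeans(x_s)
```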
Value
A list is returned with the following elements:
stat |
A matrix of preknowledge detection statistics. |
pval |
A matrix of p-values. |
flag |
An array of flagging results. The first dimension corresponds to persons, the second dimension to methods, and the third dimension to significance levels. |
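The three elements can be explored with ordinary indexing. The sketch below builds a small mock object with the same documented layout; every value in it, the dimension names, and the `pval <= alpha` flagging rule are assumptions for illustration, not output of detect_pk().

```r
# Mock object mimicking the documented structure of the returned list.
# All values are made up; nothing here is produced by detect_pk().
set.seed(1)
methods <- c("L_S", "L_T")
alpha   <- c(0.01, 0.05)
n_pers  <- 6

stat <- matrix(rnorm(n_pers * length(methods)), nrow = n_pers,
               dimnames = list(NULL, methods))
pval <- pnorm(stat, lower.tail = FALSE)  # assumed one-sided p-values

# flag: persons x methods x significance levels
flag <- array(
  NA,
  dim = c(n_pers, length(methods), length(alpha)),
  dimnames = list(NULL, methods, paste0("alpha=", alpha))
)
for (k in seq_along(alpha)) {
  flag[, , k] <- pval <= alpha[k]  # assumed flagging rule
}
out_mock <- list(stat = stat, pval = pval, flag = flag)

# Flags for the "L_S" method at the second significance level
flagged <- out_mock$flag[, "L_S", 2]
which(flagged)
```

Indexing by method name and by position along the third dimension keeps the code readable when several statistics and significance levels are requested at once.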
References
Sinharay, S. (2017). Detection of item preknowledge using likelihood ratio test and score test. Journal of Educational and Behavioral Statistics, 42(1), 46–68.
Sinharay, S. (2020). Detection of item preknowledge using response times. Applied Psychological Measurement, 44(5), 376–392.
Sinharay, S., & Jensen, J. L. (2019). Higher-order asymptotics and its application to testing the equality of the examinee ability over two sets of items. Psychometrika, 84(2), 484–510.
Sinharay, S., & Johnson, M. S. (2020). The use of item scores and response times to detect examinees who may have benefited from item preknowledge. British Journal of Mathematical and Statistical Psychology, 73(3), 397–419.
See Also
detect_as() to detect answer similarity.
Examples
# Setup for Examples 1 and 2 ------------------------------------------------
# Settings
set.seed(0) # seed for reproducibility
N <- 500 # number of persons
n <- 40 # number of items
# Randomly select 10% of examinees to have preknowledge and 40% of items to be compromised
cv <- sample(1:N, size = N * 0.10)
ci <- sample(1:n, size = n * 0.40)
# Create vector of indicators (1 = preknowledge, 0 = no preknowledge)
ind <- ifelse(1:N %in% cv, 1, 0)
# Example 1: Item Scores and Response Times ---------------------------------
# Generate person parameters for the 2PL model and lognormal model
xi <- MASS::mvrnorm(
  N,
  mu = c(theta = 0.00, tau = 0.00),
  Sigma = matrix(c(1.00, 0.25, 0.25, 0.25), ncol = 2)
)
# Generate item parameters for the 2PL model and lognormal model
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  b = NA,
  c = 0,
  alpha = runif(n, min = 1.50, max = 2.50),
  beta = NA
)
# Generate positively correlated difficulty and time intensity parameters
psi[, c("b", "beta")] <- MASS::mvrnorm(
  n,
  mu = c(b = 0.00, beta = 3.50),
  Sigma = matrix(c(1.00, 0.20, 0.20, 0.15), ncol = 2)
)
# Simulate uncontaminated data
dat <- sim(psi, xi)
x <- dat$x
y <- dat$y
# Modify contaminated data by changing the item scores and reducing the log
# response times
x[cv, ci] <- rbinom(length(cv) * length(ci), size = 1, prob = 0.90)
y[cv, ci] <- y[cv, ci] * 0.75
# Detect preknowledge
out <- detect_pk(
  method = c("L_S", "ML_S", "LR_S", "S_S", "W_S", "L_T", "L_ST"),
  ci = ci,
  psi = psi,
  x = x,
  y = y
)
# Example 2: Polytomous Item Scores -----------------------------------------
# Generate person parameters for the generalized partial credit model
xi <- cbind(theta = rnorm(N, mean = 0.00, sd = 1.00))
# Generate item parameters for the generalized partial credit model
psi <- cbind(
  a = rlnorm(n, meanlog = 0.00, sdlog = 0.25),
  c0 = 0,
  c1 = rnorm(n, mean = -1.00, sd = 0.50),
  c2 = rnorm(n, mean = 0.00, sd = 0.50),
  c3 = rnorm(n, mean = 1.00, sd = 0.50)
)
# Simulate uncontaminated data
x <- sim(psi, xi)$x
# Modify contaminated data by changing the item scores to the maximum score
x[cv, ci] <- 3
# Detect preknowledge
out <- detect_pk(
  method = c("L_S", "ML_S", "LR_S", "S_S", "W_S"),
  ci = ci,
  psi = psi,
  x = x
)
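
Because the examples above simulate which examinees have preknowledge, the flags can be compared against that known status. The helper below is a hypothetical sketch (flag_rates is not part of the package) that computes, for each method and significance level, the proportion of flagged examinees among those with and without simulated preknowledge. It is demonstrated on a tiny hand-made flag array rather than on real output.

```r
# Hypothetical helper (not part of the package): summarize flags against
# known preknowledge status in a simulation.
flag_rates <- function(flag, ind) {
  # flag: logical array, persons x methods x significance levels
  # ind:  0/1 vector, 1 = simulated preknowledge
  apply(flag, c(2, 3), function(f) {
    c(power = mean(f[ind == 1]), false_pos = mean(f[ind == 0]))
  })
}

# Hand-made toy input: 4 persons, 1 method, 2 significance levels
ind  <- c(1, 1, 0, 0)
flag <- array(
  c(TRUE, FALSE, FALSE, FALSE,   # flags at the stricter level
    TRUE, TRUE,  TRUE,  FALSE),  # flags at the looser level
  dim = c(4, 1, 2),
  dimnames = list(NULL, "L_S", c("0.01", "0.05"))
)

# Result: 2 (power, false_pos) x methods x significance levels
rates <- flag_rates(flag, ind)
rates
```

With the objects created in the examples, the analogous call would be flag_rates(out$flag, ind), assuming out$flag holds logical or 0/1 values in the documented persons by methods by significance levels layout.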