difLogistic {difR}  R Documentation 
Logistic regression DIF method
Description
Performs DIF detection using the logistic regression method.
Usage
difLogistic(Data, group, focal.name, anchor = NULL, member.type = "group",
match = "score", type = "both", criterion = "LRT", alpha = 0.05,
all.cov = FALSE, purify = FALSE, nrIter = 10, p.adjust.method = NULL,
save.output = FALSE, output = c("out", "default"))
## S3 method for class 'Logistic'
print(x, ...)
## S3 method for class 'Logistic'
plot(x, plot="lrStat", item = 1, itemFit = "best", pch = 8, number = TRUE,
col = "red", colIC = rep("black", 2), ltyIC = c(1, 2), save.plot = FALSE,
save.options = c("plot", "default", "pdf"), group.names = NULL, ...)
Arguments
Data 
numeric: either the data matrix only, or the data matrix plus the vector of group membership. See Details. 
group 
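numeric or character: either the vector of group membership or the column indicator (within Data) of group membership. See Details.
focal.name
numeric or character indicating the level of group which corresponds to the focal group.
anchor
either NULL (default) or a vector of item names (or column numbers) specifying the anchor items. See Details.
member.type
character: either "group" (default) to specify that group membership is made of two groups, or "cont" to indicate that group membership is based on a continuous criterion. See Details.
match
specifies the type of matching criterion. Can be either "score" (default) to compute the test score, or any continuous or discrete variable with the same length as the number of rows of Data. See Details.
type
a character string specifying which DIF effects must be tested. Possible values are "both" (default), "udif" and "nudif". See Details.
criterion
a character string specifying which DIF statistic is computed. Possible values are "LRT" (default) and "Wald". See Details.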
alpha 
numeric: significance level (default is 0.05). 
all.cov 
logical: should all covariance matrices of model parameter estimates be returned (as lists) for both nested models and all items? (default is FALSE).
purify
logical: should the method be used iteratively to purify the set of anchor items? (default is FALSE). Ignored if match is not "score".
nrIter 
numeric: the maximal number of iterations in the item purification process. (default is 10). 
p.adjust.method 
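either NULL (default) or the acronym of the method for p-value adjustment for multiple comparisons. See Details.
save.output
logical: should the output be saved into a text file? (default is FALSE).
output
character: a vector of two components. The first component is the name of the output file, the second component is either the file path or "default" (default value). See Details.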
x 
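an object of class "Logistic", as returned by difLogistic.
plot
character: the type of plot, either "lrStat" (default) or "itemCurve". See Details.
item
numeric or character: either the number or the name of the item for which logistic curves are plotted. Used only when plot is "itemCurve".
itemFit
character: the model to be selected for drawing the item curves. Possible values are "best" (default) and "null". See Details.
pch , col
the usual pch and col graphical options: plotting symbol and color of the points.
number
logical: should the item number identification be printed (default is TRUE).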
colIC , ltyIC 
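vectors of two elements of the usual col and lty graphical options for the logistic curves. Used only when plot is "itemCurve".
save.plot
logical: should the plot be saved into a separate file? (default is FALSE).
save.options
character: a vector of three components. The first component is the name of the output file, the second component is either the file path or "default" (default value), and the third component is the file format, either "pdf" (default) or "jpeg". See Details.
group.names
either NULL (default) or a vector of two character strings giving the names of the reference and focal groups displayed in the legend. Used only when plot is "itemCurve". See Details.
...
other generic parameters for the plot or the print functions.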
Details
The logistic regression method (Swaminathan and Rogers, 1990) allows for detecting both uniform and nonuniform differential item functioning without requiring an item response model approach. It consists in fitting a logistic model with the matching criterion, the group membership and an interaction between both as covariates. The statistical significance of the parameters related to group membership and the group-score interaction is then evaluated by means of either the likelihood-ratio test or the Wald test. The argument type allows testing either both uniform and nonuniform effects simultaneously (type="both"), only the uniform DIF effect (type="udif") or only the nonuniform DIF effect (type="nudif"). The argument criterion selects either the likelihood-ratio test (criterion="LRT") or the Wald test (criterion="Wald"). See Logistik for further details.
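As an illustration of the nested-model comparison described above, here is a minimal base R sketch on simulated data. It is not the package's internal code: the simulated variables (`score`, `group`, `y`) and the exact model formulas are assumptions chosen to mirror the case type="both".

```r
# Simulate dichotomous responses to one item with uniform DIF (assumed setup).
set.seed(1)
n     <- 1000
group <- rep(c(0, 1), each = n / 2)          # 0 = reference, 1 = focal
score <- rnorm(n)                            # stand-in for the matching criterion
eta   <- -0.2 + 1.0 * score + 0.8 * group    # group shift only: uniform DIF
y     <- rbinom(n, size = 1, prob = plogis(eta))

# Nested models: M0 holds the matching criterion only,
# M1 adds the group effect and the group-by-score interaction.
m0 <- glm(y ~ score, family = binomial)
m1 <- glm(y ~ score + group + score:group, family = binomial)

# Likelihood-ratio statistic: difference in deviances, 2 degrees of freedom.
lrt  <- deviance(m0) - deviance(m1)
p_lr <- pchisq(lrt, df = 2, lower.tail = FALSE)

# Wald statistic for the same two parameters: b' V^{-1} b.
idx  <- c("group", "score:group")
b    <- coef(m1)[idx]
V    <- vcov(m1)[idx, idx]
wald <- as.numeric(t(b) %*% solve(V) %*% b)
p_w  <- pchisq(wald, df = 2, lower.tail = FALSE)
```

Testing only uniform or only nonuniform DIF would compare a different pair of nested models with one degree of freedom; see Logistik for the models actually fitted.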
The group membership can be either a vector of two distinct values, one for the reference group and one for the focal group, or a continuous or discrete variable that acts as the "group" membership variable. In the former case, the member.type argument is set to "group" and the focal.name argument defines which value in the group variable stands for the focal group. In the latter case, member.type is set to "cont", focal.name is ignored and each value of the group variable represents one "group" of data (that is, the DIF effects are investigated among participants with different values of some discrete or continuous trait). See Logistik for further details.
The matching criterion can be either the test score or any other continuous or discrete variable to be passed in the Logistik function. This is specified by the match argument. By default, it takes the value "score" and the test score (i.e. raw score) is computed. The second option is to assign to match a vector of continuous or discrete numeric values, which acts as the matching criterion. Note that for consistency this vector should not belong to the Data matrix.
The Data argument is a matrix whose rows correspond to the subjects and columns to the items. In addition, Data can hold the vector of group membership. If so, group indicates the column of Data which corresponds to the group membership, either by specifying its name or by giving the column number. Otherwise, group must be a vector of length nrow(Data).
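The two ways of supplying group membership can be sketched as follows. The helper `split_group` and the toy data are hypothetical and only illustrate the convention described above; they are not part of difR.

```r
# Toy data: 6 subjects, 5 dichotomous items, plus a group column (assumed names).
set.seed(2)
Data <- data.frame(matrix(rbinom(5 * 6, 1, 0.5), nrow = 6,
                          dimnames = list(NULL, paste0("Item", 1:5))))
Data$Gender <- c(0, 0, 0, 1, 1, 1)

# Hypothetical helper: separate item responses from group membership.
split_group <- function(Data, group) {
  if (length(group) == 1) {
    # 'group' is a column indicator, given by name or by number
    col <- if (is.character(group)) match(group, colnames(Data)) else group
    list(items = Data[, -col, drop = FALSE], group = Data[, col])
  } else {
    # 'group' is the membership vector itself
    stopifnot(length(group) == nrow(Data))
    list(items = Data, group = group)
  }
}

by_number <- split_group(Data, 6)
by_name   <- split_group(Data, "Gender")
by_vector <- split_group(Data[, 1:5], Data[, 6])
```

All three calls yield the same item matrix and the same membership vector, which is why the three settings in the Examples section are equivalent.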
Missing values are allowed for item responses (not for group membership) but must be coded as NA values. They are discarded from the fitting of the logistic models (see glm for further details).
The threshold (or cut-score) for classifying items as DIF is computed as the quantile of the chi-squared distribution with lower-tail probability of one minus alpha and with one (if type="udif" or type="nudif") or two (if type="both") degrees of freedom.
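The threshold rule above corresponds to a direct call to the base R quantile function `qchisq`. The variable names below are illustrative only.

```r
alpha <- 0.05

# One degree of freedom: type = "udif" or type = "nudif"
thr_one <- qchisq(1 - alpha, df = 1)

# Two degrees of freedom: type = "both"
thr_two <- qchisq(1 - alpha, df = 2)

round(c(thr_one, thr_two), 3)
# 3.841 5.991
```

An item is flagged as DIF when its statistic exceeds the relevant threshold.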
Item purification can be performed by setting purify to TRUE. Purification works as follows: if at least one item is detected as functioning differently at the first step of the process, then the data set of the next step consists of all items that are currently anchor (DIF free) items, plus the tested item (if necessary). The process stops when either two successive applications of the method yield the same classifications of the items (Clauser and Mazor, 1998), or when nrIter iterations are run without obtaining two successive identical classifications. In the latter case a warning message is printed. Note that purification is possible only if the test score is considered as the matching criterion. Thus, purify is ignored when match is not "score".
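The purification loop described above can be sketched in base R as follows. This is an illustrative reimplementation under stated assumptions, not the package's code: `flag_dif` and `purify_dif` are hypothetical helpers, the detector is a plain likelihood-ratio test with two degrees of freedom, and the data are simulated with DIF on the first item only.

```r
# Hypothetical detector: LRT on nested logistic models, matching on the raw
# score over the current anchor items plus the tested item.
flag_dif <- function(items, group, anchor, alpha = 0.05) {
  thr <- qchisq(1 - alpha, df = 2)
  sapply(seq_len(ncol(items)), function(j) {
    score <- rowSums(items[, union(anchor, j), drop = FALSE])
    m0 <- glm(items[, j] ~ score, family = binomial)
    m1 <- glm(items[, j] ~ score * group, family = binomial)
    (deviance(m0) - deviance(m1)) > thr
  })
}

# Hypothetical purification loop: stop when two successive classifications
# agree, or after nrIter iterations (with a warning).
purify_dif <- function(items, group, nrIter = 10, alpha = 0.05) {
  flags   <- flag_dif(items, group, anchor = seq_len(ncol(items)), alpha)
  history <- rbind(as.integer(flags))      # first row: initial classification
  if (!any(flags))
    return(list(flags = flags, history = history, converged = TRUE, nrPur = 0))
  for (iter in seq_len(nrIter)) {
    new_flags <- flag_dif(items, group, anchor = which(!flags), alpha)
    history   <- rbind(history, as.integer(new_flags))
    if (identical(new_flags, flags))
      return(list(flags = flags, history = history,
                  converged = TRUE, nrPur = iter))
    flags <- new_flags
  }
  warning("no convergence after ", nrIter, " iterations")
  list(flags = flags, history = history, converged = FALSE, nrPur = nrIter)
}

# Simulated data: 10 items, DIF of 1.5 logits against the focal group on item 1.
set.seed(3)
n <- 1000; J <- 10
group <- rep(c(0, 1), each = n / 2)
theta <- rnorm(n)
b     <- seq(-1, 1, length.out = J)
items <- sapply(seq_len(J), function(j) {
  shift <- if (j == 1) 1.5 * group else 0
  rbinom(n, 1, plogis(theta - b[j] - shift))
})

res <- purify_dif(items, group)
```

The `history` matrix plays the same role as the difPur component described in the Value section: one row per step, one column per item.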
Adjustment for multiple comparisons is possible with the argument p.adjust.method. The latter must be an acronym of one of the available adjustment methods of the p.adjust function. According to Kim and Oshima (2013), Holm and Benjamini-Hochberg adjustments (set respectively by "holm" and "BH") perform best for DIF purposes. See the p.adjust function for further details. Note that item purification is performed on the original statistics and p-values; when an adjustment for multiple comparisons is requested, it is applied after item purification.
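The effect of the two recommended adjustments can be seen with the base R function `p.adjust` on a small vector of p-values. The p-values below are made up for illustration.

```r
# Made-up unadjusted p-values for five items
p_raw <- c(0.001, 0.012, 0.020, 0.045, 0.300)

p_holm <- p.adjust(p_raw, method = "holm")  # Holm adjustment
p_bh   <- p.adjust(p_raw, method = "BH")    # Benjamini-Hochberg adjustment

# Items flagged at the 5% level after each adjustment
flag_holm <- p_holm < 0.05
flag_bh   <- p_bh < 0.05
```

With these values, four items are significant before adjustment, two remain after the Holm adjustment and three after the Benjamini-Hochberg adjustment.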
A pre-specified set of anchor items can be provided through the anchor argument. It must be a vector of either item names (which must match exactly the column names of the Data argument) or integer values (specifying the column numbers for item identification). When anchor items are provided, they are used to compute the test score (matching criterion), together with the tested item. None of the anchor items are tested for DIF: the output separates anchor items and tested items, and DIF results are returned only for the latter. By default anchor is NULL, so that no anchor item is specified. Note also that item purification is not activated when anchor items are provided (even if purify is set to TRUE). Moreover, if the match argument is not set to "score", anchor items will not be taken into account even if anchor is not NULL.
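The matching score implied by this description (anchor items plus the tested item) can be sketched as follows. The toy data and variable names are assumptions made for illustration.

```r
# Toy data: 8 subjects, 6 dichotomous items (assumed names)
set.seed(4)
items <- matrix(rbinom(8 * 6, 1, 0.5), nrow = 8,
                dimnames = list(NULL, paste0("Item", 1:6)))

anchor <- 1:3                                      # pre-specified anchor items
tested <- setdiff(seq_len(ncol(items)), anchor)    # items tested for DIF

# One matching score per tested item: sum over anchor items plus that item
match_scores <- sapply(tested, function(j) {
  rowSums(items[, c(anchor, j), drop = FALSE])
})
colnames(match_scores) <- colnames(items)[tested]
```

Each column of `match_scores` is the matching criterion that would be used when testing the corresponding item; the anchor items themselves receive no DIF result.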
The measures of effect size are provided by the difference \Delta R^2 between the R^2 coefficients of the two nested models (Nagelkerke, 1991; Gomez-Benito, Dolores Hidalgo and Padilla, 2009). The effect sizes are classified as "negligible", "moderate" or "large". Two scales are available, one from Zumbo and Thomas (1997) and one from Jodoin and Gierl (2001). The output displays the \Delta R^2 measures, together with the two classifications.
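A minimal sketch of the effect-size computation on simulated data is given below. The helpers `nagelkerke_r2` and `classify_effect` are hypothetical, and the cut-offs (0.13 and 0.26 for Zumbo and Thomas; 0.035 and 0.070 for Jodoin and Gierl) are the commonly cited values, stated here as an assumption since the text above does not list them.

```r
# Nagelkerke's R^2 for a binary (ungrouped) logistic glm fit
nagelkerke_r2 <- function(fit) {
  n         <- length(fit$y)
  cox_snell <- 1 - exp((fit$deviance - fit$null.deviance) / n)
  cox_snell / (1 - exp(-fit$null.deviance / n))
}

# Classify an effect size given two cut-offs (assumed scale values)
classify_effect <- function(delta, cuts) {
  as.character(cut(delta, breaks = c(-Inf, cuts, Inf),
                   labels = c("negligible", "moderate", "large")))
}
zt_cuts <- c(0.13, 0.26)     # Zumbo and Thomas scale (assumed cut-offs)
jg_cuts <- c(0.035, 0.070)   # Jodoin and Gierl scale (assumed cut-offs)

# Simulated item with uniform DIF
set.seed(5)
n     <- 1000
group <- rep(c(0, 1), each = n / 2)
score <- rnorm(n)
y     <- rbinom(n, 1, plogis(-0.2 + score + 0.8 * group))

m0 <- glm(y ~ score, family = binomial)
m1 <- glm(y ~ score * group, family = binomial)

delta_r2 <- nagelkerke_r2(m1) - nagelkerke_r2(m0)
zt_class <- classify_effect(delta_r2, zt_cuts)
jg_class <- classify_effect(delta_r2, jg_cuts)
```

Because the Jodoin and Gierl cut-offs are lower, the same \Delta R^2 can be "negligible" on one scale and "moderate" or "large" on the other, which is why both classifications are reported.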
The output of difLogistic, as displayed by the print.Logistic function, can be stored in a text file provided that save.output is set to TRUE (with the default value FALSE, nothing is stored). In this case, the name of the text file must be given as a character string in the first component of the output argument (default name is "out"), and the path for saving the text file can be given through the second component of output. The default value is "default", meaning that the file will be saved in the current working directory. Any other path can be specified as a character string: see the Examples section for an illustration.
Two types of plots are available. The first one is obtained by setting plot="lrStat" and it is the default option. The likelihood ratio statistics are displayed on the Y axis, for each item. The detection threshold is displayed by a horizontal line, and items flagged as DIF are printed with the color defined by the argument col. By default, items are identified by their number (number=TRUE); otherwise they are simply drawn as points whose symbol is given by the option pch.
The other type of plot is obtained by setting plot="itemCurve". In this case, the fitted logistic curves are displayed for one specific item set by the argument item. The latter argument can hold either the name of the item or its number identification. If the argument itemFit takes the value "best", the curves are drawn according to the output of the best model between M_0 and M_1. That is, two curves are drawn if the item is flagged as DIF, and only one if the item is flagged as non-DIF. If itemFit takes the value "null", then the two curves are drawn from the fitted parameters of the null model M_0. See Logistik for further details on the models. The colors and line types of these curves are defined by means of the arguments colIC and ltyIC respectively. These are set as vectors of length 2, the first element for the reference group and the second for the focal group. Finally, the argument group.names allows displaying the names of the reference and focal groups (instead of "Reference" and "Focal") in the legend.
Both types of plots can be stored in a figure file, either in PDF or JPEG format. Setting save.plot to TRUE enables this. The figure is defined through the components of save.options. The first two components work in the same way as those of the output argument. The third component is the figure format, with allowed values "pdf" (default) for a PDF file and "jpeg" for a JPEG file.
Value
A list of class "Logistic" with the following components:
Logistik 
the values of the logistic regression statistics. 
p.value 
the vector of p-values for the logistic regression statistics.
logitPar 
a matrix with one row per item and four columns, holding the fitted parameters of the best model (among the two tested models) for each item. 
logitSe 
a matrix with one row per item and four columns, holding the standard errors of the fitted parameters of the best model (among the two tested models) for each item. 
parM0 
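the matrix of fitted parameters of the null model M_0.
seM0
the matrix of standard errors of the fitted parameters of the null model M_0.
cov.M0
either NULL (if all.cov is FALSE) or a list of covariance matrices of the parameter estimates of the null model M_0, one per item (if all.cov is TRUE).
cov.M1
either NULL (if all.cov is FALSE) or a list of covariance matrices of the parameter estimates of the alternative model M_1, one per item (if all.cov is TRUE).
deltaR2
the differences in Nagelkerke's R^2 coefficients. See Details.
alpha
the value of the alpha argument.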
thr 
the threshold (cutscore) for DIF detection. 
DIFitems 
either the column indicators for the items which were detected as DIF items, or "No DIF item detected". 
member.type 
the value of the member.type argument.
match
a character string, either "score" or "matching variable", depending on the match argument.
type
the value of the type argument.
p.adjust.method
the value of the p.adjust.method argument.
adjusted.p
either NULL or the vector of p-values adjusted for multiple comparisons.
purification
the value of the purify argument.
nrPur
the number of iterations in the item purification process. Returned only if purify is TRUE.
difPur
a binary matrix with one row per iteration in the item purification process and one column per item. Zeros and ones in the i-th row refer to items which were classified respectively as non-DIF and DIF items at the (i-1)-th step. The first row corresponds to the initial classification of the items. Returned only if purify is TRUE.
convergence
logical indicating whether the iterative item purification process stopped before the maximal number of nrIter allowed iterations. Returned only if purify is TRUE.
names
the names of the items.
anchor.names
the value of the anchor argument.
criterion
the value of the criterion argument.
save.output
the value of the save.output argument.
output
the value of the output argument.
Author(s)
Sebastien Beland
Collectif pour le Developpement et les Applications en Mesure et Evaluation (Cdame)
Universite du Quebec a Montreal
sebastien.beland.1@hotmail.com, http://www.cdame.uqam.ca/
David Magis
Department of Psychology, University of Liege
Research Group of Quantitative Psychology and Individual Differences, KU Leuven
David.Magis@uliege.be, http://ppw.kuleuven.be/okp/home/
Gilles Raiche
Collectif pour le Developpement et les Applications en Mesure et Evaluation (Cdame)
Universite du Quebec a Montreal
raiche.gilles@uqam.ca, http://www.cdame.uqam.ca/
References
Clauser, B.E. and Mazor, K.M. (1998). Using statistical procedures to identify differential item functioning test items. Educational Measurement: Issues and Practice, 17, 31–44.
Finch, W.H. and French, B. (2007). Detection of crossing differential item functioning: a comparison of four methods. Educational and Psychological Measurement, 67, 565–582. doi: 10.1177/0013164406296975
Gomez-Benito, J., Dolores Hidalgo, M. and Padilla, J.L. (2009). Efficacy of effect size measures in logistic regression: an application for detecting DIF. Methodology, 5, 18–25. doi: 10.1027/1614-2241.5.1.18
Hidalgo, M. D. and Lopez-Pina, J.A. (2004). Differential item functioning detection and effect size: a comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64, 903–915. doi: 10.1177/0013164403261769
Jodoin, M. G. and Gierl, M. J. (2001). Evaluating Type I error and power rates using an effect size measure with logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329–349. doi: 10.1207/S15324818AME1404_2
Kim, J., and Oshima, T. C. (2013). Effect of multiple testing adjustment in differential item functioning detection. Educational and Psychological Measurement, 73, 458–470. doi: 10.1177/0013164412467033
Magis, D., Beland, S., Tuerlinckx, F. and De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847–862. doi: 10.3758/BRM.42.3.847
Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691–692. doi: 10.1093/biomet/78.3.691
Swaminathan, H. and Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370. doi: 10.1111/j.1745-3984.1990.tb00754.x
Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF): logistic regression modelling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
Zumbo, B. D. and Thomas, D. R. (1997). A measure of effect size for a model-based approach for studying DIF. Prince George, Canada: University of Northern British Columbia, Edgeworth Laboratory for Quantitative Behavioral Science.
See Also
Examples
## Not run:
# Loading of the verbal data
data(verbal)
# Excluding the "Anger" variable
anger <- verbal[, colnames(verbal) == "Anger"]
verbal <- verbal[, colnames(verbal) != "Anger"]
# Testing both DIF effects simultaneously
# Three equivalent settings of the data matrix and the group membership
r <- difLogistic(verbal, group = 25, focal.name = 1)
difLogistic(verbal, group = "Gender", focal.name = 1)
difLogistic(verbal[,1:24], group = verbal[,25], focal.name = 1)
# Returning all covariance matrices of model parameters
difLogistic(verbal, group=25, focal.name = 1, all.cov = TRUE)
# Testing both DIF effects with the Wald test
r2 <- difLogistic(verbal, group = 25, focal.name = 1, criterion = "Wald")
# Testing nonuniform DIF effect
difLogistic(verbal, group = 25, focal.name = 1, type = "nudif")
# Testing uniform DIF effect
difLogistic(verbal, group = 25, focal.name = 1, type = "udif")
# Multiple comparisons adjustment using Benjamini-Hochberg method
difLogistic(verbal, group=25, focal.name = 1, p.adjust.method = "BH")
# With item purification
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE)
difLogistic(verbal, group = "Gender", focal.name = 1, purify = TRUE, nrIter = 5)
# With items 1 to 5 set as anchor items
difLogistic(verbal, group = 25, focal.name = 1, anchor = 1:5)
# Using anger trait score as the matching criterion
difLogistic(verbal, group = 25, focal.name = 1, match = anger)
# Using trait anger score as the group variable (i.e. testing
# for DIF with respect to trait anger score)
difLogistic(verbal[, 1:24], group = anger, member.type = "cont")
# Saving the output into the "Lresults.txt" file (and default path)
r <- difLogistic(verbal, group = 25, focal.name = 1, save.output = TRUE,
output = c("Lresults", "default"))
# Graphical devices
plot(r)
plot(r2)
plot(r, plot = "itemCurve", item = 1)
plot(r, plot = "itemCurve", item = 1, itemFit = "null")
plot(r, plot = "itemCurve", item = 6)
plot(r, plot = "itemCurve", item = 6, itemFit = "null")
# Plotting results and saving it in a PDF figure
plot(r, save.plot = TRUE, save.options = c("plot", "default", "pdf"))
# Changing the path, JPEG figure
path <- "c:/Program Files/"
plot(r, save.plot = TRUE, save.options = c("plot", path, "jpeg"))
## End(Not run)