measEq.syntax {semTools} | R Documentation |
Syntax for measurement equivalence
Description
Automatically generates lavaan
model syntax to specify a confirmatory
factor analysis (CFA) model with equality constraints imposed on
user-specified measurement (or structural) parameters. Optionally returns
the fitted model (if data are provided) representing some chosen level of
measurement equivalence/invariance across groups and/or repeated measures.
Usage
measEq.syntax(configural.model, ..., ID.fac = "std.lv",
ID.cat = "Wu.Estabrook.2016", ID.thr = c(1L, 2L), group = NULL,
group.equal = "", group.partial = "", longFacNames = list(),
longIndNames = list(), long.equal = "", long.partial = "",
auto = "all", warn = TRUE, debug = FALSE, return.fit = FALSE)
Arguments
configural.model |
A model with no measurement-invariance constraints
(i.e., representing only configural invariance), unless required for model
identification.
Note that the specified or fitted model must not contain any latent structural parameters (i.e., it must be a CFA model), unless they are higher-order constructs with latent indicators (i.e., a second-order CFA). |
... |
Additional arguments (e.g., |
ID.fac |
See Kloessner & Klopp (2019) for details about all three methods. |
ID.cat |
See Details and References for more information. |
ID.thr |
|
group |
optional |
group.equal |
optional |
group.partial |
optional |
longFacNames |
optional named |
longIndNames |
optional named |
long.equal |
optional |
long.partial |
optional |
auto |
Used to automatically included autocorrelated measurement errors
among repeatedly measured indicators in |
warn , debug |
|
return.fit |
|
Details
This function is a pedagogical and analytical tool to generate model syntax representing some level of measurement equivalence/invariance across any combination of multiple groups and/or repeated measures. Support is provided for confirmatory factor analysis (CFA) models with simple or complex structure (i.e., cross-loadings and correlated residuals are allowed). For any complexities that exceed the limits of automation, this function is intended to still be useful by providing a means to generate syntax that users can easily edit to accommodate their unique situations.
Limited support is provided for bifactor models and higher-order constructs.
Because bifactor models have cross-loadings by definition, the option
ID.fac = "effects.code"
is unavailable. ID.fac = "UV"
is
recommended for bifactor models, but ID.fac = "UL"
is available on
the condition that each factor has a unique first indicator in the
configural.model
. In order to maintain generality, higher-order
factors may include a mix of manifest and latent indicators, but they must
therefore require ID.fac = "UL"
to avoid complications with
differentiating lower-order vs. higher-order (or mixed-level) factors.
The keyword "loadings"
in group.equal
or long.equal
constrains factor loadings of all manifest indicators (including loadings on
higher-order factors that also have latent indicators), whereas the keyword
"regressions"
constrains factor loadings of latent indicators. Users
can edit the model syntax manually to adjust constraints as necessary, or
clever use of the group.partial
or long.partial
arguments
could make it possible for users to still automated their model syntax.
The keyword "intercepts"
constrains the intercepts of all manifest
indicators, and the keyword "means"
constrains intercepts and means
of all latent common factors, regardless of whether they are latent
indicators of higher-order factors. To test equivalence of lower-order and
higher-order intercepts/means in separate steps, the user can either
manually edit their generated syntax or conscientiously exploit the
group.partial
or long.partial
arguments as necessary.
ID.fac
: If the configural.model
fixes any (e.g.,
the first) factor loadings, the generated syntax object will retain those
fixed values. This allows the user to retain additional constraints that
might be necessary (e.g., if there are only 1 or 2 indicators). Some methods
must be used in conjunction with other settings:
-
ID.cat = "Millsap"
requiresID.fac = "UL"
andparameterization = "theta"
. -
ID.cat = "LISREL"
requiresparameterization = "theta"
. -
ID.fac = "effects.code"
is unavailable when there are any cross-loadings.
ID.cat
: Wu & Estabrook (2016) recommended constraining
thresholds to equality first, and doing so should allow releasing any
identification constraints no longer needed. For each ordered
indicator, constraining one threshold to equality will allow the item's
intercepts to be estimated in all but the first group or repeated measure.
Constraining a second threshold (if applicable) will allow the item's
(residual) variance to be estimated in all but the first group or repeated
measure. For binary data, there is no independent test of threshold,
intercept, or residual-variance equality. Equivalence of thresholds must
also be assumed for three-category indicators. These guidelines provide the
least restrictive assumptions and tests, and are therefore the default.
The default setting in Mplus is similar to Wu & Estabrook (2016),
except that intercepts are always constrained to zero (so they are assumed
to be invariant without testing them). Millsap & Tein (2004) recommended
parameterization = "theta"
and identified an item's residual variance
in all but the first group (or occasion; Liu et al., 2017) by constraining
its intercept to zero and one of its thresholds to equality. A second
threshold for the reference indicator (so ID.fac = "UL"
) is used to
identify the common-factor means in all but the first group/occasion. The
LISREL software fixes the first threshold to zero and (if applicable) the
second threshold to 1, and assumes any remaining thresholds to be equal
across groups / repeated measures; thus, the intercepts are always
identified, and residual variances (parameterization = "theta"
) are
identified except for binary data, when they are all fixed to one.
Repeated Measures: If each repeatedly measured factor is measured
by the same indicators (specified in the same order in the
configural.model
) on each occasion, without any cross-loadings, the
user can let longIndNames
be automatically generated. Generic names
for the repeatedly measured indicators are created using the name of the
repeatedly measured factors (i.e., names(longFacNames)
) and the
number of indicators. So the repeatedly measured first indicator
("ind"
) of a longitudinal construct called "factor" would be
generated as "._factor_ind.1"
.
The same types of parameter can be specified for long.equal
as for
group.equal
(see lavOptions
for a list), except
for "residual.covariances"
or "lv.covariances"
. Instead, users
can constrain autocovariances using keywords "resid.autocov"
or "lv.autocov"
. Note that group.equal = "lv.covariances"
or
group.equal = "residual.covariances"
will constrain any
autocovariances across groups, along with any other covariances the user
specified in the configural.model
. Note also that autocovariances
cannot be specified as exceptions in long.partial
, so anything more
complex than the auto
argument automatically provides should instead
be manually specified in the configural.model
.
When users set orthogonal=TRUE
in the configural.model
(e.g.,
in bifactor models of repeatedly measured constructs), autocovariances of
each repeatedly measured factor will still be freely estimated in the
generated syntax.
Missing Data: If users wish to utilize the auxiliary
function to automatically include auxiliary variables in conjunction with
missing = "FIML"
, they should first generate the hypothesized-model
syntax, then submit that syntax as the model to auxiliary()
.
If users utilized runMI
to fit their configural.model
to multiply imputed data, that model can also be passed to the
configural.model
argument, and if return.fit = TRUE
, the
generated model will be fitted to the multiple imputations.
Value
By default, an object of class measEq.syntax
.
If return.fit = TRUE
, a fitted lavaan
model, with the measEq.syntax
object stored in the
@external
slot, accessible by fit@external$measEq.syntax
.
Author(s)
Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)
References
Kloessner, S., & Klopp, E. (2019). Explaining constraint interaction: How to interpret estimated model parameters under alternative scaling methods. Structural Equation Modeling, 26(1), 143–155. doi:10.1080/10705511.2018.1517356
Liu, Y., Millsap, R. E., West, S. G., Tein, J.-Y., Tanaka, R., & Grimm, K. J. (2017). Testing measurement invariance in longitudinal data with ordered-categorical measures. Psychological Methods, 22(3), 486–506. doi:10.1037/met0000075
Millsap, R. E., & Tein, J.-Y. (2004). Assessing factorial invariance in ordered-categorical measures. Multivariate Behavioral Research, 39(3), 479–515. doi:10.1207/S15327906MBR3903_4
Wu, H., & Estabrook, R. (2016). Identification of confirmatory factor analysis models of different levels of invariance for ordered categorical outcomes. Psychometrika, 81(4), 1014–1045. doi:10.1007/s11336-016-9506-0
See Also
Examples
mod.cat <- ' FU1 =~ u1 + u2 + u3 + u4
FU2 =~ u5 + u6 + u7 + u8 '
## the 2 factors are actually the same factor (FU) measured twice
longFacNames <- list(FU = c("FU1","FU2"))
## CONFIGURAL model: no constraints across groups or repeated measures
syntax.config <- measEq.syntax(configural.model = mod.cat,
# NOTE: data provides info about numbers of
# groups and thresholds
data = datCat,
ordered = paste0("u", 1:8),
parameterization = "theta",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group = "g", longFacNames = longFacNames)
## print lavaan syntax to the Console
cat(as.character(syntax.config))
## print a summary of model features
summary(syntax.config)
## THRESHOLD invariance:
## only necessary to specify thresholds if you have no data
mod.th <- '
u1 | t1 + t2 + t3 + t4
u2 | t1 + t2 + t3 + t4
u3 | t1 + t2 + t3 + t4
u4 | t1 + t2 + t3 + t4
u5 | t1 + t2 + t3 + t4
u6 | t1 + t2 + t3 + t4
u7 | t1 + t2 + t3 + t4
u8 | t1 + t2 + t3 + t4
'
syntax.thresh <- measEq.syntax(configural.model = c(mod.cat, mod.th),
# NOTE: data not provided, so syntax must
# include thresholds, and number of
# groups == 2 is indicated by:
sample.nobs = c(1, 1),
parameterization = "theta",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group = "g", group.equal = "thresholds",
longFacNames = longFacNames,
long.equal = "thresholds")
## notice that constraining 4 thresholds allows intercepts and residual
## variances to be freely estimated in all but the first group & occasion
cat(as.character(syntax.thresh))
## print a summary of model features
summary(syntax.thresh)
## Fit a model to the data either in a subsequent step (recommended):
mod.config <- as.character(syntax.config)
fit.config <- cfa(mod.config, data = datCat, group = "g",
ordered = paste0("u", 1:8), parameterization = "theta")
## or in a single step (not generally recommended):
fit.thresh <- measEq.syntax(configural.model = mod.cat, data = datCat,
ordered = paste0("u", 1:8),
parameterization = "theta",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group = "g", group.equal = "thresholds",
longFacNames = longFacNames,
long.equal = "thresholds", return.fit = TRUE)
## compare their fit to test threshold invariance
anova(fit.config, fit.thresh)
## --------------------------------------------------------
## RECOMMENDED PRACTICE: fit one invariance model at a time
## --------------------------------------------------------
## - A downside of setting return.fit=TRUE is that if the model has trouble
## converging, you don't have the opportunity to investigate the syntax,
## or even to know whether an error resulted from the syntax-generator or
## from lavaan itself.
## - A downside of automatically fitting an entire set of invariance models
## (like the old measurementInvariance() function did) is that you might
## end up testing models that shouldn't even be fitted because less
## restrictive models already fail (e.g., don't test full scalar
## invariance if metric invariance fails! Establish partial metric
## invariance first, then test equivalent of intercepts ONLY among the
## indicators that have invariate loadings.)
## The recommended sequence is to (1) generate and save each syntax object,
## (2) print it to the screen to verify you are fitting the model you expect
## to (and potentially learn which identification constraints should be
## released when equality constraints are imposed), and (3) fit that model
## to the data, as you would if you had written the syntax yourself.
## Continuing from the examples above, after establishing invariance of
## thresholds, we proceed to test equivalence of loadings and intercepts
## (metric and scalar invariance, respectively)
## simultaneously across groups and repeated measures.
## Not run:
## metric invariance
syntax.metric <- measEq.syntax(configural.model = mod.cat, data = datCat,
ordered = paste0("u", 1:8),
parameterization = "theta",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group = "g", longFacNames = longFacNames,
group.equal = c("thresholds","loadings"),
long.equal = c("thresholds","loadings"))
summary(syntax.metric) # summarize model features
mod.metric <- as.character(syntax.metric) # save as text
cat(mod.metric) # print/view lavaan syntax
## fit model to data
fit.metric <- cfa(mod.metric, data = datCat, group = "g",
ordered = paste0("u", 1:8), parameterization = "theta")
## test equivalence of loadings, given equivalence of thresholds
anova(fit.thresh, fit.metric)
## scalar invariance
syntax.scalar <- measEq.syntax(configural.model = mod.cat, data = datCat,
ordered = paste0("u", 1:8),
parameterization = "theta",
ID.fac = "std.lv", ID.cat = "Wu.Estabrook.2016",
group = "g", longFacNames = longFacNames,
group.equal = c("thresholds","loadings",
"intercepts"),
long.equal = c("thresholds","loadings",
"intercepts"))
summary(syntax.scalar) # summarize model features
mod.scalar <- as.character(syntax.scalar) # save as text
cat(mod.scalar) # print/view lavaan syntax
## fit model to data
fit.scalar <- cfa(mod.scalar, data = datCat, group = "g",
ordered = paste0("u", 1:8), parameterization = "theta")
## test equivalence of intercepts, given equal thresholds & loadings
anova(fit.metric, fit.scalar)
## For a single table with all results, you can pass the models to
## summarize to the compareFit() function
compareFit(fit.config, fit.thresh, fit.metric, fit.scalar)
## ------------------------------------------------------
## NOT RECOMMENDED: fit several invariance models at once
## ------------------------------------------------------
test.seq <- c("thresholds","loadings","intercepts","means","residuals")
meq.list <- list()
for (i in 0:length(test.seq)) {
if (i == 0L) {
meq.label <- "configural"
group.equal <- ""
long.equal <- ""
} else {
meq.label <- test.seq[i]
group.equal <- test.seq[1:i]
long.equal <- test.seq[1:i]
}
meq.list[[meq.label]] <- measEq.syntax(configural.model = mod.cat,
data = datCat,
ordered = paste0("u", 1:8),
parameterization = "theta",
ID.fac = "std.lv",
ID.cat = "Wu.Estabrook.2016",
group = "g",
group.equal = group.equal,
longFacNames = longFacNames,
long.equal = long.equal,
return.fit = TRUE)
}
compareFit(meq.list)
## -----------------
## Binary indicators
## -----------------
## borrow example data from Mplus user guide
myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
bin.mod <- '
FU1 =~ u1 + u2 + u3
FU2 =~ u4 + u5 + u6
'
## Must SIMULTANEOUSLY constrain thresholds, loadings, and intercepts
test.seq <- list(strong = c("thresholds","loadings","intercepts"),
means = "means",
strict = "residuals")
meq.list <- list()
for (i in 0:length(test.seq)) {
if (i == 0L) {
meq.label <- "configural"
group.equal <- ""
long.equal <- ""
} else {
meq.label <- names(test.seq)[i]
group.equal <- unlist(test.seq[1:i])
# long.equal <- unlist(test.seq[1:i])
}
meq.list[[meq.label]] <- measEq.syntax(configural.model = bin.mod,
data = myData,
ordered = paste0("u", 1:6),
parameterization = "theta",
ID.fac = "std.lv",
ID.cat = "Wu.Estabrook.2016",
group = "g",
group.equal = group.equal,
#longFacNames = longFacNames,
#long.equal = long.equal,
return.fit = TRUE)
}
compareFit(meq.list)
## ---------------------
## Multilevel Invariance
## ---------------------
## To test invariance across levels in a MLSEM, specify syntax as though
## you are fitting to 2 groups instead of 2 levels.
mlsem <- ' f1 =~ y1 + y2 + y3
f2 =~ y4 + y5 + y6 '
## metric invariance
syntax.metric <- measEq.syntax(configural.model = mlsem, meanstructure = TRUE,
ID.fac = "std.lv", sample.nobs = c(1, 1),
group = "cluster", group.equal = "loadings")
## by definition, Level-1 means must be zero, so fix them
syntax.metric <- update(syntax.metric,
change.syntax = paste0("y", 1:6, " ~ c(0, NA)*1"))
## save as a character string
mod.metric <- as.character(syntax.metric, groups.as.blocks = TRUE)
## convert from multigroup to multilevel
mod.metric <- gsub(pattern = "group:", replacement = "level:",
x = mod.metric, fixed = TRUE)
## fit model to data
fit.metric <- lavaan(mod.metric, data = Demo.twolevel, cluster = "cluster")
summary(fit.metric)
## End(Not run)