plausibleValues {semTools} | R Documentation |
Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model
Description
Draw plausible values of factor scores estimated from a fitted
lavaan
model, then treat them as multiple imputations
of missing data using runMI
.
Usage
plausibleValues(object, nDraws = 20L, seed = 12345,
omit.imps = c("no.conv", "no.se"), ...)
Arguments
object |
|
nDraws |
|
seed |
|
omit.imps |
|
... |
Optional arguments to pass to |
Details
Because latent variables are unobserved, they can be considered as missing
data, which can be imputed using Monte Carlo methods. This may be of
interest to researchers with sample sizes too small to fit their complex
structural models. Fitting a factor model as a first step,
lavPredict
provides factor-score estimates, which can
be treated as observed values in a path analysis (Step 2). However, the
resulting standard errors and test statistics could not be trusted because
the Step-2 analysis would not take into account the uncertainty about the
estimated factor scores. Using the asymptotic sampling covariance matrix
of the factor scores provided by lavPredict
,
plausibleValues
draws a set of nDraws
imputations from the
sampling distribution of each factor score, returning a list of data sets
that can be treated like multiple imputations of incomplete data. If the
data were already imputed to handle missing data, plausibleValues
also accepts an object of class lavaan.mi
, and will
draw nDraws
plausible values from each imputation. Step 2 would
then take into account uncertainty about both missing values and factor
scores. Bayesian methods can also be used to generate factor scores, as
available with the blavaan package, in which case plausible
values are simply saved parameters from the posterior distribution. See
Asparouhov and Muthen (2010) for further technical details and references.
Each returned data.frame
includes a case.idx
column that
indicates the corresponding rows in the data set to which the model was
originally fitted (unless the user requests only Level-2 variables). This
can be used to merge the plausible values with the original observed data,
but users should note that including any new variables in a Step-2 model
might not accurately account for their relationship(s) with factor scores
because they were not accounted for in the Step-1 model from which factor
scores were estimated.
If object
is a multilevel lavaan
model, users can request
plausible values for latent variables at particular levels of analysis by
setting the lavPredict
argument level=1
or
level=2
. If the level
argument is not passed via ...,
then both levels are returned in a single merged data set per draw. For
multilevel models, each returned data.frame
also includes a column
indicating to which cluster each row belongs (unless the user requests only
Level-2 variables).
Value
A list
of length nDraws
, each of which is a
data.frame
containing plausible values, which can be treated as
a list
of imputed data sets to be passed to runMI
(see Examples). If object
is of class
lavaan.mi
, the list
will be of length
nDraws*m
, where m
is the number of imputations.
Author(s)
Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)
References
Asparouhov, T. & Muthen, B. O. (2010). Plausible values for latent variables using Mplus. Technical Report. Retrieved from www.statmodel.com/download/Plausible.pdf
See Also
Examples
## example from ?cfa and ?lavPredict help pages
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9 '
fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
fs1 <- plausibleValues(fit1, nDraws = 3,
## lavPredict() can add only the modeled data
append.data = TRUE)
lapply(fs1, head)
## To merge factor scores to original data.frame (not just modeled data)
fs1 <- plausibleValues(fit1, nDraws = 3)
idx <- lavInspect(fit1, "case.idx") # row index for each case
if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
data(HolzingerSwineford1939) # copy data to workspace
HolzingerSwineford1939$case.idx <- idx # add row index as variable
## loop over draws to merge original data with factor scores
for (i in seq_along(fs1)) {
fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
}
lapply(fs1, head)
## multiple-group analysis, in 2 steps
step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
group.equal = c("loadings","intercepts"))
PV.list <- plausibleValues(step1)
## subsequent path analysis
path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
## Not run:
step2 <- sem.mi(path.model, data = PV.list, group = "school")
## test equivalence of both slopes across groups
lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')
## End(Not run)
## multilevel example from ?Demo.twolevel help page
model <- '
level: 1
fw =~ y1 + y2 + y3
fw ~ x1 + x2 + x3
level: 2
fb =~ y1 + y2 + y3
fb ~ w1 + w2
'
msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
lapply(mlPVs, head, n = 10)
## only Level 1
mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
lapply(mlPV1, head)
## only Level 2
mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
lapply(mlPV2, head)
## example with 10 multiple imputations of missing data:
## Not run:
HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
"ageyr","agemo","school")]
set.seed(12345)
HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
age <- HSMiss$ageyr + HSMiss$agemo/12
HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)
## impute data
library(Amelia)
set.seed(12345)
HS.amelia <- amelia(HSMiss, m = 10, noms = "school", p2s = FALSE)
imps <- HS.amelia$imputations
## specify CFA model from lavaan's ?cfa help page
HS.model <- '
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
'
out2 <- cfa.mi(HS.model, data = imps)
PVs <- plausibleValues(out2, nDraws = nPVs)
idx <- out2@Data@case.idx # can't use lavInspect() on lavaan.mi
## empty list to hold expanded imputations
impPVs <- list()
nPVs <- 5
nImps <- 10
for (m in 1:nImps) {
imps[[m]]["case.idx"] <- idx
for (i in 1:nPVs) {
impPVs[[ nPVs*(m - 1) + i ]] <- merge(imps[[m]],
PVs[[ nPVs*(m - 1) + i ]],
by = "case.idx")
}
}
lapply(impPVs, head)
## End(Not run)