plausibleValues {semTools}    R Documentation

Plausible-Values Imputation of Factor Scores Estimated from a lavaan Model

Description

Draw plausible values of factor scores estimated from a fitted lavaan model, then treat them as multiple imputations of missing data using runMI.

Usage

plausibleValues(object, nDraws = 20L, seed = 12345,
  omit.imps = c("no.conv", "no.se"), ...)

Arguments

object

A fitted model of class lavaan, blavaan, or lavaan.mi

nDraws

integer specifying the number of draws, analogous to the number of imputed data sets. If object is of class lavaan.mi, this will be the number of draws taken per imputation. If object is of class blavaan, nDraws cannot exceed blavInspect(object, "niter") * blavInspect(object, "n.chains") (the number of MCMC samples from the posterior). The drawn samples will be evenly spaced (after permutation for target="stan"), using ceiling to resolve decimals.

seed

integer passed to set.seed().

omit.imps

character vector specifying criteria for omitting imputations when object is of class lavaan.mi. Can include any of c("no.conv", "no.se", "no.npd").

...

Optional arguments to pass to lavPredict. assemble will be ignored because multiple groups are always assembled into a single data.frame per draw. type will be ignored because it is set internally to type="lv".
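
The even spacing described for nDraws with blavaan objects can be illustrated with a short sketch. This is not the semTools source: the helper name evenly_spaced_idx and the sample count of 3000 are hypothetical, and the exact indexing used internally may differ.

```r
## Hypothetical helper (not part of semTools): choose nDraws evenly spaced
## row indices out of nSamples saved posterior samples.
evenly_spaced_idx <- function(nSamples, nDraws) {
  stopifnot(nDraws >= 1L, nDraws <= nSamples)
  ## seq() can return non-integer positions; ceiling() rounds them up
  pos <- ceiling(seq(from = 1, to = nSamples, length.out = nDraws))
  ## guard against any position rounding past the last sample
  as.integer(pmin(pos, nSamples))
}

## assumed example: 3 chains x 1000 saved iterations = 3000 samples
idx <- evenly_spaced_idx(nSamples = 3000, nDraws = 20)
idx
```

Requesting more draws than there are saved samples fails the stopifnot() check, which mirrors the constraint stated for nDraws above.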

Details

Because latent variables are unobserved, they can be considered as missing data, which can be imputed using Monte Carlo methods. This may be of interest to researchers whose samples are too small to fit their complex structural models. After fitting a factor model as a first step (Step 1), lavPredict provides factor-score estimates, which can be treated as observed values in a path analysis (Step 2). However, the resulting standard errors and test statistics cannot be trusted because the Step-2 analysis does not take into account the uncertainty about the estimated factor scores.

Using the asymptotic sampling covariance matrix of the factor scores provided by lavPredict, plausibleValues draws a set of nDraws imputations from the sampling distribution of each factor score, returning a list of data sets that can be treated like multiple imputations of incomplete data. If the data were already imputed to handle missing data, plausibleValues also accepts an object of class lavaan.mi, and will draw nDraws plausible values from each imputation. Step 2 would then take into account uncertainty about both missing values and factor scores.

Bayesian methods can also be used to generate factor scores, as available with the blavaan package, in which case plausible values are simply saved parameters from the posterior distribution. See Asparouhov and Muthen (2010) for further technical details and references.
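
For the non-Bayesian case, the idea of drawing plausible values from a sampling distribution can be sketched in base R. The sketch assumes a multivariate normal sampling distribution and uses made-up factor-score estimates (fs_hat) and a made-up sampling covariance matrix (fs_acov) for a single case. It illustrates the principle only and is not the code plausibleValues runs.

```r
set.seed(12345)
## Assumed toy inputs for one case (NOT computed by lavaan here)
fs_hat <- c(visual = 0.30, textual = -0.10)   # factor-score estimates
fs_acov <- matrix(c(0.20, 0.05,
                    0.05, 0.10), nrow = 2,
                  dimnames = list(names(fs_hat), names(fs_hat)))
nDraws <- 5L

## Cholesky factor R satisfies t(R) %*% R == fs_acov
R <- chol(fs_acov)
## each row of z is a draw from N(0, I)
z <- matrix(rnorm(nDraws * length(fs_hat)), nrow = nDraws)
## each row of z %*% R has covariance t(R) %*% R; add the estimates as means
draws <- sweep(z %*% R, 2, fs_hat, "+")
colnames(draws) <- names(fs_hat)
draws   # one row per plausible value of this case's factor scores
```

Each row plays the role of one imputed value for the case. Repeating this for every case yields nDraws complete data sets of factor scores.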

Each returned data.frame includes a case.idx column that indicates the corresponding rows in the data set to which the model was originally fitted (unless the user requests only Level-2 variables). This can be used to merge the plausible values with the original observed data, but users should note that including any new variables in a Step-2 model might not accurately account for their relationship(s) with factor scores because they were not accounted for in the Step-1 model from which factor scores were estimated.

If object is a multilevel lavaan model, users can request plausible values for latent variables at particular levels of analysis by setting the lavPredict argument level=1 or level=2. If the level argument is not passed via ..., then both levels are returned in a single merged data set per draw. For multilevel models, each returned data.frame also includes a column indicating to which cluster each row belongs (unless the user requests only Level-2 variables).

Value

A list of length nDraws, each element of which is a data.frame containing plausible values. The list can be treated as a list of imputed data sets to be passed to runMI (see Examples). If object is of class lavaan.mi, the list will be of length nDraws*m, where m is the number of imputations.
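
To show why a list of plausible-value data sets is analyzed like multiple imputations, the following base-R sketch pools a Step-2 slope across data sets using Rubin's rules. The data are simulated stand-ins (pv_list is not output from plausibleValues) and lm() replaces a lavaan model to keep the sketch self-contained. In practice runMI handles the pooling for lavaan models.

```r
set.seed(12345)
## Toy stand-in for a list of plausible-value data sets (assumed data):
## y is observed, x differs across draws like a plausible value would
n <- 100
x_true <- rnorm(n)
y <- 0.5 * x_true + rnorm(n)
pv_list <- lapply(1:20, function(i) {
  data.frame(y = y, x = x_true + rnorm(n, sd = 0.3))
})

## Fit the Step-2 model to each data set; keep the slope and its variance
fits <- lapply(pv_list, function(d) lm(y ~ x, data = d))
est <- sapply(fits, function(f) coef(f)[["x"]])
v   <- sapply(fits, function(f) vcov(f)["x", "x"])

## Rubin's rules: total variance = within + (1 + 1/m) * between
m <- length(est)
pooled_est <- mean(est)
within  <- mean(v)    # average sampling variance within data sets
between <- var(est)   # variance of the estimates across data sets
total_var <- within + (1 + 1/m) * between
c(estimate = pooled_est, se = sqrt(total_var))
```

The between-draw component is what a single set of factor scores would omit, which is why standard errors from one Step-2 analysis would be too small.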

Author(s)

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

References

Asparouhov, T. & Muthen, B. O. (2010). Plausible values for latent variables using Mplus. Technical Report. Retrieved from www.statmodel.com/download/Plausible.pdf

See Also

runMI, lavaan.mi

Examples


## example from ?cfa and ?lavPredict help pages
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

fit1 <- cfa(HS.model, data = HolzingerSwineford1939)
fs1 <- plausibleValues(fit1, nDraws = 3,
                       ## lavPredict() can add only the modeled data
                       append.data = TRUE)
lapply(fs1, head)

## To merge factor scores to original data.frame (not just modeled data)
fs1 <- plausibleValues(fit1, nDraws = 3)
idx <- lavInspect(fit1, "case.idx")      # row index for each case
if (is.list(idx)) idx <- do.call(c, idx) # for multigroup models
data(HolzingerSwineford1939)             # copy data to workspace
HolzingerSwineford1939$case.idx <- idx   # add row index as variable
## loop over draws to merge original data with factor scores
for (i in seq_along(fs1)) {
  fs1[[i]] <- merge(fs1[[i]], HolzingerSwineford1939, by = "case.idx")
}
lapply(fs1, head)


## multiple-group analysis, in 2 steps
step1 <- cfa(HS.model, data = HolzingerSwineford1939, group = "school",
            group.equal = c("loadings","intercepts"))
PV.list <- plausibleValues(step1)

## subsequent path analysis
path.model <- ' visual ~ c(t1, t2)*textual + c(s1, s2)*speed '
## Not run: 
step2 <- sem.mi(path.model, data = PV.list, group = "school")
## test equivalence of both slopes across groups
lavTestWald.mi(step2, constraints = 't1 == t2 ; s1 == s2')

## End(Not run)


## multilevel example from ?Demo.twolevel help page
model <- '
  level: 1
    fw =~ y1 + y2 + y3
    fw ~ x1 + x2 + x3
  level: 2
    fb =~ y1 + y2 + y3
    fb ~ w1 + w2
'
msem <- sem(model, data = Demo.twolevel, cluster = "cluster")
mlPVs <- plausibleValues(msem, nDraws = 3) # both levels by default
lapply(mlPVs, head, n = 10)
## only Level 1
mlPV1 <- plausibleValues(msem, nDraws = 3, level = 1)
lapply(mlPV1, head)
## only Level 2
mlPV2 <- plausibleValues(msem, nDraws = 3, level = 2)
lapply(mlPV2, head)



## example with 10 multiple imputations of missing data:

## Not run: 
HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
                                      "ageyr","agemo","school")]
set.seed(12345)
HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
age <- HSMiss$ageyr + HSMiss$agemo/12
HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)
## impute data
library(Amelia)
set.seed(12345)
HS.amelia <- amelia(HSMiss, m = 10, noms = "school", p2s = FALSE)
imps <- HS.amelia$imputations
## specify CFA model from lavaan's ?cfa help page
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
out2 <- cfa.mi(HS.model, data = imps)
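nPVs <- 5   # number of plausible values to draw per imputation
nImps <- 10 # number of imputations (m = 10 in the amelia() call above)
PVs <- plausibleValues(out2, nDraws = nPVs)

idx <- out2@Data@case.idx # can't use lavInspect() on lavaan.mi
## empty list to hold expanded imputations
impPVs <- list()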
for (m in 1:nImps) {
  imps[[m]]["case.idx"] <- idx
  for (i in 1:nPVs) {
    impPVs[[ nPVs*(m - 1) + i ]] <- merge(imps[[m]],
                                          PVs[[ nPVs*(m - 1) + i ]],
                                          by = "case.idx")
  }
}
lapply(impPVs, head)


## End(Not run)


[Package semTools version 0.5-6 Index]