R: Fit a lavaan Model to Multiple Imputed Data Sets

runMI {semTools}

R Documentation

Description

This function fits a lavaan model to a list of imputed data sets, and can also implement multiple imputation for a single data.frame with missing observations, using either the Amelia package or the mice package.

Usage

runMI(model, data, fun = "lavaan", ..., m, miArgs = list(),
  miPackage = "Amelia", seed = 12345)

lavaan.mi(model, data, ..., m, miArgs = list(), miPackage = "Amelia",
  seed = 12345)

cfa.mi(model, data, ..., m, miArgs = list(), miPackage = "Amelia",
  seed = 12345)

sem.mi(model, data, ..., m, miArgs = list(), miPackage = "Amelia",
  seed = 12345)

growth.mi(model, data, ..., m, miArgs = list(), miPackage = "Amelia",
  seed = 12345)

Arguments

`model`	The analysis model can be specified using lavaan `model.syntax` or a parameter table (as returned by `parTable`).
`data`	A `data.frame` with missing observations, or a `list` of imputed data sets (if data are imputed already). If `runMI` has already been called, then imputed data sets are stored in the `@DataList` slot, so `data` can also be a `lavaan.mi` object from which the same imputed data will be used for additional analyses.
`fun`	`character`. Name of a specific lavaan function used to fit `model` to `data` (i.e., `"lavaan"`, `"cfa"`, `"sem"`, or `"growth"`). Only required for `runMI`.
`...`	additional arguments to pass to `lavaan` or `lavaanList`. See also `lavOptions`. Note that `lavaanList` provides parallel computing options, as well as a `FUN` argument so the user can extract custom output after the model is fitted to each imputed data set (see Examples). TIP: If a custom `FUN` is used and `parallel = "snow"` is requested, the user-supplied function should explicitly call `library` or use `::` for any functions not part of the base distribution.
`m`	`integer`. Number of imputations to request. Ignored if `data` is already a `list` of imputed data sets or a `lavaan.mi` object.
`miArgs`	Additional arguments for the multiple-imputation function (`miPackage`). The arguments should be put in a list (see Examples). Ignored if `data` is already a `list` of imputed data sets or a `lavaan.mi` object.
`miPackage`	Package to be used for imputation. Currently these functions only support `"Amelia"` or `"mice"` for imputation. Ignored if `data` is already a `list` of imputed data sets or a `lavaan.mi` object.
`seed`	`integer`. Random number seed to be set before imputing the data. Ignored if `data` is already a `list` of imputed data sets or a `lavaan.mi` object.
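
As a minimal sketch of how `m`, `miArgs`, `miPackage`, and `seed` combine when imputation is delegated to mice rather than the default Amelia. The way missing values are imposed here, the object names `HSMiss` and `fit.mice`, and the values `m = 5` and `maxit = 10` are illustrative assumptions, not defaults or recommendations of this function:

```r
library(semTools)  # attaches lavaan, which supplies HolzingerSwineford1939

## illustrative data: impose missing values on two indicators
HSMiss <- HolzingerSwineford1939[ , paste0("x", 1:9)]
set.seed(12345)
HSMiss$x5[sample(nrow(HSMiss), size = 30)] <- NA
HSMiss$x9[sample(nrow(HSMiss), size = 30)] <- NA

HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

## impute with mice instead of Amelia; the elements of miArgs are
## passed on to the imputation function (here, maxit is a mice() argument)
fit.mice <- cfa.mi(HS.model, data = HSMiss, m = 5, seed = 12345,
                   miPackage = "mice", miArgs = list(maxit = 10))

## number of imputed data sets stored in the fitted object
length(fit.mice@DataList)
```

The mice package must be installed for this to run. When `data` is already a list of imputed data sets, all four of these arguments are ignored.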

Value

A `lavaan.mi` object, which stores the imputed data sets in its `@DataList` slot.
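
Because the returned object keeps the imputed data, it can be passed back as `data` to fit another model to the same imputations. The following is a minimal sketch under assumed inputs. The two-factor models, the object names `fit1` and `fit2`, and `m = 5` are illustrative only:

```r
library(semTools)  # attaches lavaan, which supplies HolzingerSwineford1939

## illustrative data: impose missing values on one indicator
HSMiss <- HolzingerSwineford1939[ , paste0("x", 1:6)]
set.seed(12345)
HSMiss$x5[sample(nrow(HSMiss), size = 30)] <- NA

## first call imputes the data (Amelia by default) and fits the model
fit1 <- cfa.mi(' visual  =~ x1 + x2 + x3
                 textual =~ x4 + x5 + x6 ',
               data = HSMiss, m = 5, seed = 12345)
class(fit1)

## second call reuses the imputations stored in fit1, so no new
## imputation is run; here the factor covariance is fixed to zero
fit2 <- cfa.mi(' visual  =~ x1 + x2 + x3
                 textual =~ x4 + x5 + x6
                 visual ~~ 0*textual ',
               data = fit1)

## both fits are based on the same number of imputed data sets
length(fit1@DataList)
length(fit2@DataList)
```

Reusing the stored imputations keeps nested models comparable, since each is fitted to identical data sets.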

Author(s)

Terrence D. Jorgensen (University of Amsterdam; TJorgensen314@gmail.com)

References

Enders, C. K. (2010). Applied missing data analysis. New York, NY: Guilford.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.

Examples

## Not run:
## impose missing data for example
HSMiss <- HolzingerSwineford1939[ , c(paste("x", 1:9, sep = ""),
                                      "ageyr","agemo","school")]
set.seed(12345)
HSMiss$x5 <- ifelse(HSMiss$x5 <= quantile(HSMiss$x5, .3), NA, HSMiss$x5)
age <- HSMiss$ageyr + HSMiss$agemo/12
HSMiss$x9 <- ifelse(age <= quantile(age, .3), NA, HSMiss$x9)

## specify CFA model from lavaan's ?cfa help page
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

## impute data within runMI...
out1 <- cfa.mi(HS.model, data = HSMiss, m = 20, seed = 12345,
               miArgs = list(noms = "school"))

## ... or impute missing data first
library(Amelia)
set.seed(12345)
HS.amelia <- amelia(HSMiss, m = 20, noms = "school", p2s = FALSE)
imps <- HS.amelia$imputations
out2 <- cfa.mi(HS.model, data = imps)

## same results (using the same seed results in the same imputations)
cbind(impute.within = coef(out1), impute.first = coef(out2))

summary(out1, fit.measures = TRUE)
summary(out1, ci = FALSE, fmi = TRUE, output = "data.frame")
summary(out1, ci = FALSE, stand = TRUE, rsq = TRUE)

## model fit. D3 includes information criteria
anova(out1)
## equivalently:
lavTestLRT.mi(out1)
## request D2
anova(out1, test = "D2")
## request fit indices
fitMeasures(out1)


## fit multigroup model without invariance constraints
mgfit.config <- cfa.mi(HS.model, data = imps, estimator = "mlm",
                       group = "school")
## add invariance constraints, and use previous fit as "data"
mgfit.metric <- cfa.mi(HS.model, data = mgfit.config, estimator = "mlm",
                       group = "school", group.equal = "loadings")
mgfit.scalar <- cfa.mi(HS.model, data = mgfit.config, estimator = "mlm",
                       group = "school",
                       group.equal = c("loadings","intercepts"))

## compare fit of 2 models to test metric invariance
## (scaled likelihood ratio test)
lavTestLRT.mi(mgfit.metric, h1 = mgfit.config)
## To compare more than 2 models at once, use anova()
anova(mgfit.config, mgfit.metric, mgfit.scalar)
## or compareFit(), which also includes fit indices for comparison
## (optional: name the models)
compareFit(config = mgfit.config, metric = mgfit.metric,
           scalar = mgfit.scalar,
           argsLRT = list(test = "D2", method = "satorra.bentler.2010"))

## correlation residuals to investigate local misfit
resid(mgfit.scalar, type = "cor.bentler")
## modification indices for fixed parameters, to investigate local misfit
modindices.mi(mgfit.scalar)
## or lavTestScore.mi for modification indices about equality constraints
lavTestScore.mi(mgfit.scalar)

## Wald test of whether latent means are equal across groups
## (constrains the 3 latent means in group 2 to zero)
eq.means <- ' .p70. == 0
              .p71. == 0
              .p72. == 0 '
lavTestWald.mi(mgfit.scalar, constraints = eq.means)



## ordered-categorical data
data(datCat)
lapply(datCat, class) # indicators already stored as ordinal
## impose missing values
set.seed(123)
for (i in 1:8) datCat[sample(1:nrow(datCat), size = .1*nrow(datCat)), i] <- NA

## impute ordinal missing data using mice package
library(mice)
set.seed(456)
miceImps <- mice(datCat)
## save imputations in a list of data.frames
impList <- list()
for (i in 1:miceImps$m) impList[[i]] <- complete(miceImps, action = i)

## fit model, save zero-cell tables and obsolete "WRMR" fit indices
catout <- cfa.mi(' f =~ 1*u1 + 1*u2 + 1*u3 + 1*u4 ', data = impList,
                 FUN = function(fit) {
                   list(wrmr = lavaan::fitMeasures(fit, "wrmr"),
                        zeroCells = lavaan::lavInspect(fit, "zero.cell.tables"))
                 })
summary(catout)
lavTestLRT.mi(catout, test = "D2", pool.robust = TRUE)
fitMeasures(catout, fit.measures = c("rmsea","srmr","cfi"),
            test = "D2", pool.robust = TRUE)

## extract custom output
sapply(catout@funList, function(x) x$wrmr) # WRMR for each imputation
catout@funList[[1]]$zeroCells # zero-cell tables for first imputation
catout@funList[[2]]$zeroCells # zero-cell tables for second imputation ...


## End(Not run)

[Package semTools version 0.5-6 Index]