R: Multiple Imputation for Missing Data in Meta-Analysis

metami {metavcov}

R Documentation

Multiple Imputation for Missing Data in Meta-Analysis

Description

Multiple imputation (MI) accounts for the uncertainty about the missing data by generating several plausible imputed data sets and appropriately combining the results obtained from each of them. Let \hat{\theta}_{*m} be the estimated coefficient from the mth imputed data set for one of the p dimensions in the multivariate outcome, where m=1,\dots,M. The MI coefficient \bar{\theta} is simply the arithmetic mean of the individual coefficients estimated from each of the M meta-analyses. We have

\bar{\theta}=\frac{\sum_{m=1}^{M}\hat{\theta}_{*m}}{M}.

Estimation of the standard error for each variable is a little more complicated. Let V_W be the within-imputation variance, which is the average of the variances of the estimated coefficient across the imputed data sets:

V_W=\frac{\sum_{m=1}^{M}V ({\hat{\theta}_{*m}})}{M},

where V ({\hat{\theta}_{*m}}) is the variance of the estimator calculated from generalized least squares methods using the imputed dataset. Let V_B be the between imputation variance, which is calculated as

V_B=\frac{\sum_{m=1}^{M}({\hat{\theta}_{*m}}-\bar{\theta})^2}{M-1}.

From V_W and V_B, the variance of the pooled coefficients is calculated as

V(\bar{\theta})=V_W+V_B+\frac{V_B}{M}.

The above variance is statistically principled since V_W reflects the sampling variance and V_B reflects the extra variance due to the missing data.
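As a minimal sketch of the pooling formulas above, the following base R code pools M estimates of a single coefficient. The function name `pool_rubin` and the input values are hypothetical and chosen only for illustration; this is not the code used internally by `metami`.

```r
# Pool M estimates of one coefficient following the formulas above.
# `est`  : the M point estimates, one per imputed data set (illustrative values)
# `vars` : the M estimated variances of those point estimates
pool_rubin <- function(est, vars) {
  M <- length(est)
  stopifnot(M >= 2, length(vars) == M)
  theta_bar <- mean(est)                      # pooled coefficient
  V_W <- mean(vars)                           # within-imputation variance
  V_B <- sum((est - theta_bar)^2) / (M - 1)   # between-imputation variance
  V_total <- V_W + V_B + V_B / M              # variance of the pooled coefficient
  c(estimate = theta_bar, within = V_W, between = V_B, se = sqrt(V_total))
}

est  <- c(0.20, 0.30, 0.25)     # hypothetical estimates from M = 3 imputations
vars <- c(0.010, 0.012, 0.011)  # hypothetical variances
res  <- pool_rubin(est, vars)
print(res)
```

With these inputs the pooled coefficient is 0.25, V_W is 0.011, V_B is 0.0025, and the pooled variance is 0.011 + 0.0025 + 0.0025/3, giving a standard error of about 0.12.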

Usage

metami(data, M = 20, vcov = "r.vcov",
        r.n.name, ef.name, x.name = NULL,
        rvcov.method = "average", rvcov.zscore = TRUE,
        type = NULL,
        d = NULL, sdt = NULL, sdc = NULL,
        nt = NULL, nc = NULL,
        st = NULL, sc = NULL,
        n_rt = NA, n_rc = NA,
        r = NULL,
        func = "mixmeta",
        formula = NULL,
        method = "fixed",
        pool.seq = NULL,
        return.mi = FALSE,
        ci.level = 0.95)

Arguments

`data`	An `N \times p` data frame that contains effect sizes and predictors for meta-regression, if any.
`M`	Number of imputed data sets.
`vcov`	Method for computing the within-study variance-covariance matrices of the effect sizes; options include `vcov = "r.vcov"` for correlation coefficients and `vcov = "mix.vcov"` for other types of effect sizes. See `r.vcov` and `mix.vcov` for details.
`r.n.name`	A string defining the column name for sample sizes in `data` when the effect sizes are correlation coefficients (`vcov = "r.vcov"`).
`ef.name`	A `p`-dimensional vector that stores the column names in `data` for the effect sizes of the `p` outcomes.
`x.name`	A vector that stores the column names in `data` for predictors for meta-regression.
`rvcov.method`	Method used for `r.vcov`; options include `"average"` and `"each"`.
`rvcov.zscore`	Whether the correlation coefficients in `data` are already transformed into Fisher's `z` scores.
`type`	A `p`-dimensional vector indicating types of effect sizes for the argument `vcov = "mix.vcov"`. `"MD"` stands for mean difference, `"SMD"` stands for standardized mean difference, `"lgOR"` stands for log odds ratio, `"lgRR"` stands for log risk ratio, and `"RD"` stands for risk difference.
`d`	A `p`-dimensional vector that stores the column names in `data` for continuous effect sizes such as MD or SMD. If outcome `j` is dichotomous, NA has to be entered for `d[j]`.
`sdt`	A `p`-dimensional vector that stores the column names in `data` for the sample standard deviations of each outcome from the treatment group. If outcome `j` is dichotomous, NA has to be entered for `sdt[j]`.
`sdc`	A vector defined in a similar way as `sdt` for the control group.
`nt`	A `p`-dimensional vector that stores the column names in `data` for sample sizes of the `p` outcomes from the treatment group.
`nc`	A vector defined in a similar way as `nt` for the control group.
`st`	A `p`-dimensional vector that stores the column names in `data` for the number of participants with event for all outcomes (dichotomous) in the treatment group. If outcome `j` is continuous, NA has to be entered for `st[j]`.
`sc`	A vector defined in a similar way as `st` for the control group.
`n_rt`	An `N`-dimensional list of `p \times p` matrices storing sample sizes in the treatment group reporting pairwise outcomes in the off-diagonal elements. See `mix.vcov` for details.
`n_rc`	A list defined in a similar way as `n_rt` for the control group.
`r`	An `N`-dimensional list of `p \times p` correlation matrices for the `p` outcomes from the `N` studies. See `mix.vcov` for details.
`func`	A string defining the function to be used for fitting the meta-analysis. Options include `func = "metafixed"` for fixed-effect meta-analysis (see `metafixed` for details); `func = "mixmeta"`, for which the `mixmeta` package must be installed beforehand; and `func = "meta"`, for which the `metaSEM` package must be installed beforehand.
`formula`	Formula passed to the `mixmeta` function from the `mixmeta` package when `func = "mixmeta"`.
`method`	Estimation method passed to the `mixmeta` function from the `mixmeta` package when `func = "mixmeta"`. The default is `"fixed"`.
`pool.seq`	A numeric vector giving the numbers of imputed data sets from which results are additionally pooled, for example `pool.seq = c(5, 10)`. By default, the results are only pooled from all `M` data sets.
`return.mi`	Should the `M` imputed data sets be returned?
`ci.level`	Confidence level for the pooled confidence intervals. The default is 0.95.

Details

For the imputation phase, this function uses the mice package, which imputes incomplete multivariate data by chained equations. The pooling phase is performed via Rubin's rules.
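The two phases can be illustrated with a simplified, univariate sketch. It assumes the `mice` package is installed and uses a hypothetical toy data set (`toy`, with effect sizes `y` and known sampling variances `v`). Each completed data set is analysed with an inverse-variance weighted fixed-effect estimate rather than the multivariate models fitted by `metami`, so this shows the general workflow and is not the function's internal code.

```r
# Illustrative workflow: impute with mice, analyse each data set, pool the results.
if (requireNamespace("mice", quietly = TRUE)) {
  set.seed(2024)
  # Hypothetical toy data: 30 studies with effect size y and known variance v
  toy <- data.frame(y = rnorm(30, mean = 0.3, sd = 0.2),
                    v = runif(30, min = 0.01, max = 0.05))
  toy$y[sample(30, 6)] <- NA            # make six effect sizes missing

  M <- 5
  # Imputation phase: M completed data sets by chained equations
  imp <- mice::mice(toy, m = M, printFlag = FALSE)

  # Analysis phase: fixed-effect (inverse-variance) estimate per completed data set
  fits <- sapply(seq_len(M), function(m) {
    d <- mice::complete(imp, m)
    w <- 1 / d$v
    c(est = sum(w * d$y) / sum(w), var = 1 / sum(w))
  })

  # Pooling phase: Rubin's rules
  theta_bar <- mean(fits["est", ])      # pooled coefficient
  V_W <- mean(fits["var", ])            # within-imputation variance
  V_B <- var(fits["est", ])             # between-imputation variance (divisor M - 1)
  se_pooled <- sqrt(V_W + V_B + V_B / M)
  print(c(estimate = theta_bar, se = se_pooled))
}
```

Because the pooled variance adds the between-imputation component to V_W, the pooled standard error is never smaller than the square root of V_W.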

Value

`coefficients`	A data.frame that contains the pooled results from the `M` imputed data sets.
`results.mi`	An `M`-dimensional list of results from each imputed data set.
`data.mi`	An `M`-dimensional list of imputed data sets if the argument `return.mi = TRUE`.
`result.seq`	A list of pooled results from the subsets of the `M` imputed data sets, returned if the argument `pool.seq` is specified.

Author(s)

Min Lu

References

Lu, M. (2023). Computing within-study covariances, data visualization, and missing data solutions for multivariate meta-analysis with metavcov. Frontiers in Psychology, 14:1185012.

Van Buuren, S. and Groothuis-Oudshoorn, K., 2011. mice: Multivariate imputation by chained equations in R. Journal of statistical software, 45(1), pp.1-67.

Examples


#####################################################################################
# Example: Craft2003 data
# Preparing input arguments for metami() and fixed-effect model
#####################################################################################
# prepare a dataset with missing values and input arguments for metami()
Craft2003.mnar <- Craft2003[, c(2, 4:10)]
Craft2003.mnar[sample(which(Craft2003$C4 < 0), 6), "C4"] <- NA
dat <- Craft2003.mnar
n.name <- "N"
ef.name <- c("C1", "C2", "C3", "C4", "C5", "C6")
# fixed-effect model
obj <- metami(dat, M = 2, vcov = "r.vcov",
             n.name, ef.name,
             func = "metafixed")

########################
# Plotting the result
########################
computvcov <- r.vcov(n = Craft2003$N,
                     corflat = subset(Craft2003.mnar, select = C1:C6),
                     method = "average")
plotCI(y = computvcov$ef, v = computvcov$list.vcov,
        name.y = NULL, name.study = Craft2003$ID,
        y.all = obj$coefficients[,1],
        y.all.se = obj$coefficients[,2])

########################
# Pooling from subsets
########################
# o1 <- metami(dat, M = 10, vcov = "r.vcov",
#               n.name, ef.name,
#               func = "metafixed",
#               pool.seq = c(5, 10))
# pooled results from M = 5 imputed data sets
# o1$result.seq$M5$coefficients
# pooled results from M = 10 imputed data sets
# o1$result.seq$M10$coefficients
#########################################################################################
# Running random-effects and meta-regression model using packages "mixmeta" or "metaSEM"
#########################################################################################
# Restricted maximum likelihood (REML) estimator from the mixmeta package
# library(mixmeta)
# o2 <- metami(dat, M = 10, vcov = "r.vcov",
#                     n.name, ef.name,
#                     formula = as.formula(cbind(C1, C2, C3, C4, C5, C6) ~ 1),
#                     func = "mixmeta",
#                     method = "reml")
# maximum likelihood estimators from the metaSEM package
# library(metaSEM)
# o3 <- metami(dat, M = 10, vcov = "r.vcov",
#               n.name, ef.name,
#               func = "meta")

# meta-regression
# library(metaSEM)
# o4 <- metami(dat, M = 10, vcov = "r.vcov",
#                     n.name, ef.name, x.name = "p_male",
#                     func = "meta")
# library(mixmeta)
# o5 <- metami(dat, M = 20, vcov = "r.vcov",
#               n.name, ef.name, x.name = "p_male",
#               formula = as.formula(cbind(C1, C2, C3, C4, C5, C6) ~ p_male ),
#               func = "mixmeta",
#               method = "reml")
#####################################################################################
# Example: Geeganage2010 data
# Preparing input arguments for metami() and fixed-effect model
#####################################################################################
# Geeganage2010.mnar <- Geeganage2010
# Geeganage2010.mnar$MD_SBP[sample(1:nrow(Geeganage2010),7)] <- NA
# r12 <- 0.71
# r13 <- 0.5
# r14 <- 0.25
# r23 <- 0.6
# r24 <- 0.16
# r34 <- 0.16
# r <- vecTosm(c(r12, r13, r14, r23, r24, r34))
# diag(r) <- 1
# mix.r <- lapply(1:nrow(Geeganage2010), function(i){r})
# o <- metami(data = Geeganage2010.mnar, M = 10, vcov = "mix.vcov",
#               ef.name = c("MD_SBP", "MD_DBP", "RD_DD", "lgOR_D"),
#               type = c("MD", "MD", "RD", "lgOR"),
#               d = c("MD_SBP", "MD_DBP", NA, NA),
#               sdt = c("sdt_SBP", "sdt_DBP", NA, NA),
#               sdc = c("sdc_SBP", "sdc_DBP", NA, NA),
#               nt = c("nt_SBP", "nt_DBP", "nt_DD", "nt_D"),
#               nc = c("nc_SBP", "nc_DBP", "nc_DD", "nc_D"),
#               st = c(NA, NA, "st_DD", "st_D"),
#               sc = c(NA, NA, "sc_DD", "sc_D"),
#               r = mix.r,
#               func = "metafixed")

[Package metavcov version 2.1.5 Index]