mmi {causal.decomp}    R Documentation

Multiple-Mediator-Imputation Estimation Method

Description

'mmi' is used to estimate the initial disparity, disparity reduction, and disparity remaining for causal decomposition analysis, using the multiple-mediator-imputation estimation method proposed by Park et al. (2020).

Usage

mmi(fit.r = NULL, fit.x, fit.y, treat, covariates, sims = 100, conf.level = .95,
    conditional = TRUE, cluster = NULL, long = TRUE, mc.cores = 1L, seed = NULL)

Arguments

fit.r

a fitted model object for treatment. Can be of class 'CBPS' or 'SumStat'. Default is 'NULL'. Only necessary if 'conditional' is 'FALSE'.

fit.x

a fitted model object for intermediate confounder(s). Each intermediate model can be of class 'lm', 'glm', 'multinom', or 'polr'. When multiple confounders are considered, can be of class 'list' containing multiple models.

fit.y

a fitted model object for outcome. Can be of class 'lm' or 'glm'.

treat

a character string indicating the name of the treatment variable used in the models. The treatment can be categorical with two or more categories (two- or multi-valued factor).

covariates

a vector containing the name of the covariate variable(s) used in the models. Each covariate can be categorical with two or more categories (two- or multi-valued factor) or continuous (numeric).

sims

number of Monte Carlo draws for the nonparametric bootstrap. Default is 100.

conf.level

level of the returned two-sided confidence intervals, which are estimated by the nonparametric percentile bootstrap method. Default is .95, which returns the 2.5 and 97.5 percentiles of the simulated quantities.

conditional

a logical value. If 'TRUE', the function returns estimates conditional on the covariate values at which the covariates are centered, so all covariates in the mediator and outcome models must be centered before the models are fitted. Default is 'TRUE'. If 'FALSE', 'fit.r' must be specified.

cluster

a vector of cluster indicators for the bootstrap. If provided, the cluster bootstrap is used. Default is 'NULL'.

long

a logical value. If 'TRUE', the output will contain the entire sets of estimates for all bootstrap samples. Default is 'TRUE'.

mc.cores

the number of cores to use. Default is '1L'. Must be exactly 1 on Windows.

seed

seed number for the reproducibility of results. Default is 'NULL'.

Details

This function returns the point estimates of the initial disparity, disparity reduction, and disparity remaining in causal decomposition analysis. It supports a categorical treatment and a variety of outcome and mediator types. It also returns nonparametric percentile bootstrap confidence intervals for each estimate.
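
As a rough illustration of how a nonparametric percentile bootstrap interval is formed, the sketch below uses only base R and a made-up data set. The names 'toy' and 'init.disp' are hypothetical, and the code is not the internal implementation of 'mmi'; it only shows the resample, re-estimate, and take-percentiles pattern for a simple difference in group means.

```r
# Toy illustration only: hypothetical data, not the internals of mmi().
set.seed(111)
n <- 200
toy <- data.frame(R = rep(c(0, 1), each = n / 2))
toy$Y <- 1 + 0.5 * toy$R + rnorm(n)

# Difference in mean outcome between group R = 1 and reference group R = 0
init.disp <- function(d) mean(d$Y[d$R == 1]) - mean(d$Y[d$R == 0])

sims <- 100
conf.level <- 0.95

# Resample rows with replacement and re-estimate on each bootstrap sample
boot.est <- replicate(sims, {
  idx <- sample(nrow(toy), replace = TRUE)
  init.disp(toy[idx, ])
})

# Percentile interval: lower and upper tail quantiles of the bootstrap estimates
alpha <- 1 - conf.level
ci <- quantile(boot.est, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

With 'conf.level = .95', the two probabilities passed to 'quantile' are .025 and .975, matching the 2.5 and 97.5 percentiles described for the 'conf.level' argument.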

The initial disparity represents the expected difference in an outcome between a comparison group R=j and a reference group R=i where i \neq j. That is,

\tau(i,j) \ = \ E\{Y|R=j\} - E\{Y|R=i\},

where R and Y are the group indicator and the outcome variable, respectively. The disparity reduction represents the expected change in an outcome for the group R=j after adjusting the level of mediator(s) to the level of the reference group. That is,

\delta(j) \ = \ E\{Y|R=j\} - E\{Y(G_M(i))|R=j\},

where G_M(i) is a random draw from the mediator distribution of the reference group. The disparity remaining represents the remaining disparity for the group R=j even after adjusting the level of mediators to the reference group. Formally,

\zeta(i) \ = \ E\{Y(G_M(i))|R=j\} - E\{Y|R=i\}.
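
Adding the definitions of the disparity reduction and the disparity remaining, the counterfactual term E\{Y(G_M(i))|R=j\} cancels, so the two quantities sum to the initial disparity:

\delta(j) + \zeta(i) \ = \ E\{Y|R=j\} - E\{Y|R=i\} \ = \ \tau(i,j).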

The disparity reduction and remaining can be estimated using the multiple-mediator-imputation method suggested by Park et al. (2020). See the references for more details.

If one wants to make the inference conditional on baseline covariates, set 'conditional = TRUE' and center the data before fitting the models.

As of version 0.1.0, the intermediate confounder model ('fit.x') can be of class 'lm', 'glm', 'multinom', or 'polr', corresponding respectively to linear regression models, generalized linear models, multinomial log-linear models, and ordered response models. The outcome model ('fit.y') can be of class 'lm' or 'glm'. The treatment model ('fit.r') can be of class 'CBPS' or 'SumStat', both of which use propensity score weighting. The treatment model is only necessary when 'conditional = FALSE'.

Value

result

a matrix containing the point estimates of the initial disparity, disparity remaining, and disparity reduction, and the percentile bootstrap confidence intervals for each estimate.

all.result

a matrix containing the point estimates of the initial disparity, disparity remaining, and disparity reduction for all bootstrap samples. Returned if 'long' is 'TRUE'.

Author(s)

Suyeon Kang, University of California, Riverside, skang062@ucr.edu; Soojin Park, University of California, Riverside, soojinp@ucr.edu.

References

Park, S., Lee, C., and Qin, X. (2020). "Estimation and sensitivity analysis for causal decomposition in health disparity research", Sociological Methods & Research, 00491241211067516.

Park, S., Kang, S., and Lee, C. (2021+). "Choosing an optimal method for causal decomposition analysis: A better practice for identifying contributing factors to health disparities". arXiv preprint arXiv:2109.06940.

See Also

smi

Examples

data(sdata)

#------------------------------------------------------------------------------#
# Example 1-a: Continuous Outcome
#------------------------------------------------------------------------------#
fit.m1 <- lm(M.num ~ R + C.num + C.bin, data = sdata)
fit.m2 <- glm(M.bin ~ R + C.num + C.bin, data = sdata,
          family = binomial(link = "logit"))
require(MASS)
fit.m3 <- polr(M.cat ~ R + C.num + C.bin, data = sdata)
fit.x1 <- lm(X ~ R + C.num + C.bin, data = sdata)
require(nnet)
fit.m4 <- multinom(M.cat ~ R + C.num + C.bin, data = sdata)
fit.y1 <- lm(Y.num ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata)

require(PSweight)
fit.r1 <- SumStat(R ~ C.num + C.bin, data = sdata, weight = "IPW")
require(CBPS)
fit.r2 <- CBPS(R ~ C.num + C.bin, data = sdata, method = "exact",
          standardize = TRUE)

res.1a <- mmi(fit.r = fit.r1, fit.x = fit.x1,
          fit.y = fit.y1, sims = 40, conditional = FALSE,
          covariates = c("C.num", "C.bin"), treat = "R", seed = 111)
res.1a

#------------------------------------------------------------------------------#
# Example 1-b: Binary Outcome
#------------------------------------------------------------------------------#
fit.y2 <- glm(Y.bin ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata, family = binomial(link = "logit"))

res.1b <- mmi(fit.r = fit.r1, fit.x = fit.x1,
          fit.y = fit.y2, sims = 40, conditional = FALSE,
          covariates = c("C.num", "C.bin"), treat = "R", seed = 111)
res.1b

#------------------------------------------------------------------------------#
# Example 2-a: Continuous Outcome, Conditional on Covariates
#------------------------------------------------------------------------------#
# For conditional = TRUE, need to create data with centered covariates
# copy data
sdata.c <- sdata
# center quantitative covariate(s)
sdata.c$C.num <- scale(sdata.c$C.num, center = TRUE, scale = FALSE)
# center binary (or categorical) covariate(s)
# only necessary if the desired baseline level is NOT the default baseline level.
sdata.c$C.bin <- relevel(sdata.c$C.bin, ref = "1")

# fit mediator and outcome models
fit.m1 <- lm(M.num ~ R + C.num + C.bin, data = sdata.c)
fit.m2 <- glm(M.bin ~ R + C.num + C.bin, data = sdata.c,
          family = binomial(link = "logit"))
fit.m3 <- polr(M.cat ~ R + C.num + C.bin, data = sdata.c)
fit.x2 <- lm(X ~ R + C.num + C.bin, data = sdata.c)
fit.y1 <- lm(Y.num ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata.c)

res.2a <- mmi(fit.x = fit.x2,
          fit.y = fit.y1, sims = 40, conditional = TRUE,
          covariates = c("C.num", "C.bin"), treat = "R", seed = 111)
res.2a

#------------------------------------------------------------------------------#
# Example 2-b: Binary Outcome, Conditional on Covariates
#------------------------------------------------------------------------------#
fit.y2 <- glm(Y.bin ~ R + M.num + M.bin + M.cat + X + C.num + C.bin,
          data = sdata.c, family = binomial(link = "logit"))

res.2b <- mmi(fit.x = fit.x2,
          fit.y = fit.y2, sims = 40, conditional = TRUE,
          covariates = c("C.num", "C.bin"), treat = "R", seed = 111)
res.2b

[Package causal.decomp version 0.1.0 Index]