mma {mma}R Documentation

Multiple Mediation Analysis

Description

Test for mediators and do statistical inferences on the identified mediation effects.

Usage

mma(x,y,pred,mediator=NULL, contmed=NULL,binmed=NULL,binref=NULL,catmed=NULL,
    catref=NULL,jointm=NULL,cova=NULL, refy=rep(NA,ncol(data.frame(y))),
    predref=rep(NA,ncol(data.frame(pred))),alpha=0.1,alpha2=0.1, margin=1, n=20,
    nonlinear=FALSE,df1=1,nu=0.001,D=3,distn=NULL,
    family1=as.list(rep(NA,ncol(data.frame(y)))),n2=50,w=rep(1,nrow(x)),
    testtype=1, x.new=NULL, pred.new=NULL, cova.new=NULL, type=NULL,w.new=NULL, 
    all.model=FALSE,xmod=NULL,custom.function = NULL,para=FALSE,echo=TRUE)

Arguments

x

a data frame contains the predictor, all potential mediators and covariates. All varaibles in x that are not identified as potential mediators (by mediator, contmed, binmed, catmed, or jointm) are forced in mediation analysis as covariates.

y

the vector of outcome variable.

pred

the vector/matrix of the predictor(s).

mediator

the list of mediators (column numbers in x or by variable names). The mediators to be checked can be identified by "contmed", "binmed" and "catmed", or by this argument, "mediator", where binary and categorical mediators in x are identified by factors, the reference group is the first level of the factor.

contmed

a vector of column numbers that locate the potential continuous mediators in x.

binmed

a vector of column numbers that locate the potential binary mediators in x.

binref

the defined reference groups of the binary potential mediators in binmed.

catmed

a vector of column numbers that locate the potential categorical mediators in x.

catref

the defined reference groups of the categorical potential mediators in catmed.

jointm

a list that identifies the mediators that need to be considered jointly, where the first item indicates the number of groups of mediators to be considered jointly, and each of the following items identifies the column numbers of the mediators in x for each group of joint mediators.

cova

The data frame of covariates to be used to predict the mediator(s). The covariates in cova cannot be potential mediators. If the covariates are for some specific potential mediators, cova$for.m is the vector of names of potential mediators that use the covariates. Works for continuous predictors (pred) only, also the specified covariates should have no missing if only some mediators uses the covariates. Works mainly for continuous predictors (pred), also the specified covariates should have no missing if only some mediators uses the covariates.

refy

if y is binary, the reference group of y.By default, the reference group will be the first level of as.factor(y).

predref

if predictor is binary, identify the reference group of the binary predictor. By default, the reference group will be the first level of the predictor.

alpha

the significance level at which to test if the potential mediators (identified by contmed, binmed, and catmed) can be used as a covariate or mediator in estimating y when all variables in x are included in the model. The default value is alpha=0.1

alpha2

the significant level at which to test if a potential mediator is related with the predictor. The default value is alpha2=0.1.

margin

if binpred is FALSE, define the change in predictor when calculating the mediation effects, see Yu et al. (2014).

n

the time of resampling in calculating the indirect effects, default is n=20, see Yu et al. (2014).

nonlinear

if TURE, Multiple Additive Regression Trees (MART) will be used to fit the final full model in estimating the outcome. The default value of nonlinear is FALSE, in which case, a generalized linear model will be used to fit the final full model.

df1

if nonlinear is TURE, natural cubic spline will be used to fit the relationship between the countinuous predictor and each mediator. The df is the degree of freedom in the ns() function, the default is 1.

nu

set the parameter "interaction.depth" in gbm function if MART is to be used, by default, nu=0.001. See also help(gbm.fit).

D

set the parameter "shrinkage" in gbm function if MART is to be used, by default, D=3. See also help(gbm.fit).

distn

the assumed distribution of the outcome if MART is used for final full model. The default value of distn is "bernoulli" for binary y, and "gaussian" for continuous y.

family1

define the conditional distribution of y given x, and the linkage function that links the mean of y with the system component if generalized linear model is used as the final full model. The default value of family1 is binomial(link = "logit") for binary y, gaussian(link="identity") for continuous y.

n2

the number of times of bootstrap resampling. The default value is n2=50.

w

the weight for each observation.

testtype

if the testtype is 1 (by default), covariates/mediators are identified using full model; if the testtype is 2, covariates/mediators are tested one by one in models with the predictor only.

x.new

A new set of predictor and corresponding covariates, of the same format as x (after data.org), on which to calculate the mediation effects. For continuous predictor only. If is NULL, the mediation effects will be calculated based on the original data set.

pred.new

A new set of predictor(s), of the same format as x (after data.org), on which to calculate the mediation effects. For continuous predictor only.

cova.new

a new set of covaraite(s).

type

the type of prediction when y is class Surv. Is "risk" if not specified.

w.new

the weights for new.x.

all.model

save all the fitted model from bootstrap samples if TRUE. This needs to be true to make inferences on moderation effects.

xmod

If there is a moderator, xmod gives the moderator's name in cova and/or x.

custom.function

a string of customer defined final model for predicting the outcome(s). The response variable should be noted as "responseY", the dataset should be noted as "dataset123". The weights for observations should be noted as "weights123". The covariates should be in x or pred. Use "~." for all varaibles in x and pred is allowed. The customer defined model should be able to make prediction by using "predict(object, newdata=...)", where the object is the results of the fitted model. If a specific package needs to be called to fit the model, the user should call the package first. For example, if the gamlss package is used to fit a piecewise polynomial with "age" to be the mediator and "race" be the predictor, we can set custom.function="gamlss(responseV~b(age,df=3)+race,data=dataset123,tace=FALSE)". The length of custom.function should be the same as the dimension of y. If custom.function[j] is NA, the usual method will be used to fit y[j].

para

It is for binary predictors. If it is true, we would like the x-m relationship be fitted parametrically.

echo

If echo is FALSE, there is no counting printed for the number of bootstrap iterations.

Details

mma first tests if the potential mediators defined by binm, contm, and catm should be treated as mediators or covariates (if none, the variable will be deleted from further analysis). All variables identified by jointm are treated as mediators. All other variables in x that are not tested are treated as covariates. Then mma does the mediation effects estimation and inference on the selected variables.

Value

Returns an mma object.

estimation

list the estimation of ie (indirect effect), te (total effect), and de (direct effect from the predictor) separately.

bootsresults

a list where the first item, ie, is a matrix of n2 rows where each column gives the estimated indirect effect from the corresponding mediator (identified by the column name) from the n2 bootstrap samples; the second item, te, is a vector of estimated total effects from the bootstrap sample; and the 3rd item, de, is a vector of estimated direct effect of the predictor from the bootstrap sample.

model

a list where the first item, MART, is T if mart is fitted for the final model; the second item, Survival, is T if a survival model is fitted; the third item, type, is the type of prediction when a survival model is fitted; the fourth item, model, is the fitted final full model where y is the outcome and all predictor, covariates, and mediators are the explanatory variables; and the fourth item, best.iter is the number of best iterations if MART is used to fit the final model.

data

a list that contains all the used data: x=x, y=y, dirx=dirx, binm=binm, contm=contm, catm=catm, jointm=jointm, binpred=FALSE.

boot.detail

for continuous predictor only: a list that contains the mediation effects on each row of new.pred: new.pred=new.pred, te1, de1, ie1.

all_model

a list with all fitted models from bootstrap samples if all.model is TRUE.

all_iter

if all.model is TRUE, a matrix with all fitted best iterations if MART is used. Each row is from one bootstrap sample.

all_boot

if all.model is TRUE, a matrix with all bootstrap samples. Each row is one bootstrap sample.

Author(s)

Qingzhao Yu qyu@lsuhsc.edu

References

Baron, R.M., and Kenny, D.A. (1986) <doi:10.1037/0022-3514.51.6.1173>. "The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations," J. Pers Soc Psychol, 51(6), 1173-1182.

J.H. Friedman, T. Hastie, R. Tibshirani (2000) <doi:10.1214/aos/1016120463>. "Additive Logistic Regression: a Statistical View of Boosting," Annals of Statistics 28(2):337-374.

J.H. Friedman (2001) <doi:10.1214/aos/1013203451>. "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(5):1189-1232.

Yu, Q., Fan, Y., and Wu, X. (2014) <doi:10.4172/2155-6180.1000189>. "General Multiple Mediation Analysis With an Application to Explore Racial Disparity in Breast Cancer Survival," Journal of Biometrics & Biostatistics,5(2): 189.

Yu, Q., Scribner, R.A., Leonardi, C., Zhang, L., Park, C., Chen, L., and Simonsen, N.R. (2017) <doi:10.1016/j.sste.2017.02.001>. "Exploring racial disparity in obesity: a mediation analysis considering geo-coded environmental factors," Spatial and Spatio-temporal Epidemiology, 21, 13-23.

Yu, Q., and Li, B. (2017) <doi:10.5334/hors.160>. "mma: An r package for multiple mediation analysis," Journal of Open Research Software, 5(1), 11.

Yu, Q., Wu, X., Li, B., and Scribner, R. (2018). <doi:10.1002/sim.7977>. "Multiple Mediation Analysis with Survival Outcomes – With an Application to Explore Racial Disparity in Breast Cancer Survival," Statistics in Medicine.

Yu, Q., Medeiros, KL, Wu, X., and Jensen, R. (2018). <doi:10.1007/s11336-018-9612-2>. "Explore Ethnic Disparities in Anxiety and Depression Among Cancer Survivors Using Nonlinear Mediation Analysis," Psychometrika, 83(4), 991-1006.

See Also

"data.org" is for mediator tests, and "med" , and "boot.med" for mediation analysis and inferences.

Examples

data("weight_behavior")
#binary predictor
 #binary y
 x=weight_behavior[,c(2,4:14)]
 pred=weight_behavior[,3]
 y=weight_behavior[,15]
 temp.b.b.glm<-mma(x,y,pred=pred,contmed=c(7:9,11:12),binmed=c(6,10),binref=c(1,1),
                    catmed=5,catref=1,predref="M",alpha=0.4,alpha2=0.4,n=2,n2=2)
 
 temp.b.b.mart<-mma(x,y,pred=pred,contmed=c(7:9,11:12),binmed=c(6,10),binref=c(1,1),
                    catmed=5,catref=1,predref="M",alpha=0.4,alpha2=0.4,nonlinear=TRUE,n=2,n2=5)
 #continuous y
 x=weight_behavior[,c(2,4:14)]
 pred=weight_behavior[,3]
 y=data.frame(weight_behavior[,1])
 colnames(y)="bmi"
 temp.b.c.glm<-mma(x,y,pred=pred,mediator=5:12,jointm=list(n=1,j1=7:9), 
                     predref="M",alpha=0.4,alpha2=0.4,n2=20)
 temp.b.c.mart<-mma(x,y,pred=pred,mediator=5:12,jointm=list(n=1,j1=7:9), 
                     predref="M",alpha=0.4,alpha2=0.4,
                     n=2,nonlinear=TRUE,n2=20)


##Surv class outcome (survival analysis)
data(cgd1)       #a dataset in the survival package
x=cgd1[,c(4:5,7:12)]
pred=cgd1[,6]
status<-ifelse(is.na(cgd1$etime1),0,1)
y=Surv(cgd1$futime,status)          
#for continuous predictor
temp.cox.contx<-mma(x,y,pred=pred,mediator=1:ncol(x),      
                    alpha=0.5,alpha2=0.5,n=1,n2=2,type="lp") 
summary(temp.cox.contx)
temp.surv.mart.contx<-mma(x,y,pred=pred,mediator=1:ncol(x),      
                    alpha=0.5,alpha2=0.5,nonlinear=TRUE,n2=2) 

#for binary predictor
x=cgd1[,c(5:12)]
pred=cgd1[,4]
temp.cox.binx<-mma(x,y,pred=pred,mediator=1:ncol(x),   
                    alpha=0.4,alpha2=0.4,n=1,n2=2,type="lp") 
summary(temp.cox.binx)


[Package mma version 10.7-1 Index]