R: PROCESS for mediation and/or moderation analyses.

PROCESS {bruceR}

R Documentation

PROCESS for mediation and/or moderation analyses.

Description

To perform mediation, moderation, and conditional process (moderated mediation) analyses, people may use software such as Mplus, the SPSS "PROCESS" macro, and the SPSS "MLmed" macro. Some R packages can also perform such analyses, but each covers only part of the workflow and is comparatively complex to use: R package "mediation", R package "interactions", and R package "lavaan". Other R packages, scripts, and modules have been developed to make these analyses more convenient, including the jamovi module "jAMM" (by Marcello Gallucci, based on the lavaan package), the R package "processR" (by Keon-Woong Moon, not official, also based on the lavaan package), and the R script file "process.R" (the official PROCESS R code by Andrew F. Hayes, which is not yet an R package and has some bugs and limitations).

Here, the bruceR::PROCESS() function provides an alternative way to perform mediation/moderation analyses in R. This function supports a total of 24 kinds of SPSS PROCESS models (Hayes, 2018) and also supports multilevel mediation/moderation analyses. Overall, it supports the most frequently used types of mediation, moderation, moderated moderation (3-way interaction), and moderated mediation (conditional indirect effect) analyses for (generalized) linear or linear mixed models.

Specifically, the bruceR::PROCESS() function fits regression models based on the data, variable names, and a few other arguments that users input (with no need to specify the PROCESS model number and no need to manually mean-center the variables). The function can automatically judge the model number/type and also conduct grand-mean centering before model building (using the bruceR::grand_mean_center() function).

This automatic grand-mean centering can be turned off by setting center=FALSE.

Note that this automatic grand-mean centering (1) makes the results of main effects accurate for interpretation; (2) does not change any results of model fit (it only affects the interpretation of main effects); (3) is only conducted in "PART 1" (for an accurate estimate of main effects) but not in "PART 2" because it is more intuitive and interpretable to use the raw values of variables for the simple-slope tests in "PART 2"; (4) is applied by default because mean-centering is recommended whenever a model includes an interaction (set center=FALSE to disable it); (5) does not conflict with group-mean centering because after group-mean centering the grand mean of a variable will also be 0, such that the automatic grand-mean centering (with mean = 0) will not change any values of the variable.
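The first two points can be checked with a small base-R sketch. It uses simulated data and plain lm() rather than PROCESS() itself; the variable names (x, w, y) are hypothetical and chosen only for illustration.

```r
# Simulated data (hypothetical variables; not from the bruceR examples)
set.seed(1)
n <- 200
x <- rnorm(n, mean = 5, sd = 2)
w <- rnorm(n, mean = 3, sd = 1)
y <- 1 + 0.5 * x + 0.3 * w + 0.4 * x * w + rnorm(n)

# Model with raw predictors vs. model with grand-mean-centered predictors
fit_raw <- lm(y ~ x * w)
xc <- x - mean(x)
wc <- w - mean(w)
fit_cen <- lm(y ~ xc * wc)

# Model fit and the interaction coefficient are identical in both models
r2_raw <- summary(fit_raw)$r.squared
r2_cen <- summary(fit_cen)$r.squared
b_int_raw <- unname(coef(fit_raw)["x:w"])
b_int_cen <- unname(coef(fit_cen)["xc:wc"])

# The "main effect" of x differs: it is the slope of x at w = 0 (raw model)
# versus the slope of x at the mean of w (centered model)
b_x_raw <- unname(coef(fit_raw)["x"])
b_x_cen <- unname(coef(fit_cen)["xc"])
slope_at_mean_w <- b_x_raw + b_int_raw * mean(w)
```

Here b_x_cen equals slope_at_mean_w, which is why the centered coefficient is usually the one people mean when they talk about a main effect.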

If you need group-mean centering, please do it before using PROCESS. bruceR::group_mean_center() is a useful function for group-mean centering. Remember that the automatic grand-mean centering in PROCESS never affects the values of a group-mean centered variable, which already has a grand mean of 0.
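As a rough illustration of why the two kinds of centering do not conflict, the following base-R sketch group-mean centers a simulated variable by hand (using ave() instead of bruceR::group_mean_center()) and then applies grand-mean centering on top. The data and the names school and x are hypothetical.

```r
# Hypothetical multilevel data: 5 schools of unequal size and different means
set.seed(2)
sizes <- c(10, 20, 15, 30, 25)
school <- rep(paste0("S", 1:5), times = sizes)
x <- rnorm(sum(sizes), mean = rep(c(2, 4, 6, 8, 10), times = sizes))

# Group-mean centering: subtract each school's own mean
x_gmc <- x - ave(x, school, FUN = mean)

# Grand-mean centering applied to the already group-mean-centered variable
x_both <- x_gmc - mean(x_gmc)

grand_mean_after_gmc <- mean(x_gmc)        # zero up to floating-point error
max_change <- max(abs(x_both - x_gmc))     # so the values are left unchanged
```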

The bruceR::PROCESS() function uses:

the interactions::sim_slopes() function to estimate simple slopes (and conditional direct effects) in moderation, moderated moderation, and moderated mediation models (PROCESS Models 1, 2, 3, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 58, 59, 72, 73, 75, 76).
the mediation::mediate() function to estimate (conditional) indirect effects in (moderated) mediation models (PROCESS Models 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 58, 59, 72, 73, 75, 76).
the lavaan::sem() function to perform serial multiple mediation analysis (PROCESS Model 6).
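To make concrete what a simple slope is, here is a minimal base-R sketch that computes the slope of X at Mean - SD, Mean, and Mean + SD of a continuous moderator directly from lm() coefficients. It is not the implementation used by interactions::sim_slopes(); the simulated data and the names x, w, and y are assumptions made for illustration.

```r
# Simulated moderation data (hypothetical variables)
set.seed(3)
n <- 300
x <- rnorm(n)
w <- rnorm(n, mean = 10, sd = 2)
y <- 2 + 0.2 * x + 0.1 * w + 0.3 * x * w + rnorm(n)

fit <- lm(y ~ x * w)
b <- coef(fit)
V <- vcov(fit)

# Moderator values: Mean - SD, Mean, Mean + SD
w_vals <- mean(w) + c(-1, 0, 1) * sd(w)

# Simple slope of x at a given w: b_x + b_xw * w
slope <- unname(b["x"] + b["x:w"] * w_vals)

# Standard error from Var(b_x + w * b_xw)
se <- sqrt(V["x", "x"] + w_vals^2 * V["x:w", "x:w"] +
             2 * w_vals * V["x", "x:w"])

# Ordinary t-based 95% CI (no resampling involved)
crit <- qt(0.975, df = df.residual(fit))
simple_slopes <- data.frame(
  w = w_vals, slope = slope, se = se,
  lower = slope - crit * se, upper = slope + crit * se
)
```

The confidence intervals in this sketch are standard model-based intervals, which is consistent with the note under the `ci` argument that bootstrap and Monte Carlo methods do not apply to simple slopes.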

If you use this function in your research and report its results in your paper, please cite not only bruceR but also the other R packages it uses internally (mediation, interactions, and/or lavaan).

Two parts of results are printed:

PART 1. Regression model summary (using bruceR::model_summary() to summarize the models)

PART 2. Mediation/moderation effect estimates (using one or a combination of the above packages and functions to estimate the effects)

To organize the PART 2 output, the results of Simple Slopes are titled in green, whereas the results of Indirect Path are titled in blue.

Disclaimer: Although this function is named after PROCESS, Andrew F. Hayes has no role in its design, and its development is independent from the official SPSS PROCESS macro and "process.R" script. Any error or limitation should be attributed to the three R packages/functions that bruceR::PROCESS() uses internally. Moreover, as mediation analyses include random processes (i.e., bootstrap resampling or Monte Carlo simulation), the results of mediation analyses are unlikely to be exactly the same across different software (even if you set the same random seed in different software).
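The role of the random seed can be illustrated with a hand-written percentile bootstrap of an indirect effect (a*b). This is only a sketch under simple assumptions (two OLS regressions on simulated data) and is not how mediation::mediate() is implemented; indirect() and boot_ci() are hypothetical helper names.

```r
# Hypothetical simulated data for a simple mediation X -> M -> Y
set.seed(4)
n <- 200
dat <- data.frame(x = rnorm(n))
dat$m <- 0.5 * dat$x + rnorm(n)
dat$y <- 0.4 * dat$m + 0.2 * dat$x + rnorm(n)

# Indirect effect a*b from two OLS regressions
indirect <- function(d) {
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ m + x, data = d))[["m"]]
  a * b
}

# Percentile bootstrap: resample rows, re-estimate a*b, take quantiles
boot_ci <- function(d, nsim = 1000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  est <- replicate(nsim, indirect(d[sample(nrow(d), replace = TRUE), ]))
  quantile(est, c(0.025, 0.975))
}

ci_1 <- boot_ci(dat, nsim = 500, seed = 1)
ci_2 <- boot_ci(dat, nsim = 500, seed = 1)  # same seed: identical interval
ci_3 <- boot_ci(dat, nsim = 500, seed = 2)  # different seed: interval shifts
```

Within one implementation, the same seed reproduces the interval exactly. A different program draws its resamples in a different way, so matching seeds across software does not give matching results.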

Usage

PROCESS(
  data,
  y = "",
  x = "",
  meds = c(),
  mods = c(),
  covs = c(),
  clusters = c(),
  hlm.re.m = "",
  hlm.re.y = "",
  hlm.type = c("1-1-1", "2-1-1", "2-2-1"),
  med.type = c("parallel", "serial"),
  mod.type = c("2-way", "3-way"),
  mod.path = c("x-y", "x-m", "m-y", "all"),
  cov.path = c("y", "m", "both"),
  mod1.val = NULL,
  mod2.val = NULL,
  ci = c("boot", "bc.boot", "bca.boot", "mcmc"),
  nsim = 100,
  seed = NULL,
  center = TRUE,
  std = FALSE,
  digits = 3,
  file = NULL
)

Arguments

`data`	Data frame.
`y`, `x`	Variable name of outcome (Y) and predictor (X). It supports both continuous (numeric) and dichotomous (factor) variables.
`meds`	Variable name(s) of mediator(s) (M). Use `c()` to combine multiple mediators. It supports both continuous (numeric) and dichotomous (factor) variables. It allows any number of mediators in parallel or 2~4 mediators in serial. * Order matters when `med.type="serial"` (PROCESS Model 6: serial mediation).
`mods`	Variable name(s) of 0~2 moderator(s) (W). Use `c()` to combine multiple moderators. It supports all types of variables: continuous (numeric), dichotomous (factor), and multicategorical (factor). * Order matters when `mod.type="3-way"` (PROCESS Models 3, 5.3, 11, 12, 18, 19, 72, and 73). ** Do not set this argument when `med.type="serial"` (PROCESS Model 6).
`covs`	Variable name(s) of covariate(s) (i.e., control variables). Use `c()` to combine multiple covariates. It supports all types of variables and any number of them.
`clusters`	HLM (multilevel) cluster(s): e.g., `"School"`, `c("Prov", "City")`, `c("Sub", "Item")`.
`hlm.re.m`, `hlm.re.y`	HLM (multilevel) random effect term of M model and Y model. By default, it converts `clusters` to `lme4` syntax of random intercepts: e.g., `"(1 | School)"` or `"(1 | Sub) + (1 | Item)"`. You may specify these arguments to include more complex terms: e.g., random slopes `"(X | School)"`, or 3-level random effects `"(1 | Prov/City)"`.
`hlm.type`	HLM (multilevel) mediation type (levels of "X-M-Y"): `"1-1-1"` (default), `"2-1-1"` (indeed the same as `"1-1-1"` in a mixed model), or `"2-2-1"` (currently not fully supported, as limited by the `mediation` package). In most cases, no need to set this argument.
`med.type`	Type of mediator: `"parallel"` (default) or `"serial"` (only relevant to PROCESS Model 6). Partial matches of `"p"` or `"s"` also work. In most cases, no need to set this argument.
`mod.type`	Type of moderator: `"2-way"` (default) or `"3-way"` (relevant to PROCESS Models 3, 5.3, 11, 12, 18, 19, 72, and 73). Partial matches of `"2"` or `"3"` also work.
`mod.path`	Which path(s) do the moderator(s) influence? `"x-y"`, `"x-m"`, `"m-y"`, or any combination of them (use `c()` to combine), or `"all"` (i.e., all of them). No default value.
`cov.path`	Which path(s) do the control variable(s) influence? `"y"`, `"m"`, or `"both"` (default).
`mod1.val`, `mod2.val`	By default (`NULL`), it uses Mean +/- SD of a continuous moderator (numeric) or all levels of a dichotomous/multicategorical moderator (factor) to perform simple slope analyses and/or conditional mediation analyses. You may manually specify a vector of certain values: e.g., `mod1.val=c(1, 3, 5)` or `mod1.val=c("A", "B", "C")`.
`ci`	Method for estimating the standard error (SE) and 95% confidence interval (CI) of indirect effect(s). Defaults to `"boot"` for (generalized) linear models or `"mcmc"` for (generalized) linear mixed models (i.e., multilevel models). Options: `"boot"` = Percentile Bootstrap; `"bc.boot"` = Bias-Corrected Percentile Bootstrap; `"bca.boot"` = Bias-Corrected and Accelerated (BCa) Percentile Bootstrap; `"mcmc"` = Markov Chain Monte Carlo (Quasi-Bayesian). * Note that these methods never apply to the estimates of simple slopes. You should not report the 95% CIs of simple slopes as Bootstrap or Monte Carlo CIs, because they are just standard CIs without any resampling method.
`nsim`	Number of simulation samples (bootstrap resampling or Monte Carlo simulation) for estimating SE and 95% CI. Defaults to `100` for running examples faster. In formal analyses, however, `nsim=1000` (or larger) is strongly suggested!
`seed`	Random seed for obtaining reproducible results. Defaults to `NULL`. You may set it to any number you prefer (e.g., `seed=1234`; the value itself is arbitrary). * Note that all mediation models include random processes (i.e., bootstrap resampling or Monte Carlo simulation). To get exactly the same results between runs, you need to set a random seed. However, even if you set the same seed number, you are unlikely to get exactly the same results across different R packages (e.g., `lavaan` vs. `mediation`) and software (e.g., SPSS, Mplus, R, jamovi).
`center`	Centering numeric (continuous) predictors? Defaults to `TRUE` (suggested).
`std`	Standardizing variables to get standardized coefficients? Defaults to `FALSE`. If `TRUE`, it will standardize all numeric (continuous) variables before building regression models. However, it is not suggested to set `std=TRUE` for generalized linear (mixed) models.
`digits`	Number of decimal places of output. Defaults to `3`.
`file`	File name of MS Word (`.doc`). Currently, only regression model summary can be saved.
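The default conversion of `clusters` into random-intercept terms, as described for `hlm.re.m` and `hlm.re.y`, can be sketched as follows. The helper random_intercepts() is hypothetical (it is not a bruceR function) and only mirrors the documented examples; the formula at the end uses variable names from the Examples section purely for illustration.

```r
# Hypothetical helper (not part of bruceR): build lme4-style
# random-intercept terms from a vector of cluster variable names
random_intercepts <- function(clusters) {
  paste0("(1 | ", clusters, ")", collapse = " + ")
}

re_school  <- random_intercepts("School")          # one cluster variable
re_crossed <- random_intercepts(c("Sub", "Item"))  # two cluster variables

# Such a term is appended to the fixed-effects part of a model formula
f <- as.formula(paste("score ~ late +", re_school))
```

More complex structures, such as random slopes or nested levels, cannot be derived from cluster names alone, which is why they have to be passed explicitly through `hlm.re.m` and `hlm.re.y`.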

Details

For more details and illustrations, see PROCESS-bruceR-SPSS (PDF and Markdown files).

Value

Invisibly returns a list of results:

process.id: PROCESS model number.
process.type: PROCESS model type.
model.m: "Mediator" (M) models (a list of multiple models).
model.y: "Outcome" (Y) model.
results: Effect estimates and other results (unnamed list object).

References

Hayes, A. F. (2018). Introduction to mediation, moderation, and conditional process analysis (second edition): A regression-based approach. Guilford Press.

Yzerbyt, V., Muller, D., Batailler, C., & Judd, C. M. (2018). New recommendations for testing indirect effects in mediational models: The need to report and test component paths. Journal of Personality and Social Psychology, 115(6), 929–943.

Examples

#### NOTE ####
## In the following examples, I set nsim=100 to save time.
## In formal analyses, nsim=1000 (or larger) is suggested!

#### Demo Data ####
# ?mediation::student
data = mediation::student %>%
  dplyr::select(SCH_ID, free, smorale, pared, income,
                gender, work, attachment, fight, late, score)
names(data)[2:3] = c("SCH_free", "SCH_morale")
names(data)[4:7] = c("parent_edu", "family_inc", "gender", "partjob")
data$gender01 = 1 - data$gender  # 0 = female, 1 = male
# dichotomous X: as.factor()
data$gender = factor(data$gender01, levels=0:1, labels=c("Female", "Male"))
# dichotomous Y: as.factor()
data$pass = as.factor(ifelse(data$score>=50, 1, 0))

#### Descriptive Statistics and Correlation Analyses ####
Freq(data$gender)
Freq(data$pass)
Describe(data)     # file="xxx.doc"
Corr(data[,4:11])  # file="xxx.doc"

#### PROCESS Analyses ####

## Model 1 ##
PROCESS(data, y="score", x="late", mods="gender")  # continuous Y
PROCESS(data, y="pass", x="late", mods="gender")   # dichotomous Y

# (multilevel moderation)
PROCESS(data, y="score", x="late", mods="gender",  # continuous Y (LMM)
        clusters="SCH_ID")
PROCESS(data, y="pass", x="late", mods="gender",   # dichotomous Y (GLMM)
        clusters="SCH_ID")

# (Johnson-Neyman (J-N) interval and plot)
PROCESS(data, y="score", x="gender", mods="late") -> P
P$results[[1]]$jn[[1]]       # Johnson-Neyman interval
P$results[[1]]$jn[[1]]$plot  # Johnson-Neyman plot (ggplot object)
GLM_summary(P$model.y)       # detailed results of regression

# (allows multicategorical moderator)
d = airquality
d$Month = as.factor(d$Month)  # moderator: factor with levels "5"~"9"
PROCESS(d, y="Temp", x="Solar.R", mods="Month")

## Model 2 ##
PROCESS(data, y="score", x="late",
        mods=c("gender", "family_inc"),
        mod.type="2-way")  # or omit "mod.type", default is "2-way"

## Model 3 ##
PROCESS(data, y="score", x="late",
        mods=c("gender", "family_inc"),
        mod.type="3-way")
PROCESS(data, y="pass", x="gender",
        mods=c("late", "family_inc"),
        mod1.val=c(1, 3, 5),     # moderator 1: late
        mod2.val=seq(1, 15, 2),  # moderator 2: family_inc
        mod.type="3-way")

## Model 4 ##
PROCESS(data, y="score", x="parent_edu",
        meds="family_inc", covs="gender",
        ci="boot", nsim=100, seed=1)

# (allows any number of mediators in parallel)
PROCESS(data, y="score", x="parent_edu",
        meds=c("family_inc", "late"),
        covs=c("gender", "partjob"),
        ci="boot", nsim=100, seed=1)

# (multilevel mediation)
PROCESS(data, y="score", x="SCH_free",
        meds="late", clusters="SCH_ID",
        ci="mcmc", nsim=100, seed=1)

## Model 6 ##
PROCESS(data, y="score", x="parent_edu",
        meds=c("family_inc", "late"),
        covs=c("gender", "partjob"),
        med.type="serial",
        ci="boot", nsim=100, seed=1)

## Model 8 ##
PROCESS(data, y="score", x="fight",
        meds="late",
        mods="gender",
        mod.path=c("x-m", "x-y"),
        ci="boot", nsim=100, seed=1)

## For more examples and details, see the "note" subfolder at:
## https://github.com/psychbruce/bruceR/tree/main/note

[Package bruceR version 2024.6 Index]