sim_mediation {robmed}    R Documentation

Generate data from a fitted mediation model

Description

Generate data from a fitted mediation model, using the obtained coefficient estimates as the true model coefficients for data generation.

Usage

sim_mediation(object, n, ...)

## S3 method for class 'fit_mediation'
sim_mediation(
  object,
  n = NULL,
  explanatory = c("sim", "boot"),
  errors = c("sim", "boot"),
  num_discrete = 10,
  ...
)

## S3 method for class 'test_mediation'
sim_mediation(object, n = NULL, ...)

rmediation(n, object, ...)

Arguments

object

an object inheriting from class "fit_mediation" or "test_mediation" containing results from (robust) mediation analysis.

n

an integer giving the number of observations to be generated. If NULL (the default), the number of observations is taken from the data set used in the fitted mediation model from object.

...

additional arguments to be passed down.

explanatory

a character string specifying how to generate the explanatory variables (i.e., the independent variables and additional covariates). Possible values are "sim" to draw each explanatory variable independently from a certain distribution (the default), or "boot" to bootstrap the explanatory variables from the observed data (i.e., random sampling with replacement). See ‘Details’ for more information on how the data are generated.

errors

a character string specifying how to generate the error terms in the linear models for the mediators and the dependent variable. Possible values are "sim" to draw the error terms independently from the respective fitted model distribution (the default), or "boot" to bootstrap the error terms from the observed residuals in the respective fitted model (i.e., random sampling with replacement). See ‘Details’ for more information on how the data are generated.

num_discrete

integer; if the explanatory variables are drawn from distributions (explanatory = "sim"), variables that take num_discrete or fewer values are considered discrete (the default is 10). In that case, the corresponding variables are drawn from multinomial distributions with the relative frequencies from the observed data. This is only relevant if the mediation model was fitted via regressions and ignored if the mediation model was fitted via the covariance matrix, as the latter method assumes multivariate normality.

Details

The data generating process consists of three basic steps:

  1. Generate the explanatory variables (i.e., the independent variables and additional covariates).

  2. Generate the error terms of the different regression models.

  3. Generate the mediators and the dependent variable from the respective regression models, using the coefficient estimates from the fitted mediation model as the true model coefficients.
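
The three steps above can be sketched in base R for a simple mediation model. This is only an illustration under stated assumptions, not the code used by sim_mediation(): the toy data, the object names (fit_m, fit_y, sim_data), ordinary least squares via lm(), and normal distributions are all assumptions made for the example.

```r
set.seed(1)

# Toy "observed" data (hypothetical; stands in for the data behind 'object')
n_obs <- 200
x <- rnorm(n_obs)
m <- 0.5 * x + rnorm(n_obs)
y <- 0.3 * m + 0.2 * x + rnorm(n_obs)

# Fit the two regressions of a simple mediation model (assumption: OLS)
fit_m <- lm(m ~ x)
fit_y <- lm(y ~ m + x)

n <- 100  # number of observations to generate

# Step 1: generate the explanatory variable
# (assumption: normal with the observed mean and standard deviation)
x_sim <- rnorm(n, mean = mean(x), sd = sd(x))

# Step 2: generate the error terms of the two regression models
# (assumption: normal with the estimated residual scale)
e_m <- rnorm(n, sd = sigma(fit_m))
e_y <- rnorm(n, sd = sigma(fit_y))

# Step 3: generate mediator and dependent variable, treating the
# coefficient estimates as the true model coefficients
m_sim <- coef(fit_m)[["(Intercept)"]] + coef(fit_m)[["x"]] * x_sim + e_m
y_sim <- coef(fit_y)[["(Intercept)"]] + coef(fit_y)[["m"]] * m_sim +
  coef(fit_y)[["x"]] * x_sim + e_y

sim_data <- data.frame(x = x_sim, m = m_sim, y = y_sim)
head(sim_data)
```

Note that the mediator must be generated before the dependent variable, since the simulated mediator enters the regression equation for the dependent variable.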

If explanatory = "sim", the explanatory variables are simulated as follows. For each variable, a regression on a constant term is performed, using the same estimator and assumed error distribution as in the fitted mediation model from object. Typically, the assumed error distribution is normal, but it can also be a skew-normal, t, or skew-t distribution, or a selection of the best-fitting error distribution. Using the obtained location estimate and parameter estimates of the assumed error distribution, values are drawn from this error distribution and added to the location estimate. It is important to note that all explanatory variables are simulated independently from each other, hence there are no correlations between the explanatory variables.

To generate correlated explanatory variables, it is recommended to bootstrap the explanatory variables from the observed data by setting explanatory = "boot".
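
The difference between the two options can be illustrated with a small base R sketch. It does not call sim_mediation(); the toy variables x, z, and g are hypothetical, and the use of the sample mean and standard deviation with a normal distribution is an assumption standing in for the estimator and error distribution of the fitted model. The variable g takes only three values, so it would count as discrete under the default num_discrete = 10 and is drawn from its observed relative frequencies.

```r
set.seed(2)

# Toy observed explanatory variables (hypothetical)
n_obs <- 500
x <- rnorm(n_obs)
z <- 0.8 * x + rnorm(n_obs, sd = 0.6)  # correlated with x
g <- sample(1:3, n_obs, replace = TRUE, prob = c(0.5, 0.3, 0.2))  # discrete

n <- 500

# "sim"-style: every variable is drawn on its own, so correlations are lost
x_sim <- rnorm(n, mean = mean(x), sd = sd(x))
z_sim <- rnorm(n, mean = mean(z), sd = sd(z))

# Discrete variable: draw from the observed relative frequencies
freq <- prop.table(table(g))
g_sim <- sample(as.numeric(names(freq)), n, replace = TRUE,
                prob = as.numeric(freq))

# "boot"-style: resample entire rows, so correlations are preserved
rows <- sample(n_obs, n, replace = TRUE)
x_boot <- x[rows]
z_boot <- z[rows]

# Compare the correlation in the observed, simulated, and bootstrapped data
c(observed = cor(x, z), sim = cor(x_sim, z_sim), boot = cor(x_boot, z_boot))
```

The correlation between the independently simulated variables is close to zero, whereas the bootstrapped variables retain a correlation similar to the observed one.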

If errors = "sim", the error terms of the different regression models are drawn from the assumed error distribution in the fitted mediation model from object, using the respective parameter estimates. Typically, the assumed error distribution is normal, but it can also be a skew-normal, t, or skew-t distribution, or a selection of the best-fitting error distribution.

If errors = "boot", bootstrapping the error terms from the observed residuals is done independently for the different regression models and, if also explanatory = "boot", independently from bootstrapping the explanatory variables.
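
As a rough sketch of what independent bootstrapping means here, the base R code below draws separate index vectors for the explanatory variable and for the residuals of each regression. The toy data, the object names, and the use of lm() are assumptions for illustration; this is not the internal code of sim_mediation().

```r
set.seed(3)

# Toy observed data and fitted regressions (hypothetical; assumption: OLS)
n_obs <- 200
x <- rnorm(n_obs)
m <- 0.5 * x + rnorm(n_obs)
y <- 0.3 * m + 0.2 * x + rnorm(n_obs)
fit_m <- lm(m ~ x)
fit_y <- lm(y ~ m + x)

n <- 100

# Separate index draws: one for the explanatory variable and one for the
# residuals of each regression model, so that all three are independent
idx_x <- sample(n_obs, n, replace = TRUE)
idx_m <- sample(n_obs, n, replace = TRUE)
idx_y <- sample(n_obs, n, replace = TRUE)

x_boot <- x[idx_x]
e_m <- unname(residuals(fit_m))[idx_m]
e_y <- unname(residuals(fit_y))[idx_y]

# Build mediator and dependent variable from the bootstrapped components
m_sim <- coef(fit_m)[["(Intercept)"]] + coef(fit_m)[["x"]] * x_boot + e_m
y_sim <- coef(fit_y)[["(Intercept)"]] + coef(fit_y)[["m"]] * m_sim +
  coef(fit_y)[["x"]] * x_boot + e_y

sim_data <- data.frame(x = x_boot, m = m_sim, y = y_sim)
head(sim_data)
```

Because the indices are drawn separately, a generated observation generally combines an explanatory value and residuals that come from different observed rows.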

The "boot_test_mediation" method for results of a bootstrap test always uses the regression coefficient estimates obtained on the original data for data generation, not the bootstrap estimates. Keep in mind that all bootstrap estimates are the means of the respective bootstrap replicates. If the bootstrap estimates of the regression coefficients were used to generate the data, the true values of the indirect effects for the generated data (i.e., the products of the corresponding bootstrap coefficient estimates) would not be equal to the reported bootstrap estimates of the indirect effects in object, which could lead to confusion. For the estimates on the original data, it of course holds that the estimates of indirect effects are the products of the corresponding coefficient estimates.
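
The reason for this mismatch is that the mean of a product generally differs from the product of the means. The following base R sketch illustrates this with a plain case-resampling bootstrap and ordinary least squares on toy data; the data, the number of replicates R, and the object names are assumptions for illustration, and the bootstrap procedure of test_mediation() is not reproduced here.

```r
set.seed(4)

# Toy data for a simple mediation model (hypothetical)
n_obs <- 100
x <- rnorm(n_obs)
m <- 0.5 * x + rnorm(n_obs)
y <- 0.3 * m + 0.2 * x + rnorm(n_obs)
dat <- data.frame(x, m, y)

# Bootstrap the coefficients a (x -> m) and b (m -> y) and their product
R <- 1000
boot <- t(replicate(R, {
  d <- dat[sample(n_obs, replace = TRUE), ]
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ m + x, data = d))[["m"]]
  c(a = a, b = b, ab = a * b)
}))

# Bootstrap estimate of the indirect effect: mean of the products
mean_of_products <- mean(boot[, "ab"])
# Product of the bootstrap estimates of the coefficients
product_of_means <- mean(boot[, "a"]) * mean(boot[, "b"])

# The difference is the covariance of the bootstrap replicates of a and b
# (with divisor R), which in general is not zero
mean_of_products - product_of_means
cov(boot[, "a"], boot[, "b"]) * (R - 1) / R
```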

Value

A data frame with n observations containing simulated data for the variables of the fitted mediation model.

Mediation models

The following mediation models are implemented. In the regression equations below, the i_j are intercepts and the e_j are random error terms.

Note

Function sim_mediation() takes the object containing results from mediation analysis as its first argument so that it can easily be used with the pipe operator (R's built-in |> or magrittr's %>%).

Function rmediation() is a wrapper that conforms to the naming convention for functions that generate data, as well as to the convention that such functions take the number of observations as the first argument.
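
The two calling styles might be used as follows. This sketch assumes that package robmed is installed and that R's built-in pipe is available (R 4.1.0 or later); the model is the simple mediation model from the 'Examples' section, and the object names sim_piped and sim_r are chosen for illustration.

```r
# Run only if package 'robmed' is available
if (requireNamespace("robmed", quietly = TRUE)) {
  library(robmed)
  data("BSG2014")

  # fit a simple mediation model
  fit_simple <- fit_mediation(BSG2014,
                              x = "ValueDiversity",
                              y = "TeamCommitment",
                              m = "TaskConflict")

  # object first: convenient with the pipe operator
  sim_piped <- fit_simple |> sim_mediation(n = 100)

  # number of observations first: same convention as rnorm() or runif()
  sim_r <- rmediation(100, fit_simple)

  # both calls return a data frame with 100 observations
  nrow(sim_piped)
  nrow(sim_r)
}
```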

Author(s)

Andreas Alfons

See Also

fit_mediation(), test_mediation()

Examples

data("BSG2014")

## simple mediation
# fit the mediation model
fit_simple <- fit_mediation(BSG2014,
                            x = "ValueDiversity",
                            y = "TeamCommitment",
                            m = "TaskConflict")
# simulate data from the fitted mediation model
sim_simple <- sim_mediation(fit_simple, n = 100)
head(sim_simple)

## serial multiple mediators
# fit the mediation model
fit_serial <- fit_mediation(BSG2014,
                            x = "ValueDiversity",
                            y = "TeamScore",
                            m = c("TaskConflict",
                                  "TeamCommitment"),
                            model = "serial")
# simulate data from the fitted mediation model
sim_serial <- sim_mediation(fit_serial, n = 100)
head(sim_serial)

## parallel multiple mediators and control variables
# fit the mediation model
fit_parallel <- fit_mediation(BSG2014,
                              x = "SharedLeadership",
                              y = "TeamPerformance",
                              m = c("ProceduralJustice",
                                    "InteractionalJustice"),
                              covariates = c("AgeDiversity",
                                             "GenderDiversity"),
                              model = "parallel")
# simulate data from the fitted mediation model
# (here the explanatory variables are bootstrapped
# to maintain the correlations between them)
sim_parallel <- sim_mediation(fit_parallel, n = 100,
                              explanatory = "boot")
head(sim_parallel)


[Package robmed version 1.0.2 Index]