std_selected {stdmod}    R Documentation

Standardize Variables in a Regression Model

Description

Standardize, mean-center, or scale by standard deviation selected variables in a regression model, and then refit the model.

Usage

std_selected(lm_out, to_scale = NULL, to_center = NULL, to_standardize = NULL)

std_selected_boot(
  lm_out,
  to_scale = NULL,
  to_center = NULL,
  to_standardize = NULL,
  conf = 0.95,
  nboot = 100,
  boot_args = NULL,
  save_boot_est = TRUE,
  full_output = FALSE,
  do_boot = TRUE
)

Arguments

lm_out

The output from lm().

to_scale

The terms to be rescaled by standard deviation, specified by a formula as in lm(). For example, if the terms to be scaled are x1 and x3, use ~ x1 + x3. There is no need to specify the interaction term. To scale the outcome variable, list it on the right-hand side as a predictor. Specify only the original variables. If NULL, then no terms will be rescaled by their standard deviations. Variables that are not numeric will be ignored. Default is NULL.

to_center

The terms to be mean-centered, specified by a formula as in lm(). For example, if the terms to be centered are x1 and x3, use ~ x1 + x3. There is no need to specify the interaction term. To center the outcome variable, list it on the right-hand side as a predictor. Specify only the original variables. If NULL, then no terms will be centered. Default is NULL.

to_standardize

The terms to be standardized, specified by a formula as in lm(). For example, if the terms to be standardized are x1 and x3, use ~ x1 + x3. There is no need to specify the interaction term. To standardize the outcome variable, list it on the right-hand side as a predictor. Specify only the original variables. This is a shortcut for to_center and to_scale: listing a variable in to_standardize is equivalent to listing it in both to_center and to_scale. Default is NULL.

conf

The level of confidence for the confidence interval. Default is .95.

nboot

The number of bootstrap samples. Default is 100.

boot_args

A named list of arguments to be passed to boot::boot(). Default is NULL.

save_boot_est

If TRUE, the default, the bootstrap estimates will be saved in the element boot_est of the output.

full_output

Whether the full output from boot::boot() is returned. Default is FALSE. If TRUE, the full output from boot::boot() will be saved in the element boot_out of the output.

do_boot

Whether bootstrap confidence intervals will be formed. Default is TRUE. If FALSE, all arguments related to bootstrapping will be ignored.
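
As a rough illustration of what to_scale, to_center, and to_standardize ask for, the three transformations can be written in base R for a single numeric variable. This is only a sketch of the arithmetic, not the package's internal code; the vector x and the names x_scaled, x_centered, and x_std are made up for this example.

```r
# Hypothetical numeric variable, used only for illustration
x <- c(2, 4, 4, 6, 9)

# to_scale: rescale by the standard deviation (the mean is not removed)
x_scaled <- x / sd(x)

# to_center: subtract the mean (the standard deviation is unchanged)
x_centered <- x - mean(x)

# to_standardize: center first, then rescale by the standard deviation
x_std <- (x - mean(x)) / sd(x)

# A one-sided formula such as ~ x1 + x3 simply names the variables
all.vars(~ x1 + x3)
```

Standardizing is the same as applying both of the other two steps, which is why to_standardize can serve as a shortcut.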

Details

std_selected() was originally developed to compute the standardized moderation effect and the standardized coefficients for other predictors given an lm() output (Cheung, Cheung, Lau, Hui, & Vong, 2022). It has been extended such that users can specify which variables in a regression model are to be mean-centered and/or rescaled by their standard deviations. If the model has one or more interaction terms, they will be formed after the transformation, yielding the correct standardized solution for a moderated regression model. Moreover, categorical predictors will be automatically skipped in mean-centering and rescaling.

A variable is standardized when it is mean-centered and then rescaled by its standard deviation. Therefore, to get the standardized solution of a moderated regression, users can simply instruct the function to standardize all non-categorical variables in the regression model.
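
The point about forming interaction terms after the transformation can be sketched in base R with simulated data. The following is an illustration under assumptions, not the package's implementation: the data, the helper std(), and the object names are hypothetical. It contrasts standardizing the variables before forming the product with standardizing the product term itself.

```r
# Simulated data (hypothetical), with a moderation effect of iv by mod
set.seed(1)
n <- 200
dat_sim <- data.frame(iv = rnorm(n, 10, 2), mod = rnorm(n, 5, 1))
dat_sim$dv <- 1 + 0.5 * dat_sim$iv + 0.3 * dat_sim$mod +
              0.2 * dat_sim$iv * dat_sim$mod + rnorm(n)

# Helper (assumed name): mean-center, then rescale by the standard deviation
std <- function(x) (x - mean(x)) / sd(x)

# Standardize the variables first; lm() then forms the product term
# from the standardized iv and mod
dat_z <- data.frame(dv = std(dat_sim$dv),
                    iv = std(dat_sim$iv),
                    mod = std(dat_sim$mod))
fit_z <- lm(dv ~ iv * mod, data = dat_z)

# Contrast: form the product from the raw variables and standardize it
# as if it were an ordinary predictor
dat_sim$ivmod <- dat_sim$iv * dat_sim$mod
fit_beta <- lm(scale(dv) ~ scale(iv) + scale(mod) + scale(ivmod),
               data = dat_sim)

# The two coefficients for the product term generally differ
coef(fit_z)["iv:mod"]
coef(fit_beta)["scale(ivmod)"]
```

The first fit corresponds to the approach described above, in which the product is computed from variables that have already been transformed.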

std_selected_boot() is a wrapper of std_selected(). It calls std_selected() once for each bootstrap sample, and then computes the nonparametric bootstrap percentile confidence intervals (Cheung, Cheung, Lau, Hui, & Vong, 2022).
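
The general idea of a nonparametric bootstrap percentile interval can be sketched in base R as follows. This is a minimal sketch under assumptions: it does not use boot::boot(), it is not the code used by std_selected_boot(), and the simulated data and the helper names std() and fit_std() are hypothetical.

```r
# Simulated data (hypothetical)
set.seed(87053)
n <- 200
dat_sim <- data.frame(iv = rnorm(n), mod = rnorm(n))
dat_sim$dv <- 0.4 * dat_sim$iv + 0.3 * dat_sim$mod +
              0.2 * dat_sim$iv * dat_sim$mod + rnorm(n)

# Helper (assumed name): standardize one numeric variable
std <- function(x) (x - mean(x)) / sd(x)

# Helper (assumed name): standardize all variables in a data frame,
# refit the moderated regression, and return the coefficients
fit_std <- function(d) {
  d[] <- lapply(d, std)
  coef(lm(dv ~ iv * mod, data = d))
}

# Resample rows with replacement and redo the standardization and the
# fit in each bootstrap sample; rows of boot_est are bootstrap samples
nboot <- 200
boot_est <- t(replicate(nboot,
                        fit_std(dat_sim[sample(n, replace = TRUE), ])))

# Percentile confidence limits: empirical quantiles of the estimates
conf <- 0.95
alpha <- 1 - conf
boot_ci <- t(apply(boot_est, 2, quantile,
                   probs = c(alpha / 2, 1 - alpha / 2)))
boot_ci
```

Because the standardization is repeated inside each bootstrap sample, the sampling variability of the means and standard deviations is reflected in the intervals.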

If do_boot is FALSE, then the object it returns is identical to that returned by std_selected().

This function intentionally does not have an argument for setting the seed for random number generation. Users are advised to set the seed, e.g., by calling set.seed() before calling it, to ensure reproducibility.

Value

The updated lm() output, with the class std_selected added. It will be treated as a usual lm() object by most functions. These are the major additional elements in the list:

Like std_selected(), std_selected_boot() returns the updated lm() output, with the class std_selected added. The output of std_selected_boot() contains additional elements in the list, including boot_est (if save_boot_est is TRUE) and boot_out (if full_output is TRUE).

Author(s)

Shu Fai Cheung https://orcid.org/0000-0002-9871-9448

References

Cheung, S. F., Cheung, S.-H., Lau, E. Y. Y., Hui, C. H., & Vong, W. N. (2022). Improving an old way to measure moderation effect in standardized units. Health Psychology, 41(7), 502-505. doi:10.1037/hea0001188

Examples


# Load a sample data set

dat <- test_x_1_w_1_v_1_cat1_n_500
head(dat)

# Do a moderated regression by lm
lm_raw <- lm(dv ~ iv*mod + v1 + cat1, dat)
summary(lm_raw)

# Mean center mod only
lm_cw <- std_selected(lm_raw, to_center = ~ mod)
summary(lm_cw)

# Mean center mod and iv
lm_cwx <- std_selected(lm_raw, to_center = ~ mod + iv)
summary(lm_cwx)

# Standardize both mod and iv
lm_stdwx <- std_selected(lm_raw, to_scale = ~ mod + iv,
                               to_center = ~ mod + iv)
summary(lm_stdwx)

# Standardize all variables except for categorical variables.
# Interaction terms are formed after standardization.
lm_std <- std_selected(lm_raw, to_scale = ~ .,
                               to_center = ~ .)
summary(lm_std)

# Use to_standardize as a shortcut
lm_stdwx2 <- std_selected(lm_raw, to_standardize = ~ mod + iv)
# The results are the same
coef(lm_stdwx)
coef(lm_stdwx2)
all.equal(coef(lm_stdwx), coef(lm_stdwx2))



# Standardize all variables as in std_selected above, and compute the
# nonparametric bootstrapping percentile confidence intervals.
set.seed(87053)
lm_std_boot <- std_selected_boot(lm_raw,
                                 to_scale = ~ .,
                                 to_center = ~ .,
                                 conf = .95,
                                 nboot = 100)
# In real analysis, nboot should be at least 2000.
summary(lm_std_boot)

# Use to_standardize as a shortcut
set.seed(87053)
lm_std_boot2 <- std_selected_boot(lm_raw,
                                  to_standardize = ~ .,
                                  conf = .95,
                                  nboot = 100)
# The results are the same
confint(lm_std_boot)
confint(lm_std_boot2)
all.equal(confint(lm_std_boot), confint(lm_std_boot2))



[Package stdmod version 0.2.10 Index]