std_selected {stdmod} | R Documentation |
Standardize Variables in a Regression Model
Description
Standardize, mean-center, or rescale by standard deviation selected variables in a regression model, and then refit the model.
Usage
std_selected(lm_out, to_scale = NULL, to_center = NULL, to_standardize = NULL)
std_selected_boot(
  lm_out,
  to_scale = NULL,
  to_center = NULL,
  to_standardize = NULL,
  conf = 0.95,
  nboot = 100,
  boot_args = NULL,
  save_boot_est = TRUE,
  full_output = FALSE,
  do_boot = TRUE
)
Arguments
lm_out | The output from lm().
to_scale | The terms to be rescaled by standard deviation, specified by a formula as in lm(). Default is NULL.
to_center | The terms to be mean-centered, specified by a formula as in lm(). Default is NULL.
to_standardize | The terms to be standardized, specified by a formula as in lm(). Default is NULL.
conf | The level of confidence for the confidence interval. Default is .95.
nboot | The number of bootstrap samples. Default is 100.
boot_args | A named list of arguments to be passed to boot::boot(). Default is NULL.
save_boot_est | If TRUE, the default, the bootstrap estimates will be saved in the element boot_est of the output.
full_output | Whether the full output from boot::boot() is returned. Default is FALSE.
do_boot | Whether bootstrap confidence intervals will be formed. Default is TRUE.
Details
std_selected() was originally developed to compute the standardized moderation effect and the standardized coefficients for other predictors given an lm() output (Cheung, Cheung, Lau, Hui, & Vong, 2022). It has been extended such that users can specify which variables in a regression model are to be mean-centered and/or rescaled by their standard deviations. If the model has one or more interaction terms, they will be formed after the transformation, yielding the correct standardized solution for a moderated regression model. Moreover, categorical predictors will be automatically skipped in mean-centering and rescaling.
A variable is standardized when it is mean-centered and then rescaled by its standard deviation. Therefore, if the goal is to get the standardized solution of a moderated regression, users can simply instruct the function to standardize all non-categorical variables in the regression model.
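The point about forming interaction terms after the transformation can be illustrated with a minimal base R sketch. It does not use std_selected() and is not the package's implementation; the data are simulated, and the names dat_sim, dat_std, fit_std, and fit_prod_std are hypothetical.

```r
# Simulated data; the variable names and coefficients are made up.
set.seed(1234)
n <- 200
dat_sim <- data.frame(iv = rnorm(n, 5, 2), mod = rnorm(n, 10, 3))
dat_sim$dv <- 1 + 0.5 * dat_sim$iv + 0.3 * dat_sim$mod +
  0.2 * dat_sim$iv * dat_sim$mod + rnorm(n)

# Standardize each variable first (mean-center, then divide by the SD).
dat_std <- as.data.frame(scale(dat_sim))

# The product term iv:mod is formed from the already standardized variables.
fit_std <- lm(dv ~ iv * mod, data = dat_std)

# For contrast: form the product from the raw variables, then standardize
# the product term itself along with the other variables.
dat_sim$iv_x_mod <- dat_sim$iv * dat_sim$mod
dat_prod_std <- as.data.frame(scale(dat_sim))
fit_prod_std <- lm(dv ~ iv + mod + iv_x_mod, data = dat_prod_std)

# The two interaction coefficients differ.
coef(fit_std)["iv:mod"]
coef(fit_prod_std)["iv_x_mod"]
```

In the first fit, the interaction coefficient equals the raw coefficient multiplied by sd(iv) * sd(mod) / sd(dv), which is what forming the product after standardization implies. The second fit rescales by the standard deviation of the product instead, so its coefficient is on a different metric.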
std_selected_boot() is a wrapper of std_selected(). It calls std_selected() once for each bootstrap sample, and then computes the nonparametric bootstrap percentile confidence intervals (Cheung, Cheung, Lau, Hui, & Vong, 2022).
If do_boot is FALSE, then the object it returns is identical to that returned by std_selected().
This function intentionally does not have an argument for setting the seed of the random number generator. Users are recommended to set the seed, e.g., using set.seed(), before calling it, to ensure reproducibility.
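As a rough illustration of the nonparametric percentile bootstrap described above, the following base R sketch restandardizes the variables and refits the model in each resample. It is an assumption-laden sketch, not the code used by std_selected_boot(); the data are simulated, and fit_std_coef() is a hypothetical helper.

```r
# Set the seed before resampling so that the results are reproducible.
set.seed(87053)
n <- 200
dat_sim <- data.frame(iv = rnorm(n), mod = rnorm(n))
dat_sim$dv <- 0.4 * dat_sim$iv + 0.3 * dat_sim$mod +
  0.2 * dat_sim$iv * dat_sim$mod + rnorm(n)

# Hypothetical helper: standardize all columns, then fit the model.
fit_std_coef <- function(d) {
  d_std <- as.data.frame(scale(d))
  coef(lm(dv ~ iv * mod, data = d_std))
}

# Resample rows with replacement and re-estimate in each bootstrap sample.
nboot <- 200
boot_est <- t(replicate(nboot, {
  i <- sample.int(n, replace = TRUE)
  fit_std_coef(dat_sim[i, ])
}))

# Percentile interval: for conf = .95, the 2.5th and 97.5th percentiles
# of the bootstrap estimates of each coefficient.
conf <- 0.95
probs <- c((1 - conf) / 2, 1 - (1 - conf) / 2)
boot_ci <- t(apply(boot_est, 2, quantile, probs = probs))
boot_ci
```

Standardizing inside each resample, rather than once on the full sample, lets the interval reflect the sampling variability of the means and standard deviations used in the transformation.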
Value
The updated lm() output, with the class std_selected added. It will be treated as a usual lm() object by most functions. These are the major additional elements in the list:
- scaled_terms: If not NULL, a character vector of the variables scaled.
- centered_terms: If not NULL, a character vector of the variables mean-centered.
- scaled_by: A numeric vector of the scaling factors for all the variables in the model. The value is 1 for terms not scaled.
- centered_by: A numeric vector of the numbers used for centering for all the variables in the model. The value is 0 for terms not centered.
- std_selected_call: The original call.
- lm_out_call: The call in lm_out.
Like std_selected(), std_selected_boot() returns the updated lm() output, with the class std_selected added. The output of std_selected_boot() contains these additional elements in the list:
- boot_ci: A data frame of the bootstrap confidence intervals of the regression coefficients.
- nboot: The number of bootstrap samples requested.
- conf: The level of confidence, in proportion.
- boot_est: A matrix of the bootstrap estimates of the regression coefficients. The number of rows is equal to nboot, and the number of columns is equal to the number of terms in the regression model.
- std_selected_boot_call: The call to std_selected_boot().
- boot_out: If available, the original output from boot::boot().
Functions
- std_selected(): The base function to center or scale selected variables in a regression model.
- std_selected_boot(): A wrapper of std_selected() that forms nonparametric bootstrap confidence intervals.
Author(s)
Shu Fai Cheung https://orcid.org/0000-0002-9871-9448
References
Cheung, S. F., Cheung, S.-H., Lau, E. Y. Y., Hui, C. H., & Vong, W. N. (2022). Improving an old way to measure moderation effect in standardized units. Health Psychology, 41(7), 502-505. doi:10.1037/hea0001188
Examples
# Load the package (stdmod, as named in the page header)
library(stdmod)
# Load a sample data set
dat <- test_x_1_w_1_v_1_cat1_n_500
head(dat)
# Do a moderated regression by lm
lm_raw <- lm(dv ~ iv*mod + v1 + cat1, dat)
summary(lm_raw)
# Mean center mod only
lm_cw <- std_selected(lm_raw, to_center = ~ mod)
summary(lm_cw)
# Mean center mod and iv
lm_cwx <- std_selected(lm_raw, to_center = ~ mod + iv)
summary(lm_cwx)
# Standardize both mod and iv
lm_stdwx <- std_selected(lm_raw, to_scale = ~ mod + iv,
                         to_center = ~ mod + iv)
summary(lm_stdwx)
# Standardize all variables except for categorical variables.
# Interaction terms are formed after standardization.
lm_std <- std_selected(lm_raw, to_scale = ~ .,
                       to_center = ~ .)
summary(lm_std)
# Use to_standardize as a shortcut
lm_stdwx2 <- std_selected(lm_raw, to_standardize = ~ mod + iv)
# The results are the same
coef(lm_stdwx)
coef(lm_stdwx2)
all.equal(coef(lm_stdwx), coef(lm_stdwx2))
dat <- test_x_1_w_1_v_1_cat1_n_500
head(dat)
# Do a moderated regression by lm
lm_raw <- lm(dv ~ iv*mod + v1 + cat1, dat)
summary(lm_raw)
# Standardize all variables as in std_selected above, and compute the
# nonparametric bootstrapping percentile confidence intervals.
set.seed(87053)
lm_std_boot <- std_selected_boot(lm_raw,
                                 to_scale = ~ .,
                                 to_center = ~ .,
                                 conf = .95,
                                 nboot = 100)
# In real analysis, nboot should be at least 2000.
summary(lm_std_boot)
# Use to_standardize as a shortcut
set.seed(87053)
lm_std_boot2 <- std_selected_boot(lm_raw,
                                  to_standardize = ~ .,
                                  conf = .95,
                                  nboot = 100)
# The results are the same
confint(lm_std_boot)
confint(lm_std_boot2)
all.equal(confint(lm_std_boot), confint(lm_std_boot2))