plot_model {sjPlot} | R Documentation |
Plot regression models
Description
plot_model() creates plots from regression models, either
estimates (as so-called forest or dot-whisker plots) or marginal effects.
Usage
plot_model(
model,
type = c("est", "re", "eff", "emm", "pred", "int", "std", "std2", "slope", "resid",
"diag"),
transform,
terms = NULL,
sort.est = NULL,
rm.terms = NULL,
group.terms = NULL,
order.terms = NULL,
pred.type = c("fe", "re"),
mdrt.values = c("minmax", "meansd", "zeromax", "quart", "all"),
ri.nr = NULL,
title = NULL,
axis.title = NULL,
axis.labels = NULL,
legend.title = NULL,
wrap.title = 50,
wrap.labels = 25,
axis.lim = NULL,
grid.breaks = NULL,
ci.lvl = NULL,
se = NULL,
robust = FALSE,
vcov.fun = NULL,
vcov.type = NULL,
vcov.args = NULL,
colors = "Set1",
show.intercept = FALSE,
show.values = FALSE,
show.p = TRUE,
show.data = FALSE,
show.legend = TRUE,
show.zeroinf = TRUE,
value.offset = NULL,
value.size,
jitter = NULL,
digits = 2,
dot.size = NULL,
line.size = NULL,
vline.color = NULL,
p.threshold = c(0.05, 0.01, 0.001),
p.val = NULL,
p.adjust = NULL,
grid,
case,
auto.label = TRUE,
prefix.labels = c("none", "varname", "label"),
bpe = "median",
bpe.style = "line",
bpe.color = "white",
ci.style = c("whisker", "bar"),
std.response = TRUE,
...
)
get_model_data(
model,
type = c("est", "re", "eff", "pred", "int", "std", "std2", "slope", "resid", "diag"),
transform,
terms = NULL,
sort.est = NULL,
rm.terms = NULL,
group.terms = NULL,
order.terms = NULL,
pred.type = c("fe", "re"),
ri.nr = NULL,
ci.lvl = NULL,
colors = "Set1",
grid,
case = "parsed",
digits = 2,
...
)
Arguments
model |
A regression model object. Depending on the |
type |
Type of plot. There are three groups of plot-types:
Marginal Effects (related vignette)
Model diagnostics
Note: For mixed models, the diagnostic plots, such as the check for a linear relationship or for homoscedasticity, do not take the uncertainty of random effects into account; they are based only on the fixed effects part of the model. |
transform |
A character vector, naming a function that will be applied
on estimates and confidence intervals. By default, |
terms |
Character vector with the names of those terms from
|
sort.est |
Determines in which way estimates are sorted in the plot:
|
rm.terms |
Character vector with names that indicate which terms should
be removed from the plot. Counterpart to |
group.terms |
Numeric vector with group indices, to group coefficients. Each group of coefficients gets its own color (see 'Examples'). |
order.terms |
Numeric vector, indicating in which order the coefficients should be plotted. See examples in this package-vignette. |
pred.type |
Character, only applies for Marginal Effects plots
with mixed effects models. Indicates whether predicted values should be
conditioned on random effects ( |
mdrt.values |
Indicates which values of the moderator variable should be
used when plotting interaction terms (i.e.
|
ri.nr |
Numeric vector. If |
title |
Character vector, used as plot title. By default,
|
axis.title |
Character vector of length one or two (depending on the
plot function and type), used as title(s) for the x and y axis. If not
specified, a default labelling is chosen. Note: Some plot types
may not support this argument sufficiently. In such cases, use the returned
ggplot-object and add axis titles manually with
|
axis.labels |
Character vector with labels for the model terms, used as
axis labels. By default, |
legend.title |
Character vector, used as legend title for plots that have a legend. |
wrap.title |
Numeric, determines how many chars of the plot title are displayed in one line and when a line break is inserted. |
wrap.labels |
Numeric, determines how many chars of the value, variable or axis labels are displayed in one line and when a line break is inserted. |
axis.lim |
Numeric vector of length 2, defining the range of the plot
axis. Depending on plot-type, may affect either x- or y-axis. For
Marginal Effects plots, |
grid.breaks |
Numeric value or vector; if |
ci.lvl |
Numeric, the level of the confidence intervals (error bars).
Use |
se |
Logical, if |
robust |
Deprecated. Please use |
vcov.fun |
Variance-covariance matrix used to compute uncertainty
estimates (e.g., for robust standard errors). This argument accepts a
covariance matrix, a function which returns a covariance matrix, or a
string which identifies the function to be used to compute the covariance
matrix. See |
vcov.type |
Deprecated. The |
vcov.args |
List of arguments to be passed to the function identified by
the |
colors |
May be a character vector of color values in hex-format, valid
color value names (see
|
show.intercept |
Logical, if |
show.values |
Logical, whether values should be plotted or not. |
show.p |
Logical, adds asterisks that indicate the significance level of estimates to the value labels. |
show.data |
Logical, for Marginal Effects plots, also plots the raw data points. |
show.legend |
For Marginal Effects plots, shows or hides the legend. |
show.zeroinf |
Logical, if |
value.offset |
Numeric, offset for text labels to adjust their position relative to the dots or lines. |
value.size |
Numeric, indicates the size of value labels. Can be used
for all plot types where the argument |
jitter |
Numeric, between 0 and 1. If |
digits |
Numeric, number of digits after the decimal point when rounding estimates or values. |
dot.size |
Numeric, size of the dots that indicate the point estimates. |
line.size |
Numeric, size of the lines that indicate the error bars. |
vline.color |
Color of the vertical "zero effect" line. Default color is inherited from the current theme. |
p.threshold |
Numeric vector of length 3, indicating the thresholds for
annotating p-values with asterisks. Only applies if
|
p.val |
Character specifying method to be used to calculate p-values. Defaults to "profile" for glm/polr models, otherwise "wald". |
p.adjust |
Character vector, if not |
grid |
Logical, if |
case |
Desired target case. Labels will automatically be converted into the
specified character case. See |
auto.label |
Logical, if |
prefix.labels |
Indicates whether the value labels of categorical variables
should be prefixed, e.g. with the variable name or variable label. See
argument |
bpe |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is, by default, the median
of the posterior distribution. Use |
bpe.style |
For Stan-models (fitted with the rstanarm- or
brms-package), the Bayesian point estimate is indicated as a small,
vertical line by default. Use |
bpe.color |
Character vector, indicating the color of the Bayesian
point estimate. Setting |
ci.style |
Character vector, defining whether inner and outer intervals
for Bayesian models are shown in boxplot-style ( |
std.response |
Logical, whether the response variable will also be
standardized if standardized coefficients are requested. Setting both
|
... |
Other arguments, passed down to various functions. Here is a list of supported arguments and their description in detail.
|
Details
Different Plot Types
type = "std"
Plots standardized estimates. See details below.
type = "std2"
Plots standardized estimates; however, standardization follows Gelman's (2008) suggestion, rescaling the estimates by dividing them by two standard deviations instead of just one. Resulting coefficients are then directly comparable for untransformed binary predictors.
type = "pred"
Plots estimated marginal means (or marginal effects). Simply wraps ggpredict. See also this package-vignette.
type = "eff"
Plots estimated marginal means (or marginal effects). Simply wraps ggeffect. See also this package-vignette.
type = "int"
A shortcut for marginal effects plots, where interaction terms are automatically detected and used as terms-argument. Furthermore, if the moderator variable (the second - and third - term in an interaction) is continuous, type = "int" automatically chooses useful values based on the mdrt.values-argument, which are passed to terms. Then, ggpredict is called. type = "int" plots the interaction term that appears first in the formula along the x-axis, while the second (and possibly third) variable in an interaction is used as grouping factor(s) (moderating variable). Use type = "pred" or type = "eff" and specify a certain order in the terms-argument to indicate which variable(s) should be used as moderator. See also this package-vignette.
type = "slope" and type = "resid"
Simple diagnostic-plots, where a linear model for each single predictor is plotted against the response variable, or the model's residuals. Additionally, a loess-smoothed line is added to the plot. The main purpose of these plots is to check whether the relationship between outcome (or residuals) and a predictor is roughly linear or not. Since the plots are based on a simple linear regression with only one model predictor at the moment, the slopes (i.e. coefficients) may differ from the coefficients of the complete model.
type = "diag"
For Stan-models, plots the prior versus posterior samples. For linear (mixed) models, plots for multicollinearity-check (Variance Inflation Factors), QQ-plots, checks for normal distribution of residuals and homoscedasticity (constant variance of residuals) are shown. For generalized linear mixed models, returns the QQ-plot for random effects.
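The caveat noted for type = "slope" (slopes from single-predictor fits may differ from the coefficients of the complete model) can be illustrated with a small base R sketch. It uses the built-in mtcars data and plain lm() and loess() calls as stand-ins; it is not the code plot_model() runs internally, and the choice of variables is an assumption made for illustration only.

```r
# Illustrative sketch in base R; not sjPlot's internal implementation.
data(mtcars)

# Complete model with two predictors
m_full <- lm(mpg ~ wt + hp, data = mtcars)

# Simple regression with a single predictor, as a "slope"-style check would use
m_wt <- lm(mpg ~ wt, data = mtcars)

# The coefficient for 'wt' differs between the two fits
slopes <- c(
  full   = unname(coef(m_full)["wt"]),
  single = unname(coef(m_wt)["wt"])
)
print(slopes)

# Response against predictor, with the linear fit and a loess-smoothed line
lo  <- loess(mpg ~ wt, data = mtcars)
ord <- order(mtcars$wt)
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg")
abline(m_wt)                                      # straight line: simple linear fit
lines(mtcars$wt[ord], fitted(lo)[ord], lty = 2)   # dashed line: loess smoother
```

If the dashed loess line stays close to the straight line, the relationship is roughly linear, which is the main purpose of this kind of plot.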
Standardized Estimates
Default standardization is done by completely refitting the model on the
standardized data. Hence, this approach is equal to standardizing the
variables before fitting the model, which is particularly recommended for
complex models that include interactions or transformations (e.g., polynomial
or spline terms). When type = "std2", standardization of estimates
follows Gelman's (2008) suggestion, rescaling the estimates by dividing them
by two standard deviations
instead of just one. Resulting coefficients are then directly comparable for
untransformed binary predictors.
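The two standardization approaches described above can be sketched in base R. This is a minimal illustration of the idea (standardize the data, then refit), assuming the built-in mtcars data and the hypothetical helpers z() and z2(); it is not sjPlot's own code. Leaving the response unscaled in the two-standard-deviation variant is likewise an assumption made here.

```r
# Illustrative sketch in base R; helper names z() and z2() are hypothetical.
data(mtcars)

# Unstandardized fit, for reference
m_raw <- lm(mpg ~ wt + hp, data = mtcars)

# "std"-style: z-standardize all variables (response included), then refit
z <- function(x) (x - mean(x)) / sd(x)
d_std <- data.frame(mpg = z(mtcars$mpg), wt = z(mtcars$wt), hp = z(mtcars$hp))
m_std <- lm(mpg ~ wt + hp, data = d_std)

# "std2"-style (Gelman 2008): center predictors and divide by two
# standard deviations; the response is left unscaled here (assumption)
z2 <- function(x) (x - mean(x)) / (2 * sd(x))
d_std2 <- data.frame(mpg = mtcars$mpg, wt = z2(mtcars$wt), hp = z2(mtcars$hp))
m_std2 <- lm(mpg ~ wt + hp, data = d_std2)

# Without interactions or transformations, refitting is equivalent to
# rescaling the raw coefficients after the fact
b_raw <- coef(m_raw)[c("wt", "hp")]
sds   <- c(wt = sd(mtcars$wt), hp = sd(mtcars$hp))

print(rbind(
  std_refit  = coef(m_std)[c("wt", "hp")],
  std_scaled = b_raw * sds / sd(mtcars$mpg),
  std2_refit  = coef(m_std2)[c("wt", "hp")],
  std2_scaled = b_raw * 2 * sds
))
```

For models with interactions or polynomial terms the post-hoc rescaling no longer matches the refit, which is why refitting on standardized data is the recommended route for such models.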
Value
Depending on the plot-type, plot_model() returns a ggplot-object or a list
of such objects. get_model_data returns the associated data with the
plot-object as tidy data frame, or
(depending on the plot-type) a list of such data frames.
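As a rough sketch of how these return values might be used: the ggplot-object can be extended with further ggplot2 layers, and the data frame from get_model_data() can be inspected or reused. The model on mtcars and the use of ggplot2::labs() are choices made for this illustration, and the exact columns of the returned data frame are not assumed here.

```r
# Sketch only; runs if sjPlot and ggplot2 are installed.
if (requireNamespace("sjPlot", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  m <- lm(mpg ~ wt + hp, data = mtcars)

  # Default type ("est") returns a single ggplot-object for this model
  p <- sjPlot::plot_model(m)

  # Extend the returned object like any other ggplot
  print(p + ggplot2::labs(title = "Fuel economy model"))

  # The data behind the plot, as a data frame
  d <- sjPlot::get_model_data(m, type = "est")
  str(d)
}
```

Plot types that return a list of plots (such as type = "diag") need to be handled element by element, or arranged with plot_grid() as shown in the Examples.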
References
Gelman A (2008) "Scaling regression inputs by dividing by two
standard deviations." Statistics in Medicine 27: 2865-2873.
http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf
Aiken and West (1991). Multiple Regression: Testing and Interpreting Interactions.
Examples
# prepare data
if (requireNamespace("haven")) {
library(sjmisc)
data(efc)
efc <- to_factor(efc, c161sex, e42dep, c172code)
m <- lm(neg_c_7 ~ pos_v_4 + c12hour + e42dep + c172code, data = efc)
# simple forest plot
plot_model(m)
# grouped coefficients
plot_model(m, group.terms = c(1, 2, 3, 3, 3, 4, 4))
# keep only selected terms in the model: pos_v_4, the
# levels 3 and 4 of factor e42dep and levels 2 and 3 for c172code
plot_model(m, terms = c("pos_v_4", "e42dep [3,4]", "c172code [2,3]"))
}
# multiple plots, as returned from "diagnostic"-plot type,
# can be arranged with 'plot_grid()'
## Not run:
p <- plot_model(m, type = "diag")
plot_grid(p)
## End(Not run)
# plot random effects
if (require("lme4") && require("glmmTMB")) {
m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
plot_model(m, type = "re")
# plot marginal effects
plot_model(m, type = "pred", terms = "Days")
}
# plot interactions
## Not run:
m <- glm(
tot_sc_e ~ c161sex + c172code * neg_c_7,
data = efc,
family = poisson()
)
# type = "int" automatically selects groups for continuous moderator
# variables - see argument 'mdrt.values'. The following function call is
# identical to:
# plot_model(m, type = "pred", terms = c("c172code", "neg_c_7 [7,28]"))
plot_model(m, type = "int")
# switch moderator
plot_model(m, type = "pred", terms = c("neg_c_7", "c172code"))
# same as
# ggeffects::ggpredict(m, terms = c("neg_c_7", "c172code"))
## End(Not run)
# plot Stan-model
## Not run:
if (require("rstanarm")) {
data(mtcars)
m <- stan_glm(mpg ~ wt + am + cyl + gear, data = mtcars, chains = 1)
plot_model(m, bpe.style = "dot")
}
## End(Not run)