stat_gradientinterval {ggdist}    R Documentation

Gradient + interval plot (shortcut stat)

Description

Shortcut version of stat_slabinterval() with geom_slabinterval() for creating gradient + interval plots.

Roughly equivalent to:

stat_slabinterval(
  aes(
    justification = after_stat(0.5),
    thickness = after_stat(thickness(1)),
    slab_alpha = after_stat(f)
  ),
  fill_type = "auto",
  show.legend = c(size = FALSE, slab_alpha = FALSE)
)

If your graphics device supports it, it is recommended to use this stat with fill_type = "gradient" (see the description of that parameter). On R >= 4.2, support for fill_type = "gradient" should be auto-detected based on the graphics device you are using.

Usage

stat_gradientinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  fill_type = "auto",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = TRUE,
  expand = FALSE,
  breaks = waiver(),
  align = "none",
  outline_bars = FALSE,
  point_interval = "median_qi",
  slab_type = NULL,
  limits = NULL,
  n = 501,
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE, slab_alpha = FALSE),
  inherit.aes = TRUE
)

Arguments

mapping

Set of aesthetic mappings created by aes(). If specified and inherit.aes = TRUE (the default), it is combined with the default mapping at the top level of the plot. You must supply mapping if there is no plot mapping.

data

The data to be displayed in this layer. There are three options:

If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot().

A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify() for which variables will be created.

A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data. A function can be created from a formula (e.g. ~ head(.x, 10)).

geom

Use this to override the default connection between stat_gradientinterval() and geom_slabinterval().

position

Position adjustment, either as a string, or the result of a call to a position adjustment function. Setting this equal to "dodge" (position_dodge()) or "dodgejust" (position_dodgejust()) can be useful if you have overlapping geometries.
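As a minimal sketch of the dodging described above, the following uses a hypothetical data frame (the `subgroup` column and all values are invented for illustration) in which two subgroups share each level of `group`, so that their geometries would overlap without a position adjustment.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical data: two subgroups within each group (assumed for illustration)
df <- data.frame(
  group = rep(c("a", "b"), each = 200),
  subgroup = rep(c("s1", "s2"), times = 200),
  value = rnorm(400, mean = rep(c(0, 1), times = 200))
)

# Without dodging, the two subgroups in each group would be drawn on top
# of each other; position = "dodge" places them side by side
p_dodge <- ggplot(df, aes(x = value, y = group, fill = subgroup)) +
  stat_gradientinterval(position = "dodge")
p_dodge
```

Using position = "dodgejust" in place of "dodge" is the other option named above.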

...

Other arguments passed to layer(). These are often aesthetics, used to set an aesthetic to a fixed value, like colour = "red" or linewidth = 3 (see Aesthetics, below). They may also be parameters to the paired geom/stat. When paired with the default geom, geom_slabinterval(), these include:

normalize

How to normalize heights of functions input to the thickness aesthetic. One of:

  • "all": normalize so that the maximum height across all data is 1.

  • "panels": normalize within panels so that the maximum height in each panel is 1.

  • "xy": normalize within the x/y axis opposite the orientation of this geom so that the maximum height at each value of the opposite axis is 1.

  • "groups": normalize within values of the opposite axis and within each group so that the maximum height in each group is 1.

  • "none": values are taken as is with no normalization (this should probably only be used with functions whose values are in [0,1], such as CDFs).

For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.

interval_size_domain

A length-2 numeric vector giving the minimum and maximum of the values of the size and linewidth aesthetics that will be translated into actual sizes for intervals drawn according to interval_size_range (see the documentation for that argument).

interval_size_range

A length-2 numeric vector. This geom scales the raw size aesthetic values when drawing interval and point sizes, as they tend to be too thick when using the default settings of scale_size_continuous(), which give sizes with a range of c(1, 6). The interval_size_domain value indicates the input domain of raw size values (typically this should be equal to the value of the range argument of the scale_size_continuous() function), and interval_size_range indicates the desired output range of the size values (the min and max of the actual sizes used to draw intervals). Most of the time it is not recommended to change the value of this argument, as it may result in strange scaling of legends; this argument is a holdover from earlier versions that did not have size aesthetics targeting the point and interval separately. If you want to adjust the size of the interval or points separately, you can also use the linewidth or point_size aesthetics; see sub-geometry-scales.

fatten_point

A multiplicative factor used to adjust the size of the point relative to the size of the thickest interval line. If you wish to specify point sizes directly, you can also use the point_size aesthetic and scale_point_size_continuous() or scale_point_size_discrete(); sizes specified with that aesthetic will not be adjusted using fatten_point.

arrow

grid::arrow() giving the arrow heads to use on the interval, or NULL for no arrows.

subguide

Sub-guide used to annotate the thickness scale. One of:

  • A function that takes a scale argument giving a ggplot2::Scale object and an orientation argument giving the orientation of the geometry and then returns a grid::grob that will draw the axis annotation, such as subguide_axis() (to draw a traditional axis) or subguide_none() (to draw no annotation). See subguide_axis() for a list of possibilities and examples.

  • A string giving the name of such a function when prefixed with "subguide"; e.g. "axis" or "none".

fill_type

What type of fill to use when the fill color or alpha varies within a slab. One of:

  • "segments": breaks up the slab geometry into segments for each unique combination of fill color and alpha value. This approach is supported by all graphics devices and works well for sharp cutoff values, but can give ugly results if a large number of unique fill colors are being used (as in gradients, like in stat_gradientinterval()).

  • "gradient": a grid::linearGradient() is used to create a smooth gradient fill. This works well for large numbers of unique fill colors, but requires R >= 4.1 and is not yet supported on all graphics devices. As of this writing, the png() graphics device with type = "cairo", the svg() device, the pdf() device, and the ragg::agg_png() devices are known to support this option. On R < 4.1, this option will fall back to fill_type = "segments" with a message.

  • "auto": attempts to use fill_type = "gradient" if support for it can be auto-detected. On R >= 4.2, support for gradients can be auto-detected on some graphics devices; if support is not detected, this option will fall back to fill_type = "segments" (in case of a false negative, fill_type = "gradient" can be set explicitly). On R < 4.2, support for gradients cannot be auto-detected, so this will always fall back to fill_type = "segments", in which case you can set fill_type = "gradient" explicitly if you are using a graphics device that supports gradients.
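The following sketch sets fill_type = "gradient" explicitly and writes the plot to a PDF, one of the devices listed above as supporting gradients. It assumes R >= 4.1; the data and the output file (a temporary file) are made up for illustration.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data (assumed for illustration)
df <- data.frame(
  group = rep(c("a", "b"), each = 250),
  value = rnorm(500)
)

# Request a smooth gradient fill instead of relying on auto-detection
p_gradient <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(fill_type = "gradient")

# Save using the pdf() device via ggsave(); the path is a temporary file
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, p_gradient, device = "pdf", width = 5, height = 3)
```

If the chosen device does not support gradients, fill_type = "segments" remains the portable alternative.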

p_limits

Probability limits (as a vector of size 2) used to determine the lower and upper limits of theoretical distributions (distributions from samples ignore this parameter and determine their limits based on the limits of the sample). E.g., if this is c(.001, .999), then a slab is drawn for the distribution from the quantile at p = .001 to the quantile at p = .999. If the lower (respectively upper) limit is NA, then the lower (upper) limit will be the minimum (maximum) of the distribution's support if it is finite, and 0.001 (0.999) if it is not finite. E.g., if p_limits is c(NA, NA), on a gamma distribution the effective value of p_limits would be c(0, .999) since the gamma distribution is defined on (0, Inf); whereas on a normal distribution it would be equivalent to c(.001, .999) since the normal distribution is defined on (-Inf, Inf).
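To make the rule above concrete, this sketch first works out the limits implied by p_limits = c(NA, NA) for a normal and a gamma distribution using base R quantile functions, then passes explicit probability limits to the stat. The distribution parameters are arbitrary choices for illustration, and the base R arithmetic only mirrors the rule as described here; it is not ggdist's internal code.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Normal: support is (-Inf, Inf), so both limits fall back to the
# quantiles at p = 0.001 and p = 0.999
normal_limits <- qnorm(c(0.001, 0.999), mean = 0, sd = 1)

# Gamma: the lower end of the support (0) is finite and is used directly;
# the upper end is infinite, so the quantile at p = 0.999 is used
gamma_limits <- c(0, qgamma(0.999, shape = 2, rate = 1))

normal_limits
gamma_limits

# Explicit probability limits: draw each slab between the 1% and 99% quantiles
dist_df <- data.frame(
  group = c("a", "b"),
  mean = c(5, 7),
  sd = c(1, 1.5)
)
p_limited <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval(p_limits = c(0.01, 0.99))
p_limited
```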

density

Density estimator for sample data. One of:

  • A function which takes a numeric vector and returns a list with elements x (giving grid points for the density estimator) and y (the corresponding densities). ggdist provides a family of functions following this format, including density_unbounded() and density_bounded(). This format is also compatible with stats::density().

  • A string giving the suffix of a function name that starts with "density_"; e.g. "bounded" for density_bounded(), "unbounded" for density_unbounded(), or "histogram" for density_histogram(). Defaults to "bounded", i.e. density_bounded(), which estimates the bounds from the data and then uses a bounded density estimator based on the reflection method.
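The two forms of the density argument can be sketched as follows. The data are hypothetical (exponential draws, chosen because they have a natural lower bound at 0), and the use of the bounds argument of density_bounded() with a partially applied call is an assumption about how you might want to configure the estimator, not a recommendation from this page.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical non-negative sample data (assumed for illustration)
df <- data.frame(
  group = rep(c("a", "b"), each = 250),
  value = rexp(500)
)

# Calling an estimator directly returns grid points (x) and densities (y)
d <- density_bounded(df$value, bounds = c(0, NA))
head(d$x)
head(d$y)

# String form: the suffix after "density_"
p_unbounded <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(density = "unbounded")

# Function form: a partially applied estimator with a fixed lower bound of 0
p_bounded <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(density = density_bounded(bounds = c(0, NA)))
p_bounded
```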

adjust

Passed to density: the bandwidth for the density estimator for sample data is adjusted by multiplying it by this value. See e.g. density_bounded() for more information. Default (waiver()) defers to the default of the density estimator, which is usually 1.

trim

For sample data, should the density estimate be trimmed to the range of the data? Passed on to the density estimator; see the density parameter. Default TRUE.

expand

For sample data, should the slab be expanded to the limits of the scale? Default FALSE. Can be length two to control expansion to the lower and upper limit respectively.

breaks

Determines the breakpoints defining bins. Defaults to "Scott". Similar to (but not exactly the same as) the breaks argument to graphics::hist(). One of:

  • A scalar (length-1) numeric giving the number of bins

  • A numeric vector giving the breakpoints between histogram bins

  • A function taking x and weights and returning either the number of bins or a vector of breakpoints

  • A string giving the suffix of a function that starts with "breaks_". ggdist provides weighted implementations of the "Sturges", "Scott", and "FD" break-finding algorithms from graphics::hist(), as well as breaks_fixed() for manually setting the bin width. See breaks.

For example, breaks = "Sturges" will use the breaks_Sturges() algorithm, breaks = 9 will create 9 bins, and breaks = breaks_fixed(width = 1) will set the bin width to 1.

align

Determines how to align the breakpoints defining bins. Default ("none") performs no alignment. One of:

  • A scalar (length-1) numeric giving an offset that is subtracted from the breaks. The offset must be between 0 and the bin width.

  • A function taking a sorted vector of breaks (bin edges) and returning an offset to subtract from the breaks.

  • A string giving the suffix of a function that starts with "align_" used to determine the alignment, such as align_none(), align_boundary(), or align_center().

For example, align = "none" will provide no alignment, align = align_center(at = 0) will center a bin on 0, and align = align_boundary(at = 0) will align a bin edge on 0.
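The breaks and align arguments apply when the density estimator is a histogram. A minimal sketch, assuming hypothetical sample data and an arbitrary bin width of 0.5, combines the breaks_fixed() and align_boundary() helpers mentioned above:

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data (assumed for illustration)
df <- data.frame(
  group = rep(c("a", "b"), each = 250),
  value = rnorm(500)
)

# Histogram-based slab with bins of width 0.5 and a bin edge aligned on 0
p_hist <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(
    density = "histogram",
    breaks = breaks_fixed(width = 0.5),
    align = align_boundary(at = 0)
  )
p_hist
```

Because the gradient maps density onto slab_alpha, the result is a stepped gradient with one alpha level per bin rather than a smooth fade.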

outline_bars

For sample data (if density is "histogram") and for discrete analytical distributions (whose slabs are drawn as histograms), determines if outlines in between the bars are drawn when the slab_color aesthetic is used. If FALSE (the default), the outline is drawn only along the tops of the bars; if TRUE, outlines in between bars are also drawn. See density_histogram().

point_interval

A function from the point_interval() family (e.g., median_qi, mean_qi, mode_hdi, etc), or a string giving the name of a function from that family (e.g., "median_qi", "mean_qi", "mode_hdi", etc; if a string, the caller's environment is searched for the function, followed by the ggdist environment). This function determines the point summary (typically mean, median, or mode) and interval type (quantile interval, qi; highest-density interval, hdi; or highest-density continuous interval, hdci). Output will be converted to the appropriate x- or y-based aesthetics depending on the value of orientation. See the point_interval() family of functions for more information.
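As an illustration, a function from the point_interval() family can be called directly on a numeric vector to see the summary it produces, and then named in the stat. The data here are hypothetical, and the interval widths (passed through the .width argument described elsewhere on this page) are arbitrary.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample (assumed for illustration)
x <- rnorm(1000, mean = 5)

# Mean with 50% and 95% quantile intervals: one row per interval width
summary_df <- mean_qi(x, .width = c(0.5, 0.95))
summary_df

# The same summary used inside the stat, given by name
df <- data.frame(group = "a", value = x)
p_mean <- ggplot(df, aes(x = value, y = group)) +
  stat_gradientinterval(point_interval = "mean_qi", .width = c(0.5, 0.95))
p_mean
```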

slab_type

(deprecated) The type of slab function to calculate: probability density (or mass) function ("pdf"), cumulative distribution function ("cdf"), or complementary CDF ("ccdf"). Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
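A sketch of the recommended replacement for slab_type: instead of changing what f represents, map a computed variable onto the aesthetic directly. Here the complementary CDF (1 - cdf) drives slab_alpha; the distribution parameters are made up for illustration.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical analytical distributions (assumed for illustration)
dist_df <- data.frame(
  group = c("a", "b"),
  mean = c(5, 7),
  sd = c(1, 1.5)
)

# Map the complementary CDF onto slab_alpha, overriding the default
# mapping of slab_alpha = after_stat(f)
p_ccdf <- ggplot(dist_df, aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval(aes(slab_alpha = after_stat(1 - cdf)))
p_ccdf
```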

limits

Manually-specified limits for the slab, as a vector of length two. These limits are combined with those computed based on p_limits as well as the limits defined by the scales of the plot to determine the limits used to draw the slab functions: these limits specify the maximal limits; i.e., if specified, the limits will not be wider than these (but may be narrower). Use NA to leave a limit alone; e.g. limits = c(0, NA) will ensure that the lower limit does not go below 0, but let the upper limit be determined by either p_limits or the scale settings.

n

Number of points at which to evaluate the function that defines the slab.

.width

The .width argument passed to point_interval: a vector of probabilities to use that determine the widths of the resulting intervals. If multiple probabilities are provided, multiple intervals per group are generated, each with a different probability interval (and value of the corresponding .width and level generated variables).

orientation

Whether this geom is drawn horizontally or vertically. One of:

  • NA (default): automatically detect the orientation based on how the aesthetics are assigned. Automatic detection works most of the time.

  • "horizontal" (or "y"): draw horizontally, using the y aesthetic to identify different groups. For each group, uses the x, xmin, xmax, and thickness aesthetics to draw points, intervals, and slabs.

  • "vertical" (or "x"): draw vertically, using the x aesthetic to identify different groups. For each group, uses the y, ymin, ymax, and thickness aesthetics to draw points, intervals, and slabs.

For compatibility with the base ggplot naming scheme for orientation, "x" can be used as an alias for "vertical" and "y" as an alias for "horizontal" (ggdist had an orientation parameter before base ggplot did, hence the discrepancy).
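For instance, a vertical version of a gradient + interval plot can be sketched by swapping the roles of x and y. The orientation is set explicitly here for clarity, although automatic detection would usually choose the same result; the data are hypothetical.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data (assumed for illustration)
df <- data.frame(
  group = rep(c("a", "b", "c"), each = 200),
  value = rnorm(600, mean = rep(c(5, 7, 9), each = 200))
)

# Groups on the x axis, values on the y axis
p_vertical <- ggplot(df, aes(x = group, y = value)) +
  stat_gradientinterval(orientation = "vertical")
p_vertical
```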

na.rm

If FALSE, the default, missing values are removed with a warning. If TRUE, missing values are silently removed.

show.legend

Should this layer be included in the legends? Default is c(size = FALSE, slab_alpha = FALSE), unlike most geoms, to match its common use cases. FALSE hides all legends, TRUE shows all legends, and NA shows only those that are mapped (the default for most geoms).

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders().

Details

To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.

To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection. As the Examples show, these aesthetics accept vectorized distribution objects, such as those created by distributional::dist_normal() or posterior::rvar().

Value

A ggplot2::Stat representing a gradient + interval geometry which can be added to a ggplot() object.

Computed Variables

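This stat computes variables that can be used in aesthetic specifications (aes()) via the after_stat() function or the after_stat argument of stage(). Those referenced elsewhere on this page include f, pdf, and cdf (see the slab_type argument) and .width and level (see the .width argument).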

Aesthetics

The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.

These stats support aesthetics for supplying the data to summarize, such as x and y for sample data and xdist, ydist, and dist for analytical distributions (see Details).

In addition, in their default configuration (paired with geom_slabinterval()), the underlying geom supports aesthetics in the following groups:

  • Slab-specific aesthetics

  • Interval-specific aesthetics

  • Point-specific aesthetics

  • Color aesthetics

  • Line aesthetics

  • Slab-specific color and line override aesthetics

  • Interval-specific color and line override aesthetics

  • Point-specific color and line override aesthetics

  • Deprecated aesthetics

  • Other aesthetics (these work as in standard geoms)

See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").

See Also

See geom_slabinterval() for the geom underlying this stat. See stat_slabinterval() for the stat this shortcut is based on.

Other slabinterval stats: stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(), stat_slab(), stat_spike()

Examples

library(dplyr)
library(ggplot2)
library(distributional)

theme_set(theme_ggdist())

# ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c"),
  value = rnorm(1500, mean = c(5, 7, 9), sd = c(1, 1.5, 1))
)
df %>%
  ggplot(aes(x = value, y = group)) +
  stat_gradientinterval()

# ON ANALYTICAL DISTRIBUTIONS
dist_df = data.frame(
  group = c("a", "b", "c"),
  mean =  c(  5,   7,   8),
  sd =    c(  1, 1.5,   1)
)
# Vectorized distribution types, like distributional::dist_normal()
# and posterior::rvar(), can be used with the `xdist` / `ydist` aesthetics
dist_df %>%
  ggplot(aes(y = group, xdist = dist_normal(mean, sd))) +
  stat_gradientinterval()

[Package ggdist version 3.3.2 Index]