stat_spike {ggdist} | R Documentation |
Spike plot (ggplot2 stat)
Description
Stat for drawing "spikes" (optionally with points on them) at specific points
on a distribution (numerical or determined as a function of the distribution),
intended for annotating stat_slabinterval() geometries.
Usage
stat_spike(
  mapping = NULL,
  data = NULL,
  geom = "spike",
  position = "identity",
  ...,
  at = "median",
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = TRUE,
  expand = FALSE,
  breaks = waiver(),
  align = "none",
  outline_bars = FALSE,
  slab_type = NULL,
  limits = NULL,
  n = 501,
  orientation = NA,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
Use to override the default connection between |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
at |
The points at which to evaluate the PDF and CDF of the distribution. One of:
The values of |
p_limits |
Probability limits (as a vector of size 2) used to determine the lower and upper
limits of theoretical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample). E.g., if this is |
density |
Density estimator for sample data. One of:
|
adjust |
Passed to |
trim |
For sample data, should the density estimate be trimmed to the range of the
data? Passed on to the density estimator; see the |
expand |
For sample data, should the slab be expanded to the limits of the scale? Default |
breaks |
Determines the breakpoints defining bins. Defaults to
For example, |
align |
Determines how to align the breakpoints defining bins. Default
(
For example, |
outline_bars |
For sample data (if |
slab_type |
(deprecated) The type of slab function to calculate: probability density (or mass) function
( |
limits |
Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
Number of points at which to evaluate the function that defines the slab. |
orientation |
Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
If |
show.legend |
Should this layer be included in the legends? Default is |
inherit.aes |
If |
Details
This stat computes slab values (i.e. PDF and CDF values) at specified locations
on a distribution, as determined by the at parameter.
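As a rough illustration of what "slab values at specified locations" means, the following base R sketch evaluates the PDF and CDF of a Normal(1, 1) distribution at its median by hand. It does not call ggdist; the distribution and the variable names (at_point, pdf_at, cdf_at) are assumptions chosen for illustration, mirroring the default at = "median".

```r
# Hand computation of the values a spike at the median would carry,
# for an assumed Normal(mean = 1, sd = 1) distribution.
mu <- 1
sigma <- 1

# location of the spike: the median is the 0.5 quantile
at_point <- qnorm(0.5, mean = mu, sd = sigma)

# slab values at that location
pdf_at <- dnorm(at_point, mean = mu, sd = sigma)  # height of the density
cdf_at <- pnorm(at_point, mean = mu, sd = sigma)  # cumulative probability

data.frame(at = "median", x = at_point, pdf = pdf_at, cdf = cdf_at)
```

The resulting row is only analogous to the x, pdf, cdf, and at computed variables described under Computed Variables; the stat itself handles grouping, orientation, and sample data as well.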
To visualize sample data, such as a data distribution, samples from a
bootstrap distribution, or a Bayesian posterior, you can supply samples to
the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist
aesthetic. For historical reasons, you can also use dist to specify the
distribution, though this is not recommended as it does not work as well
with orientation detection.
These aesthetics can be used as follows:
- xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
- dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.

  See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior() function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
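The naming convention for character distribution names can be sketched in base R as follows. The helper dist_funs() is hypothetical (it is not part of ggdist) and only shows how a name such as "norm", together with a list of arguments, maps onto the "d", "p", and "q" functions that R defines.

```r
# Hypothetical helper (not a ggdist function): look up the density, CDF,
# and quantile functions that R defines for a distribution name.
dist_funs <- function(name) {
  list(
    d = match.fun(paste0("d", name)),
    p = match.fun(paste0("p", name)),
    q = match.fun(paste0("q", name))
  )
}

norm_funs <- dist_funs("norm")

# distribution arguments, analogous to an `args` list column
# (or to arg1 = 0, arg2 = 1)
dist_args <- list(0, 1)

# median of Normal(0, 1), then the density at that point
med <- do.call(norm_funs$q, c(list(0.5), dist_args))
dens <- do.call(norm_funs$d, c(list(med), dist_args))
```

A name for which R has no matching "d", "p", and "q" functions would make match.fun() fail in this sketch, which is the practical meaning of "valid distribution name" above.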
Value
A ggplot2::Stat representing a spike geometry which can be added to a ggplot() object.
Aesthetics
The spike geom has a wide variety of aesthetics that control
the appearance of its two sub-geometries: the spike and the point.
These stats support the following aesthetics:
- x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
- y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
- weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
- xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
- dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
- args: Distribution arguments (args or arg1, ... arg9). See Details.
In addition, in their default configuration (paired with geom_spike())
the following aesthetics are supported by the underlying geom:
Spike-specific (aka Slab-specific) aesthetics
- thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
- side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
- scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
Color aesthetics
- colour: (or color) The color of the spike and point sub-geometries.
- fill: The fill color of the point sub-geometry.
- alpha: The opacity of the spike and point sub-geometries.
- colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
- fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
- linewidth: Width of the line used to draw the spike sub-geometry.
- size: Size of the point sub-geometry.
- stroke: Width of the outline around the point sub-geometry.
- linetype: Type of line (e.g., "solid", "dashed", etc) used to draw the spike.
Other aesthetics (these work as in standard geoms)
- width
- height
- group
See examples of some of these aesthetics in action in vignette("slabinterval").
Learn more about the sub-geom override aesthetics (like interval_color) in the
scales documentation. Learn more about basic ggplot aesthetics in
vignette("ggplot2-specs").
Computed Variables
The following variables are computed by this stat and made available for
use in aesthetic specifications (aes()) using the after_stat() function or
the after_stat argument of stage():
- x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
- xmin or ymin: For intervals, the lower end of the interval from the interval function.
- xmax or ymax: For intervals, the upper end of the interval from the interval function.
- .width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
- level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
- pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
- cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
- n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
- f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
- at: For spikes, a character vector of names of the functions or expressions used to determine the points at which the slab functions were evaluated to create spikes. Values of this computed variable are determined by the at parameter; see its description above.
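To make the relationship between pdf, cdf, and the quantity formerly selected through slab_type concrete, here is a base R sketch that computes all three slab functions at a few points of an assumed standard Normal distribution. The evaluation points and the column name ccdf are illustrative only; in a plot, the corresponding step would be mapping pdf, cdf, or 1 - cdf inside after_stat().

```r
# Assumed evaluation points on a standard Normal distribution
x <- c(-1, 0, 1)

slab <- data.frame(
  x = x,
  pdf = dnorm(x),  # probability density at each point
  cdf = pnorm(x)   # cumulative probability at each point
)

# the complementary CDF is derived from cdf rather than computed separately
slab$ccdf <- 1 - slab$cdf

slab
```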
See Also
See geom_spike() for the geom underlying this stat.
See stat_slabinterval() for the stat this shortcut is based on.
Other slabinterval stats:
stat_ccdfinterval(), stat_cdfinterval(), stat_eye(), stat_gradientinterval(),
stat_halfeye(), stat_histinterval(), stat_interval(), stat_pointinterval(),
stat_slab()
Examples
library(ggplot2)
library(distributional)
library(dplyr)
library(ggdist)

df = tibble(
  d = c(dist_normal(1), dist_gamma(2,2)), g = c("a", "b")
)

# annotate the density at the mode of a distribution
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_slab(aes(xdist = d)) +
  stat_spike(at = "Mode") +
  # need shared thickness scale so that stat_slab and geom_spike line up
  scale_thickness_shared()

# annotate the endpoints of intervals of a distribution
# here we'll use an arrow instead of a point by setting size = 0
arrow_spec = arrow(angle = 45, type = "closed", length = unit(4, "pt"))
df %>%
  ggplot(aes(y = g, xdist = d)) +
  stat_halfeye(point_interval = mode_hdci) +
  stat_spike(
    at = function(x) hdci(x, .width = .66),
    size = 0, arrow = arrow_spec, color = "blue", linewidth = 0.75
  ) +
  scale_thickness_shared()

# annotate quantiles of a sample
set.seed(1234)
data.frame(x = rnorm(1000, 1:2), g = c("a","b")) %>%
  ggplot(aes(x, g)) +
  stat_slab() +
  stat_spike(at = function(x) quantile(x, ppoints(10))) +
  scale_thickness_shared()
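
For reference, the evaluation points used in the last example can be inspected without plotting. This base R sketch regenerates the same sample and shows the ten probabilities produced by ppoints(10), along with the sample quantiles for group "a" computed directly. The names sample_df, probs, and q_a, and the subsetting by group, are additions for illustration; the exact values the stat uses may differ slightly depending on how it summarizes the sample.

```r
# Regenerate the sample from the example above
set.seed(1234)
sample_df <- data.frame(x = rnorm(1000, 1:2), g = c("a", "b"))

# ten evenly spread probabilities strictly between 0 and 1
probs <- ppoints(10)

# sample quantiles of group "a" at those probabilities
q_a <- quantile(sample_df$x[sample_df$g == "a"], probs)

probs
q_a
```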