pp_plot {esvis} | R Documentation |
Produces the paired probability plot for two groups
Description
The paired probability plot maps the probability of obtaining a specific
score for each of two groups. The area under the curve (auc)
corresponds to the probability that a randomly
selected observation from the x-axis group will have a higher score than
a randomly selected observation from the y-axis group. This function
extends the basic pp-plot by allowing multiple curves and faceting to
facilitate a variety of comparisons. Note that because the plotting is
built on top of ggplot2, additional customization can
be made on top of the plots, as illustrated in the examples.
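The relationship between the curve and the auc can be checked with a minimal base R sketch. This is not how pp_plot computes anything internally; the simulated scores and the names x_group and y_group are assumptions made for illustration only. The curve is traced by evaluating both empirical CDFs at every observed score, and its trapezoidal area is compared with a direct estimate of the probability that an x-axis observation exceeds a y-axis observation.

```r
# Simulated scores for two hypothetical groups (not esvis data)
set.seed(42)
x_group <- rnorm(500, mean = 205, sd = 10)  # group plotted on the x-axis
y_group <- rnorm(500, mean = 200, sd = 10)  # group plotted on the y-axis

# Evaluate both empirical CDFs at every observed score
grid <- sort(unique(c(x_group, y_group)))
px <- c(0, ecdf(x_group)(grid))  # x coordinates of the PP curve
py <- c(0, ecdf(y_group)(grid))  # y coordinates of the PP curve

# Area under the PP curve by the trapezoidal rule
auc <- sum(diff(px) * (head(py, -1) + tail(py, -1)) / 2)

# Direct estimate: proportion of (x, y) pairs in which x scores higher
p_direct <- mean(outer(x_group, y_group, ">"))

c(auc = auc, p_direct = p_direct)
```

With continuous scores and no ties between groups, the two quantities agree. When the two distributions are identical, the curve follows the diagonal and the auc is 0.5.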
Usage
pp_plot(
  data,
  formula,
  ref_group = NULL,
  cuts = NULL,
  cut_labels = TRUE,
  cut_label_x = 0.02,
  cut_label_size = 3,
  lines = TRUE,
  linetype = "solid",
  linewidth = 1.1,
  shade = TRUE,
  shade_alpha = 0.2,
  refline = TRUE,
  refline_col = "gray40",
  refline_type = "dashed",
  refline_width = 1.1
)
Arguments
data |
The data frame to be plotted |
formula |
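A formula of the type outcome ~ group, as in the examples (e.g., math ~ ell). Additional grouping variables can be added with +, as in math ~ ell + season + frl. |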
ref_group |
Optional character vector (of length 1) naming the reference group. Defaults to the group with the highest mean score. |
cuts |
Integer. Optional vector (or single number) of scores used to annotate the plot. If supplied, line segments will extend from the corresponding x and y axes and meet at the PP curve. |
cut_labels |
Logical. Should the reference lines corresponding to cuts be labeled? Defaults to TRUE. |
cut_label_x |
The x-axis location of the cut labels. Defaults to 0.02. |
cut_label_size |
The size of the cut labels. Defaults to 3. |
lines |
Logical. Should the PP lines be plotted? Defaults to TRUE. |
linetype |
The linetype for the PP lines. Defaults to "solid". |
linewidth |
The width of the PP lines. Defaults to 1.1 (just marginally larger than the default ggplot2 lines). |
shade |
Logical. Should the area under the curve be shaded? Defaults to TRUE. |
shade_alpha |
Transparency of the shading. Defaults to 0.2. |
refline |
Logical. Should a diagonal reference line be plotted,
representing the value at which no difference is observed between the
reference and focal distributions? Defaults to TRUE. |
refline_col |
Color of the reference line. Defaults to "gray40", a dark gray. |
refline_type |
The linetype for the reference line. Defaults to "dashed". |
refline_width |
The width of the reference line. Defaults to 1.1, the same as the default width of the PP lines. |
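To illustrate what the cuts argument annotates, the following base R sketch computes the point at which the segments for each cut score would meet the PP curve: the cumulative proportion of each group scoring at or below the cut. It is a sketch under assumptions, not the package's implementation; the simulated data and the names x_group, y_group, and cut_points are hypothetical.

```r
# Simulated scores for two hypothetical groups (not esvis data)
set.seed(1)
x_group <- rnorm(300, mean = 205, sd = 12)  # group plotted on the x-axis
y_group <- rnorm(300, mean = 198, sd = 12)  # group plotted on the y-axis

# Cut scores, matching those used in the Examples section
cuts <- c(190, 210, 215)

# Each cut maps to one point on the PP curve:
# (proportion of x_group <= cut, proportion of y_group <= cut)
cut_points <- data.frame(
  cut = cuts,
  x   = ecdf(x_group)(cuts),
  y   = ecdf(y_group)(cuts)
)
cut_points
```

A segment drawn from the x-axis at x and another from the y-axis at y intersect at the curve, which is where a cut label would be read.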
Value
A ggplot2 object displaying the specified PP plot.
Examples
# PP plot examining differences by condition
pp_plot(star, math ~ condition)
# The sample size within cells gets very small in the above (e.g., note the
# wild changes within the "other" group in particular). Overall, the effect
# doesn't seem to change much by condition.
# Look at something a little more interesting
## Not run:
pp_plot(benchmarks, math ~ ell + season + frl)
## End(Not run)
# Add some cut scores
pp_plot(benchmarks, math ~ ell, cuts = c(190, 210, 215))
## Make another interesting plot. Use ggplot to customize
## Not run:
library(tidyr)
library(ggplot2)
benchmarks %>%
  gather(subject, score, reading, math) %>%
  pp_plot(score ~ ell + subject + season,
          ref_group = "Non-ELL") +
  scale_fill_brewer(name = "ELL Status", palette = "Pastel2") +
  scale_color_brewer(name = "ELL Status", palette = "Pastel2") +
  labs(title = "Differences among English Language Learning Groups",
       subtitle = "Note crossing of reference line") +
  theme_minimal()
## End(Not run)