R: Produces the paired probability plot for two groups

pp_plot {esvis}

R Documentation

Produces the paired probability plot for two groups

Description

The paired probability (PP) plot maps the cumulative probability of scoring at or below each score in one group against the corresponding probability in the other group. The area under the curve (AUC) corresponds to the probability that a randomly selected observation from the x-axis group will have a higher score than a randomly selected observation from the y-axis group. This function extends the basic PP plot by allowing multiple curves and faceting to facilitate a variety of comparisons. Because the plotting is built on top of ggplot2, the returned plots can be customized further, as illustrated in the examples.
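
The AUC interpretation above can be checked numerically. The following is a minimal base R sketch, not the esvis implementation: the simulated groups `ref` and `focal` and all other object names are assumptions made for illustration. It builds a PP curve from two empirical CDFs, integrates it with the trapezoidal rule, and compares the result with a direct pairwise estimate of the probability that the x-axis group scores higher.

```r
# Hypothetical data: two simulated groups (not part of esvis)
set.seed(42)
ref   <- rnorm(500, mean = 200, sd = 10)  # x-axis (reference) group
focal <- rnorm(500, mean = 195, sd = 10)  # y-axis (focal) group

# Evaluate both empirical CDFs at every observed score,
# prepending 0 so the curve starts at the origin
scores  <- sort(unique(c(ref, focal)))
p_ref   <- c(0, ecdf(ref)(scores))
p_focal <- c(0, ecdf(focal)(scores))

# Trapezoidal area under the PP curve (x = p_ref, y = p_focal)
auc <- sum(diff(p_ref) * (head(p_focal, -1) + tail(p_focal, -1)) / 2)

# Direct pairwise estimate of P(ref > focal), counting ties as one half
p_higher <- mean(outer(ref, focal, ">") + 0.5 * outer(ref, focal, "=="))

# The two quantities agree up to floating-point error
isTRUE(all.equal(auc, p_higher))
```

When the two distributions are identical the curve follows the diagonal and the AUC is 0.5, which is why the diagonal serves as the "no difference" reference line.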

Usage

pp_plot(
  data,
  formula,
  ref_group = NULL,
  cuts = NULL,
  cut_labels = TRUE,
  cut_label_x = 0.02,
  cut_label_size = 3,
  lines = TRUE,
  linetype = "solid",
  linewidth = 1.1,
  shade = TRUE,
  shade_alpha = 0.2,
  refline = TRUE,
  refline_col = "gray40",
  refline_type = "dashed",
  refline_width = 1.1
)

Arguments

`data`	The data frame to be plotted.
`formula`	A formula of the type `out ~ group`, where `out` is the outcome variable and `group` is the grouping variable. The grouping variable can contain any number of groups. Additional variables can be included with `+` to produce separate plots by a secondary or tertiary variable of interest (e.g., `out ~ group + characteristic1 + characteristic2`). No more than two additional characteristics can be supplied at this time.
`ref_group`	Optional character vector (of length 1) naming the reference group. Defaults to the group with the highest mean score.
`cuts`	Integer. Optional vector (or single number) of scores used to annotate the plot. If supplied, line segments will extend from the corresponding x and y axes and meet at the PP curve.
`cut_labels`	Logical. Should the reference lines corresponding to `cuts` be labeled? Defaults to `TRUE`.
`cut_label_x`	The x-axis location of the cut labels. Defaults to 0.02.
`cut_label_size`	The size of the cut labels. Defaults to 3.
`lines`	Logical. Should the PP lines be plotted? Defaults to `TRUE`.
`linetype`	The linetype for the PP lines. Defaults to "solid".
`linewidth`	The width of the PP lines. Defaults to 1.1 (somewhat thicker than the default ggplot2 lines).
`shade`	Logical. Should the area under the curve be shaded? Defaults to `TRUE`.
`shade_alpha`	Transparency of the shading. Defaults to 0.2.
`refline`	Logical. Should a diagonal reference line be plotted, representing the value at which no difference is observed between the reference and focal distributions? Defaults to `TRUE`.
`refline_col`	Color of the reference line. Defaults to "gray40", a dark gray.
`refline_type`	The linetype for the reference line. Defaults to "dashed".
`refline_width`	The width of the reference line. Defaults to 1.1, as shown in Usage, which matches the default width of the PP lines.
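
To make the `formula` and `ref_group` descriptions concrete, here is a small base R sketch of how a formula of this shape breaks into outcome, grouping, and faceting variables, and how "the group with the highest mean score" can be found. It is an illustration under stated assumptions, not the parsing code esvis uses internally; the data frame `d` and every object name are hypothetical.

```r
# Hypothetical data frame; column names mirror the formula example above
d <- data.frame(
  out   = c(10, 12, 20, 22, 15, 17),
  group = c("a", "a", "b", "b", "c", "c")
)

f <- out ~ group + characteristic1 + characteristic2

# all.vars() returns variable names in order of appearance
vars     <- all.vars(f)
outcome  <- vars[1]        # outcome variable
grouping <- vars[2]        # grouping variable
facets   <- vars[-(1:2)]   # up to two additional characteristics

# Group with the highest mean outcome, the documented default for ref_group
group_means <- tapply(d[[outcome]], d[[grouping]], mean)
default_ref <- names(which.max(group_means))
```

Passing `ref_group` explicitly overrides this default, which is useful when a substantively meaningful baseline group does not have the highest mean.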

Value

A ggplot2 object displaying the specified PP plot.

Examples

# PP plot examining differences by condition
pp_plot(star, math ~ condition)

# The sample size within cells of the plot above gets very small (note, e.g.,
# the wild changes within the "other" group in particular). Overall, the
# effect doesn't seem to change much by condition.

# Look at something a little more interesting
## Not run: 
pp_plot(benchmarks, math ~ ell + season + frl)

## End(Not run)
# Add some cut scores
pp_plot(benchmarks, math ~ ell, cuts = c(190, 210, 215))

## Make another interesting plot. Use ggplot to customize
## Not run: 
library(tidyr)
library(ggplot2)
benchmarks %>% 
  # pivot_longer() supersedes gather() in current tidyr
  pivot_longer(c(reading, math), names_to = "subject", values_to = "score") %>% 
  pp_plot(score ~ ell + subject + season,
          ref_group = "Non-ELL") +
  scale_fill_brewer(name = "ELL Status", palette = "Pastel2") +
  scale_color_brewer(name = "ELL Status", palette = "Pastel2") +
  labs(title = "Differences among English Language Learning Groups",
       subtitle = "Note crossing of reference line") +
  theme_minimal()

## End(Not run)

[Package esvis version 0.3.1 Index]