pp_conf_plot {qqconf}    R Documentation

PP Plot with Simultaneous and Pointwise Testing Bounds.

Description

Create a PP plot with a shaded simultaneous acceptance region and, optionally, lines for a pointwise region. The observed probabilities are plotted against their expected values had the data come from the specified distribution.

Usage

pp_conf_plot(
  obs,
  distribution = pnorm,
  method = c("ell", "ks"),
  alpha = 0.05,
  difference = FALSE,
  log10 = FALSE,
  right_tail = FALSE,
  add = FALSE,
  dparams = list(),
  bounds_params = list(),
  line_params = list(),
  plot_pointwise = FALSE,
  pointwise_lines_params = list(),
  points_params = list(),
  polygon_params = list(border = NA, col = "gray"),
  prob_pts_method = c("uniform", "median", "normal"),
  ...
)

Arguments

obs

The observed data.

distribution

The probability function for the specified distribution. Defaults to pnorm. Custom distributions are allowed as long as all parameters are supplied in dparams.

method

Method for simultaneous testing bands. Must be either "ell" (equal local levels test), which applies a level \eta pointwise test to each order statistic such that the Type I error of the global test is alpha, or "ks" to apply a Kolmogorov-Smirnov test. "ell" is recommended.

alpha

Type I error of global test of whether the data come from the reference distribution.

difference

Whether to plot the difference between the observed and expected values on the vertical axis.

log10

Whether to plot the axes on the -log10 scale (e.g. to see small p-values).

right_tail

This argument is only used if log10 is TRUE. When TRUE, the x-axis is -log10(1 - Expected Probability) and the y-axis is -log10(1 - Observed Probability). When FALSE (default), the x-axis is -log10(Expected Probability) and the y-axis is -log10(Observed Probability). Set this argument to TRUE to make observations in the right tail of the distribution easier to see, and to FALSE to make observations in the left tail easier to see.
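The two axis transformations described for right_tail can be sketched in base R. This is only an illustration of the formulas above, not the package's internal code: it assumes a standard normal reference with fixed parameters, and the variable names are made up for the example.

```r
set.seed(1)
obs <- rnorm(50)
n <- length(obs)

# Observed and expected probabilities for a PP plot against N(0, 1).
# (Assumption: parameters are fixed here rather than estimated.)
observed_p <- pnorm(sort(obs))
expected_p <- ppoints(n, a = 0)  # i / (n + 1)

# right_tail = FALSE: small probabilities (left tail) get large axis values.
left_x <- -log10(expected_p)
left_y <- -log10(observed_p)

# right_tail = TRUE: probabilities near 1 (right tail) get large axis values.
right_x <- -log10(1 - expected_p)
right_y <- -log10(1 - observed_p)
```

With right_tail = FALSE the smallest order statistic sits at the far end of the x-axis; with right_tail = TRUE the largest one does.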

add

Whether to add points to an existing plot.

dparams

List of additional arguments for the probability function of the distribution (e.g. df = 1). Note that if any parameters of the distribution are specified, parameter estimation will not be performed on the unspecified parameters; they will instead take the default values set by the distribution function. For the uniform distribution, parameter estimation is not performed, and the default parameters are max = 1 and min = 0. For other distributions, parameters will be estimated if not provided. For the normal distribution, the mean is estimated by the median and the standard deviation by Sn, from Rousseeuw and Croux (1993), "Alternatives to the Median Absolute Deviation". For all distributions other than uniform and normal, the code uses MLE to estimate the parameters. Estimation is not implemented for custom distributions, so all of their parameters must be provided by the user.
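As a hedged illustration of dparams, the call below supplies every parameter of the reference distribution, so no estimation is involved. It uses only the arguments documented on this page; the sample and the rate value are arbitrary choices for the example, and the call is skipped if qqconf is not installed.

```r
set.seed(2)
obs_exp <- rexp(100, rate = 2)

if (requireNamespace("qqconf", quietly = TRUE)) {
  # Exponential reference distribution with its only parameter given
  # explicitly through dparams, so nothing is left to be estimated.
  qqconf::pp_conf_plot(
    obs = obs_exp,
    distribution = pexp,
    dparams = list(rate = 2)
  )
}
```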

bounds_params

List of optional arguments for get_bounds_two_sided (i.e. tol, max_it, method).

line_params

Arguments passed to the line function to modify the line that indicates a perfect fit of the reference distribution.

plot_pointwise

Boolean indicating whether pointwise bounds should be added to the plot.

pointwise_lines_params

Arguments passed to the lines function to modify the pointwise bounds when plot_pointwise is set to TRUE.

points_params

Arguments to be passed to the points function to plot the data.

polygon_params

Arguments to be passed to the polygon function to construct the simultaneous confidence region. By default, border is set to NA and col is set to "gray".

prob_pts_method

(Optional) Method used to get probability points for plotting. The default value, "uniform", results in ppoints(n, a = 0), which are the expected values of the order statistics of Uniform(0, 1). When this argument is set to "median", qbeta(.5, c(1:n), c(n:1)), the medians of the order statistics of Uniform(0, 1), will be used. For a PP plot, there is no particular theoretical justification for setting this argument to "normal", which results in ppoints(n), but it is an option because it is used in some other packages. When alpha is large, "median" is recommended.
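The three sets of probability points named for prob_pts_method can be computed directly in base R for comparison. The expressions are the ones quoted above; the variable names and the choice n = 10 are only for illustration.

```r
n <- 10

# "uniform": expected values of Uniform(0, 1) order statistics, i / (n + 1)
uniform_pts <- ppoints(n, a = 0)

# "median": medians of Uniform(0, 1) order statistics, i.e. Beta(i, n - i + 1)
median_pts <- qbeta(0.5, 1:n, n:1)

# "normal": default ppoints(), which uses a = 3/8 when n <= 10 and a = 1/2 otherwise
normal_pts <- ppoints(n)

# Side-by-side comparison of the three choices
round(cbind(uniform = uniform_pts, median = median_pts, normal = normal_pts), 3)
```

The three choices differ most for the smallest and largest order statistics, which is where the tails of the plot are drawn.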

...

Additional arguments passed to the plot function.

Details

If any of the points of the pp-plot fall outside the simultaneous acceptance region for the selected level alpha test, that means that we can reject the null hypothesis that the data are i.i.d. draws from the specified distribution. If difference is set to TRUE, the vertical axis plots the observed probability minus expected probability. If pointwise bounds are used, then on average, alpha * n of the points will fall outside the bounds under the null hypothesis, so the chance that the pp-plot has any points falling outside of the pointwise bounds is typically much higher than alpha under the null hypothesis. For this reason, a simultaneous region is preferred.
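The claim about pointwise bounds can be checked with a small base R simulation. This sketch assumes equal-tailed pointwise bounds taken from the Beta(i, n - i + 1) distribution of the i-th order statistic of Uniform(0, 1); it does not call qqconf, and the package's own pointwise bounds may be computed differently.

```r
set.seed(3)
n <- 100
alpha <- 0.05
n_sim <- 2000
i <- seq_len(n)

# Pointwise level-alpha bounds for each Uniform(0, 1) order statistic
# (assumption: equal-tailed quantiles of Beta(i, n - i + 1)).
lower <- qbeta(alpha / 2, i, n - i + 1)
upper <- qbeta(1 - alpha / 2, i, n - i + 1)

# For each simulated null sample, count the points outside the pointwise bounds.
n_outside <- replicate(n_sim, {
  u <- sort(runif(n))
  sum(u < lower | u > upper)
})

mean(n_outside)      # average count outside; should be near alpha * n
mean(n_outside > 0)  # share of null samples with at least one point outside
```

The second quantity is the Type I error that would result from treating the pointwise bounds as a global test, which is why it should be compared against alpha.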

Value

None, PP plot is produced.

References

Weine, E., McPeek, MS., & Abney, M. (2023). Application of Equal Local Levels to Improve Q-Q Plot Testing Bands with R Package qqconf. Journal of Statistical Software, 106(10). https://doi.org/10.18637/jss.v106.i10

Examples

set.seed(0)
smp <- rnorm(100)

# Plot PP plot against normal distribution with mean and variance estimated
pp_conf_plot(
  obs=smp
)

# Make same plot on -log10 scale to highlight the left tail,
# with the plotting symbols scaled to half their default size (cex = .5)
pp_conf_plot(
  obs=smp,
  log10 = TRUE,
  points_params = list(cex = .5)
)

# Make same plot with difference between observed and expected values on the y-axis
pp_conf_plot(
  obs=smp,
  difference = TRUE
)

# Make same plot with samples plotted as a blue line, expected value line plotted as a red line,
# and pointwise bounds plotted as black lines
pp_conf_plot(
  obs=smp,
  plot_pointwise = TRUE,
  points_params = list(col="blue", type="l"),
  line_params = list(col="red")
)


[Package qqconf version 1.3.2 Index]