qq_conf_plot {qqconf}    R Documentation

QQ Plot with Simultaneous and Pointwise Testing Bounds.

Description

Create a qq-plot with a shaded simultaneous acceptance region and, optionally, lines for a pointwise region. The observed values are plotted against their expected values had they come from the specified distribution.

Usage

qq_conf_plot(
  obs,
  distribution = qnorm,
  method = c("ell", "ks"),
  alpha = 0.05,
  difference = FALSE,
  log10 = FALSE,
  right_tail = FALSE,
  add = FALSE,
  dparams = list(),
  bounds_params = list(),
  line_params = list(),
  plot_pointwise = FALSE,
  pointwise_lines_params = list(),
  points_params = list(),
  polygon_params = list(border = NA, col = "gray"),
  prob_pts_method = c("best_available", "normal", "uniform", "median"),
  ...
)

Arguments

obs

The observed data.

distribution

The quantile function for the specified distribution. Defaults to qnorm. Custom distributions are allowed as long as all parameters are supplied in dparams.

method

Method for simultaneous testing bands. Must be either "ell" (equal local levels test), which applies a pointwise test at level eta to each order statistic, with eta chosen so that the Type I error of the global test is alpha, or "ks", which applies a Kolmogorov-Smirnov test. "ell" is recommended.

alpha

Type I error of global test of whether the data come from the reference distribution.

difference

Whether to plot the difference between the observed and expected values on the vertical axis.

log10

Whether to plot axes on -log10 scale (e.g. to see small p-values). Can only be used for strictly positive distributions.

right_tail

This argument is only used if log10 is TRUE. When TRUE, the x-axis is -log10(1 - Expected Quantile) and the y-axis is -log10(1 - Observed Quantile). When FALSE (default) the x-axis is -log10(Expected Quantile) and the y-axis is -log10(Observed Quantile). The argument should be set to TRUE only when the support of the distribution lies in (0, 1), and one wants to make observations in the right tail of the distribution easier to see. The argument should be set to FALSE when one wants to make observations in the left tail of the distribution easier to see.

add

Whether to add points to an existing plot.

dparams

List of additional arguments for the quantile function of the distribution (e.g. df=1). If no parameters are supplied, they are estimated from the data, except for the uniform distribution, where no estimation is performed and the defaults min = 0 and max = 1 are used. For the normal distribution, the mean is estimated by the median and the standard deviation by Sn, as described in Rousseeuw and Croux (1993), "Alternatives to the Median Absolute Deviation". For all other built-in distributions, the parameters are estimated by maximum likelihood (MLE). Note that if any parameters of the distribution are specified, no estimation is performed for the unspecified parameters; they instead take the default values set by the distribution function. Estimation is not implemented for custom distributions, so all of their parameters must be provided by the user.

bounds_params

List of optional arguments for get_bounds_two_sided (i.e. tol, max_it, method).

line_params

Arguments passed to the lines function to modify the line that indicates a perfect fit of the reference distribution.

plot_pointwise

Boolean indicating whether pointwise bounds should be added to the plot.

pointwise_lines_params

Arguments passed to the lines function to modify the pointwise bounds when plot_pointwise is set to TRUE.

points_params

Arguments to be passed to the points function to plot the data.

polygon_params

Arguments to be passed to the polygon function to construct the simultaneous confidence region. By default border is set to NA and col is set to "gray".

prob_pts_method

(optional) method used to get probability points for plotting. The quantile function will be applied to these points to get the expected values. When this argument is set to "normal" (recommended for a normal QQ plot) ppoints(n) will be used, which is what most other plotting software uses. When this argument is set to "uniform" (recommended for a uniform QQ plot) ppoints(n, a=0), which are the expected values of the order statistics of Uniform(0, 1), will be used. Finally, when this argument is set to "median" (recommended for all other distributions) qbeta(.5, c(1:n), c(n:1)) will be used. Under the default setting, "best_available", the probability points as recommended above will be used. Note that "median" is suitable for all distributions and is particularly recommended when alpha is large.

...

Additional arguments passed to the plot function.
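
The three choices of probability points described under prob_pts_method can be compared directly in base R. The following is a minimal sketch rather than the package's internal code; the variable names (p_normal, p_uniform, p_median, expected_normal) and the sample size n = 20 are chosen here purely for illustration.

```r
# Illustrative only: compare the probability points described for
# prob_pts_method, using base R functions.
n <- 20

p_normal  <- ppoints(n)            # (i - 1/2) / n when n > 10
p_uniform <- ppoints(n, a = 0)     # i / (n + 1): mean of the i-th uniform order statistic
p_median  <- qbeta(0.5, 1:n, n:1)  # median of the i-th uniform order statistic

# Expected values for a QQ plot come from applying the quantile function
expected_normal <- qnorm(p_normal)

# Side-by-side view of the first few probability points
head(cbind(p_normal, p_uniform, p_median))
```

The three sets differ most in the tails, which is where the choice of probability points has the largest visible effect on a QQ plot.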

Details

If any of the points of the qq-plot fall outside the simultaneous acceptance region for the selected level alpha test, we can reject the null hypothesis that the data are i.i.d. draws from the specified distribution. If difference is set to TRUE, the vertical axis plots the observed quantile minus the expected quantile. If pointwise bounds are used, then on average alpha * n of the points will fall outside the bounds under the null hypothesis, so the chance that the qq-plot has at least one point outside the pointwise bounds is typically much higher than alpha. For this reason, a simultaneous region is preferred.
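
The claim about pointwise bounds can be checked with a small base R simulation. This is a sketch under stated assumptions, not the package's implementation: it takes the pointwise bounds to be equal-tailed Beta quantiles of the uniform order statistics, and the names n_outside, mean_outside and any_outside_rate, as well as the values of n and reps, are illustrative.

```r
# Illustrative simulation (base R only; not qqconf's implementation).
set.seed(1)
n     <- 100     # sample size
alpha <- 0.05    # pointwise level
reps  <- 2000    # number of simulated null data sets

i <- seq_len(n)
# The i-th order statistic of n Uniform(0, 1) draws is Beta(i, n - i + 1),
# so equal-tailed level-alpha pointwise bounds are Beta quantiles (assumed form).
lower <- qbeta(alpha / 2, i, n - i + 1)
upper <- qbeta(1 - alpha / 2, i, n - i + 1)

# For each simulated sample, count the order statistics outside the bounds
n_outside <- replicate(reps, {
  u <- sort(runif(n))
  sum(u < lower | u > upper)
})

mean_outside     <- mean(n_outside)      # should be close to alpha * n
any_outside_rate <- mean(n_outside > 0)  # share of samples with any point outside

c(mean_outside = mean_outside, any_outside_rate = any_outside_rate)
```

Comparing any_outside_rate with alpha shows how often a pointwise region would flag at least one point even though the null hypothesis holds.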

Value

None; a QQ plot is produced.

References

Weine, E., McPeek, M. S., & Abney, M. (2023). Application of Equal Local Levels to Improve Q-Q Plot Testing Bands with R Package qqconf. Journal of Statistical Software, 106(10). doi:10.18637/jss.v106.i10

Examples

set.seed(0)
smp <- runif(100)

# Plot QQ plot against uniform(0, 1) distribution
qq_conf_plot(
  obs=smp,
  distribution = qunif
)

# Make same plot on -log10 scale to highlight small p-values,
# with the plotting symbols scaled to half their default size (cex = .5)
qq_conf_plot(
  obs=smp,
  distribution = qunif,
  points_params = list(cex = .5),
  log10 = TRUE
)

# Make same plot with difference between observed and expected values on the y-axis
qq_conf_plot(
  obs=smp,
  distribution = qunif,
  difference = TRUE
)

# Make same plot with sample plotted as a blue line, expected value line plotted as a red line,
# and with pointwise bounds plotted as black lines
qq_conf_plot(
  obs=smp,
  distribution = qunif,
  plot_pointwise = TRUE,
  points_params = list(col="blue", type="l"),
  line_params = list(col="red")
)


[Package qqconf version 1.3.2 Index]