qqPlotGestalt {EnvStats}R Documentation

Develop Gestalt of Q-Q Plots for Specific Distributions

Description

Produce a series of quantile-quantile (Q-Q) plots (also called probability plots) or Tukey mean-difference Q-Q plots for a user-specified distribution.

Usage

  qqPlotGestalt(distribution = "norm", param.list = list(mean = 0, sd = 1), 
    estimate.params = FALSE, est.arg.list = NULL, sample.size = 10, num.pages = 2, 
    num.plots.per.page = 4, nrow = ceiling(num.plots.per.page/2), plot.type = "Q-Q", 
    plot.pos.con = switch(dist.abb, norm = , lnorm = , lnormAlt = , lnorm3 = 0.375, 
      evd = 0.44, 0.4), equal.axes = (qq.line.type == "0-1" || estimate.params), 
    margin.title = NULL, add.line = FALSE, qq.line.type = "least squares", 
    duplicate.points.method = "standard", points.col = 1, line.col = 1, 
    line.lwd = par("cex"), line.lty = 1, digits = .Options$digits, 
    same.window = TRUE, ask = same.window & num.pages > 1, 
    mfrow = c(nrow, num.plots.per.page/nrow), 
    mar = c(4, 4, 1, 1) + 0.1, oma = c(0, 0, 7, 0), mgp = c(2, 0.5, 0), ..., 
    main = NULL, xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL)

Arguments

distribution

a character string denoting the distribution abbreviation. The default value is distribution="norm". See the help file for Distribution.df for a list of possible distribution abbreviations. This argument is ignored if y is supplied.

param.list

a list with values for the parameters of the distribution. The default value is param.list=list(mean=0, sd=1). See the help file for Distribution.df for the names and possible values of the parameters associated with each distribution. This argument is ignored if estimate.params=TRUE.

estimate.params

a logical scalar indicating whether to compute quantiles based on estimating the distribution parameters (estimate.params=TRUE) or using the known distribution parameters specified in param.list (estimate.params=FALSE, the default). The default value of estimate.params is FALSE because the default configuration is to generate random numbers from a standard normal (mean=0, sd=1) distribution and produce a standard normal Q-Q plot.

est.arg.list

a list whose components are optional arguments associated with the function used to estimate the parameters of the assumed distribution (see the help file Estimating Distribution Parameters). For example, all functions used to estimate distribution parameters have an optional argument called method that specifies the method to use to estimate the parameters. (See the help file for Distribution.df for a list of available estimation methods for each distribution.) To override the default estimation method, supply the argument
est.arg.list with a component called method; for example
est.arg.list=list(method="mle"). The default value is est.arg.list=NULL so that all default values for the estimating function are used. This argument is ignored if estimate.params=FALSE.

sample.size

numeric scalar indicating the number of observations to generate for each Q-Q plot. The default value is sample.size=10.

num.pages

numeric scalar indicating the number of pages of plots to generate. The default value is num.pages=2.

num.plots.per.page

numeric scalar indicating the number of plots per page. The default value is num.pages=4.

nrow

numeric scalar indicating the number of rows of plots on each page. The default value is the smallest integer greater than or equal to num.plots.per.page/2.

plot.type

a character string denoting the kind of plot. Possible values are "Q-Q" (Quantile-Quantile plot, the default) and "Tukey Mean-Difference Q-Q" (Tukey mean-difference Q-Q plot). This argument may be abbreviated (e.g., plot.type="T" to indicate a Tukey mean-difference Q-Q plot).

plot.pos.con

numeric scalar between 0 and 1 containing the value of the plotting position constant. The default value of plot.pos.con depends on the value of the argument distribution. For the normal, lognormal, three-parameter lognormal, zero-modified normal, and zero-modified lognormal distributions, the default value is plot.pos.con=0.375. For the Type I extreme value (Gumbel) distribution (distribution="evd"), the default value is plot.pos.con=0.44. For all other distributions, the default value is plot.pos.con=0.4. See the help file for qqPlot for the motivation behind these values for plotting positions.

equal.axes

logical scalar indicating whether to use the same range on the x- and y-axes when plot.type="Q-Q". The default value is TRUE if qq.line.type="0-1" or estimate.params=TRUE, otherwise it is FALSE. This argument is ignored if plot.type="Tukey Mean-Difference Q-Q".

margin.title

character string indicating the title printed in the top margin on each page of plots. The default value indicates the kind of Q-Q plot, the probability distribution, the sample size, and the estimation method used (if any).

add.line

logical scalar indicating whether to add a line to the plot. If add.line=TRUE and plot.type="Q-Q", a line determined by the value of qq.line.type is added to the plot. If add.line=TRUE and
plot.type="Tukey Mean-Difference Q-Q", a horizontal line at y=0 is added to the plot. The default value is add.line=FALSE.

qq.line.type

character string determining what kind of line to add to the Q-Q plot. Possible values are "least squares" (the default), "0-1" and "robust". For the value "least squares", a least squares line is fit and added. For the value "0-1", a line with intercept 0 and slope 1 is added. For the value "robust", a line is fit through the first and third quartiles of the x and y data. This argument is ignored if add.line=FALSE or plot.type="Tukey Mean-Difference Q-Q".

duplicate.points.method

character string denoting how to plot points with duplicate (x,y) values. Possible values are "standard" (the default), "jitter", and "number". For the value "standard", a single plotting symbol is plotted (this is the default behavior of R). For the value "jitter", a separate plotting symbol is plotted for each duplicate point, where the plotting symbols cluster around the true value of x and y. For the value "number", a single number is plotted at (x,y) that represents how many duplicate points are at that (x,y) coordinate.

points.col

numeric scalar or character string determining the color of the points in the plot. The default value is points.col=1. See the entry for col in the help file for par for more information.

line.col

numeric scalar or character string determining the color of the line in the plot. The default value is points.col=1. See the entry for col in the help file for par for more information. This argument is ignored if add.line=FALSE.

line.lwd

numeric scalar determining the width of the line in the plot. The default value is line.lwd=par("cex"). See the entry for lwd in the help file for par for more information. This argument is ignored if add.line=FALSE.

line.lty

a numeric scalar determining the line type of the line in the plot. The default value is line.lty=1. See the entry for lty in the help file for par for more information. This argument is ignored if add.line=FALSE.

digits

a scalar indicating how many significant digits to print for the distribution parameters. The default value is digits=.Options$digits.

same.window

logical scalar indicating whether to produce all plots in the same graphics window (same.window=TRUE; the default), or to create a new graphics window for each separate plot (same.window=FALSE).

ask

logical scalar supplied to the function devAskNewPage, indicating whether to prompt the user before creating a new plot within a single graphics window. The default value is TRUE if same.window=TRUE and num.pages > 1, otherwise it is FALSE.

mfrow, mar, oma, mgp, main, xlab, ylab, xlim, ylim, ...

additional graphical parameters (see par).

Details

The function qqPlotGestalt allows the user to display several Q-Q plots or Tukey mean-difference Q-Q plots for a specified probability distribution. The distribution is specified with the arguments distribution and param.list. By default, normal (Gaussian) Q-Q plots are produced.

If estimate.params=FALSE (the default), the theoretical quantiles on the x-axis are computed using the known distribution parameters specified in param.list. If estimate.params=TRUE, the distribution parameters are estimated based on the sample, and these estimated parameters are then used to compute the theoretical quantiles. For distributions that can be specified by a location and scale parameter (e.g., Normal, Logistic, extreme value, etc.), the value of estimate.params will not affect the general shape of the plot, only the values recorded on the x-axis. For distributions that cannot be specified by a location and scale parameter (e.g., exponential, gamma, etc.), it is recommended that estimate.params be set to TRUE since in pracitice the values of the distribution parameters are not known but must be estimated from the sample.

The purpose of qqPlotGestalt is to allow the user to build-up a visual memory of “typical” Q-Q plots. A Q-Q plot is a graphical tool that allows you to assess how well a particular set of observations fit a particular probability distribution. The value of this tool depends on the user having an internal reference set of Q-Q plots with which to compare the current Q-Q plot.

See the help file for qqPlot for more information.

Value

The NULL value is returned.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

See the REFERENCES section for qqPlot.

See Also

qqPlot.

Examples

  # Look at eight typical normal (Gaussian) Q-Q plots for random samples 
  # of size 10 from a N(0,1) distribution 
  # Are you surprised by the variability in the plots?
  #
  # (Note:  you must use set.seed if you want to reproduce the exact 
  #         same plots more than once.)

  set.seed(298)
  qqPlotGestalt(same.window = FALSE)

  # Add lines to these same Q-Q plots
  #----------------------------------
  set.seed(298)
  qqPlotGestalt(same.window = FALSE, add.line = TRUE)

  # Add lines to different Q-Q plots
  #---------------------------------
  qqPlotGestalt(same.window = FALSE, add.line = TRUE)

  ## Not run: 
  # Look at 4 sets of plots all in the same graphics window
  #--------------------------------------------------------
  qqPlotGestalt(add.line = TRUE, num.pages = 4)
  
## End(Not run)

  #==========

  # Look at Q-Q plots for a gamma distribution
  #-------------------------------------------

  qqPlotGestalt(dist = "gammaAlt", 
    param.list = list(mean = 10, cv = 1), 
    estimate.params = TRUE, num.pages = 3, 
    same.window = FALSE, add.line = TRUE)


  # Look at Tukey Mean Difference Q-Q plots 
  # for a gamma distribution
  #----------------------------------------

  qqPlotGestalt(dist = "gammaAlt", 
    param.list = list(mean = 10, cv = 1), 
    estimate.params = TRUE, num.pages = 3, 
    plot.type = "Tukey", same.window = FALSE, add.line = TRUE)

  #==========

  # Clean up
  #---------
  graphics.off()

[Package EnvStats version 2.8.1 Index]