cdfCompare {EnvStats} R Documentation

## Plot Two Cumulative Distribution Functions

### Description

For one sample, plots the empirical cumulative distribution function (ecdf) along with a theoretical cumulative distribution function (cdf). For two samples, plots the two ecdfs. These plots are used to graphically assess goodness of fit.

### Usage

  cdfCompare(x, y = NULL, discrete = FALSE,
    prob.method = ifelse(discrete, "emp.probs", "plot.pos"), plot.pos.con = NULL,
    distribution = "norm", param.list = NULL,
    estimate.params = is.null(param.list), est.arg.list = NULL,
    x.col = "blue", y.or.fitted.col = "black",
    x.lwd = 3 * par("cex"), y.or.fitted.lwd = 3 * par("cex"),

### Details

When both x and y are supplied, the function cdfCompare plots the empirical cdfs of x and y on the same plot by calling the function ecdfPlot.

When y is not supplied, the function cdfCompare plots the empirical cdf of x (by calling ecdfPlot) and the theoretical cdf (by calling cdfPlot and using the argument distribution) on the same plot.
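The one-sample case described above can be approximated in base R to see what such a plot contains. This is a minimal sketch under stated assumptions, not the internals of cdfCompare: the plotting-position constant a = 0.375 passed to ppoints() is an illustrative choice, and the normal parameters are estimated here simply with mean() and sd().

```r
# Base-R sketch of a one-sample "ecdf vs. fitted cdf" plot.
# This does not call EnvStats; it only mimics the idea.
set.seed(250)
x <- rnorm(20, mean = 10, sd = 2)

# Empirical side: order statistics and their plotting positions.
# a = 0.375 is an assumed constant, chosen for illustration only.
x.sorted <- sort(x)
p.emp <- ppoints(length(x), a = 0.375)

# Theoretical side: normal cdf with parameters estimated from the sample.
mu.hat <- mean(x)
sd.hat <- sd(x)
q.grid <- seq(min(x) - sd.hat, max(x) + sd.hat, length.out = 200)
p.fit <- pnorm(q.grid, mean = mu.hat, sd = sd.hat)

# Draw both curves on the same axes.
plot(x.sorted, p.emp, type = "s", col = "blue", lwd = 3,
     xlim = range(q.grid), ylim = c(0, 1),
     xlab = "x", ylab = "Cumulative Probability")
lines(q.grid, p.fit, col = "black", lwd = 3)
```

If the blue step curve tracks the black curve closely, the fitted distribution is a plausible description of the sample.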

### Value

When y is supplied, cdfCompare invisibly returns a list with components:

x.ecdf.list: a list with components Order.Statistics and Cumulative.Probabilities, giving coordinates of the points that have been plotted for the x values.

y.ecdf.list: a list with components Order.Statistics and Cumulative.Probabilities, giving coordinates of the points that have been plotted for the y values.

When y is not supplied, cdfCompare invisibly returns a list with components:

x.ecdf.list: a list with components Order.Statistics and Cumulative.Probabilities, giving coordinates of the points that have been plotted for the x values.

fitted.cdf.list: a list with components Quantiles and Cumulative.Probabilities, giving coordinates of the points that have been plotted for the fitted cdf.

### Note

An empirical cumulative distribution function (ecdf) plot is a graphical tool that can be used in conjunction with other graphical tools such as histograms, strip charts, and boxplots to assess the characteristics of a set of data. It is easy to determine quartiles and the minimum and maximum values from such a plot. Also, ecdf plots allow you to assess local density: a higher density of observations occurs where the slope is steep.
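The quantities mentioned above can also be computed directly from the step function returned by the base-R function ecdf(). The sketch below is only an illustration: the window endpoints lower and upper are arbitrary values chosen for this example, and type = 1 in quantile() is used because it inverts the empirical cdf.

```r
# Reading summary values off an ecdf with base R.
set.seed(250)
x <- rnorm(20, mean = 10, sd = 2)
n <- length(x)
Fn <- ecdf(x)   # Fn(q) is the proportion of observations <= q

# Minimum and maximum: where the ecdf first rises and where it reaches 1.
x.range <- range(x)

# Quartiles: type = 1 returns the inverse of the empirical cdf.
quartiles <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 1)

# Local density: the rise of the ecdf over a window, times n, is the
# number of observations in that window. A steep rise means many points.
lower <- 9    # arbitrary window endpoints for illustration
upper <- 11
n.in.window <- round(n * (Fn(upper) - Fn(lower)))
```

Comparing n.in.window across windows of equal width gives a numeric counterpart to judging steepness by eye.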

Chambers et al. (1983, pp.11-16) plot the observed order statistics on the y-axis vs. the ecdf on the x-axis and call this a quantile plot.

Empirical cumulative distribution function (ecdf) plots are often plotted with theoretical cdf plots (see cdfPlot and cdfCompare) to graphically assess whether a sample of observations comes from a particular distribution. The Kolmogorov-Smirnov goodness-of-fit test (see gofTest) is the statistical companion of this kind of comparison; it is based on the maximum vertical distance between the empirical cdf plot and the theoretical cdf plot. More often, however, quantile-quantile (Q-Q) plots are used instead of ecdf plots to graphically assess departures from an assumed distribution (see qqPlot).
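To make the "maximum vertical distance" concrete, the following sketch computes the Kolmogorov-Smirnov statistic by hand and compares it with the base-R function ks.test(). It assumes the hypothesized normal parameters (mean = 10, sd = 2) are specified in advance rather than estimated, and it uses ks.test() from the stats package rather than gofTest, whose options may differ.

```r
# Kolmogorov-Smirnov statistic as the largest vertical gap between
# the ecdf and a fully specified theoretical cdf.
set.seed(250)
x <- rnorm(20, mean = 10, sd = 2)
n <- length(x)
x.sorted <- sort(x)

# Theoretical cdf evaluated at each order statistic.
p.theo <- pnorm(x.sorted, mean = 10, sd = 2)

# The ecdf jumps from (i - 1)/n to i/n at the i-th order statistic,
# so the gap must be checked on both sides of each jump.
d.plus  <- max(seq_len(n) / n - p.theo)
d.minus <- max(p.theo - (seq_len(n) - 1) / n)
D <- max(d.plus, d.minus)

# Reference value from base R.
ks <- ks.test(x, "pnorm", mean = 10, sd = 2)
```

Here D should agree with ks$statistic. When the parameters are estimated from the same sample, the usual Kolmogorov-Smirnov p-value is no longer exact, which is one reason to treat this as a sketch only.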

### Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

### References

Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey. (1983). Graphical Methods for Data Analysis. Duxbury Press, Boston, MA, pp.11-16.

Cleveland, W.S. (1993). Visualizing Data. Hobart Press, Summit, New Jersey, 360pp.

D'Agostino, R.B. (1986a). Graphical Analysis. In: D'Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York, Chapter 2, pp.7-62.

### See Also

cdfPlot, ecdfPlot, qqPlot.

### Examples

  # Generate 20 observations from a normal (Gaussian) distribution
# with mean=10 and sd=2 and compare the empirical cdf with a
# theoretical normal cdf that is based on estimating the parameters.
# (Note: the call to set.seed simply allows you to reproduce this example.)

set.seed(250)
x <- rnorm(20, mean = 10, sd = 2)
dev.new()
cdfCompare(x)

#----------

# Generate 30 observations from an exponential distribution with parameter
# rate=0.1 (see the R help file for Exponential) and compare the empirical
# cdf with the empirical cdf of the normal observations generated in the
# previous example:

set.seed(432)
y <- rexp(30, rate = 0.1)
dev.new()
cdfCompare(x, y)

#==========

# Generate 20 observations from a Poisson distribution with parameter lambda=10
# (see the R help file for Poisson) and compare the empirical cdf with a
# theoretical Poisson cdf based on estimating the distribution parameters.
# (Note: the call to set.seed simply allows you to reproduce this example.)

set.seed(250)
x <- rpois(20, lambda = 10)
dev.new()
cdfCompare(x, distribution = "pois")

#==========

# Clean up
#---------
rm(x, y)
graphics.off()


[Package EnvStats version 2.8.1 Index]