R: Plot a transformed empirical cdf

ecdfHT {ecdfHT}

R Documentation

Plot a transformed empirical cdf

Description

Produces a basic plot showing a transformed empirical cdf for heavy tailed data. It uses a log-log transform on the tails, which shows power law decay as linear.

Usage

ecdfHT(x, scale.q = c(0.25, 0.5, 0.75), show.axes.labels = TRUE,
  show.plot = TRUE, type = "p", ...)

Arguments

`x`	A vector of data
`scale.q`	A vector of 3 probabilities; specifies the quantiles of the data to use for the left tail, mid region, and right tail
`show.axes.labels`	Boolean value: indicates whether default labels are plotted or not. (Use function `ecdfHT.axes` to add custom labels)
`show.plot`	Boolean value: show plot or only do calculations
`type`	Type of plot, passed to plot. Use type='p' for points, type='l' for lines
`...`	Optional graphical parameters, e.g. col='red'

Details

Most of the work is done by ecdfHT.draw and the associated helper functions.

Assuming no repeats in x, ecdf = standard ecdf - 1/(2n), which matches the plotting positions used by type=5 in the R function quantile. So instead of taking values 1/n, 2/n, 3/n, ..., k/n, ..., 1, it takes values 1/(2n), 3/(2n), ..., (2k-1)/(2n), ..., (2n-1)/(2n). This avoids 0 at the lower endpoint and 1 at the upper endpoint, which would cause problems when the tails are extended with a power law. (If an x value is repeated m times, then the corresponding jump in the ecdf at that point is m/n instead of 1/n.)
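The shifted ecdf values listed above, 1/(2n), 3/(2n), ..., (2n-1)/(2n), can be reproduced in base R. The following is a minimal sketch and not the package's own code; the helper name shifted_ecdf is hypothetical, and the handling of repeated values simply follows the m/n jump rule stated above.

```r
# Hypothetical helper (not part of ecdfHT): shifted empirical cdf.
shifted_ecdf <- function(x) {
  n <- length(x)
  xsort <- sort(unique(x))
  # Number of observations at each distinct value; a value repeated
  # m times contributes a jump of m/n.
  counts <- tabulate(match(x, xsort), nbins = length(xsort))
  standard <- cumsum(counts) / n          # usual ecdf: k/n when no repeats
  list(xsort = xsort, ecdf = standard - 1 / (2 * n))
}

res <- shifted_ecdf(c(3, 1, 2, 5, 4))
res$ecdf   # approximately 0.1 0.3 0.5 0.7 0.9, i.e. (2k-1)/(2n) with n = 5
```

Because the largest value is (2n-1)/(2n) rather than 1, quantities such as log(1 - ecdf) stay finite at every data point.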

The default value scale.q=c(.25,.5,.75) splits the data into quartiles; other choices of quantiles split the data into 4 groups of different sizes: the lowest group is the left tail, i.e. all values less than the quantile corresponding to scale.q[1]; the second group lies between the left tail and the center (the quantile corresponding to scale.q[2]); the third group lies between the center and the quantile corresponding to scale.q[3]; the last group is the upper tail. For two-sided data, it makes sense to use something like (p,0.5,1-p) for scale.q, where p is chosen to determine where the tail regions begin.

For one-sided data, it makes sense to use scale.q=c(0,0,p). In this case, the first two groups are empty and the effect is to divide the data into two groups: a moderate/lower range and a right tail. See the example below with nonnegative data.
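The grouping induced by scale.q can be illustrated with base R. This is a sketch under assumptions: it uses quantile with its default type and findInterval to assign groups, which may differ in detail from what the package does internally, and the helper name group_counts is hypothetical.

```r
# Hypothetical helper (not part of ecdfHT): count data in the 4 regions
# 1 = left tail, 2 = left of center, 3 = right of center, 4 = right tail.
group_counts <- function(x, scale.q) {
  # Assumption: default quantile type; the package may use another rule.
  scale.x <- quantile(x, probs = scale.q, names = FALSE)
  group <- findInterval(x, scale.x) + 1
  tabulate(group, nbins = 4)
}

set.seed(1)
x <- rcauchy(1000)

# Two-sided data with the default quartile split: four nonempty groups.
two_sided <- group_counts(x, c(0.25, 0.5, 0.75))

# One-sided data with scale.q = c(0, 0, p): the first two groups are empty,
# leaving a moderate/lower range and a right tail.
one_sided <- group_counts(abs(x), c(0, 0, 0.75))

two_sided
one_sided
```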

The transformation h(x) acts on these different regions: it is linear on the middle two regions and logarithmic on the tails. The transformation g(p) acts on the corresponding values of the ecdf described above. The basic plot shows (h(x[i]),g(ecdf[i])): the first component is a monotonic transform of the x values, the second component is a monotonic transform of the ecdf. See the accompanying vignette for exact definitions: go to the package index and click on User guides, package vignettes and other documentation.
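To see why a log-log transform on the tail shows power law decay as linear, consider a tail with 1 - F(x) = c * x^(-alpha): then log(1 - F(x)) = log(c) - alpha * log(x), a straight line in log(x) with slope -alpha. The sketch below checks this numerically on simulated Pareto data. It does not use the package's h and g (their exact definitions are in the vignette); alpha, the sample size, and the 0.75 tail cutoff are illustrative choices.

```r
set.seed(42)
alpha <- 1.5
n <- 5000

# Pareto(alpha) sample on [1, Inf) via the inverse cdf: 1 - F(x) = x^(-alpha)
x <- runif(n)^(-1 / alpha)
xs <- sort(x)

# Survival function from the shifted ecdf (2k-1)/(2n); it is never exactly 0,
# so the logarithm is finite at every data point.
surv <- 1 - (seq_len(n) - 0.5) / n

# Fit a line to the upper tail on the log-log scale.
in_tail <- xs > quantile(xs, 0.75)
fit <- lm(log(surv[in_tail]) ~ log(xs[in_tail]))
slope <- unname(coef(fit)[2])
slope   # expected to be roughly -alpha
```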

Value

An object of class 'ecdfHT.transform' which gives the information necessary to draw the plot and later add other curves and labels. This list is returned invisibly and contains the following fields:

scale.q: vector of length 3, copied from the input argument
scale.x: vector of length 3, the quantiles from the data corresponding to scale.q
xsort: vector of the sorted, unique data values
ecdf: nonstandard empirical cdf, see details
xx: transformed x values: xx[i]=h(xsort[i])
yy: transformed ecdf values: yy[i]=g(ecdf[i])
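Since the result is returned invisibly, it has to be assigned to be inspected. The following sketch uses only the arguments and fields documented on this page; it is guarded so that it does nothing when the ecdfHT package is not installed, and the variable name tr is arbitrary.

```r
if (requireNamespace("ecdfHT", quietly = TRUE)) {
  set.seed(1)
  x <- rcauchy(1000)

  # Compute the transformation without drawing anything.
  tr <- ecdfHT::ecdfHT(x, show.plot = FALSE)

  class(tr)      # class of the returned object
  tr$scale.q     # copied from the input argument
  tr$scale.x     # data quantiles corresponding to scale.q

  # Redraw the transformed points by hand from the returned fields.
  plot(tr$xx, tr$yy, type = "l", xlab = "h(x)", ylab = "g(ecdf)")
}
```

A plot drawn this way carries no custom axis labels; the function ecdfHT.axes mentioned under Arguments is the documented route for adding them.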

Examples

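# Two-sided heavy tailed data with the default quartile split
x <- rcauchy( 1000 )
ecdfHT( x )
title("basic ecdfHT plot")

# One-sided (nonnegative) data: a moderate/lower range and a right tail
xabs <- abs(x)
ecdfHT( xabs, scale.q=c(0,0,.75) )
title("one sided data")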

[Package ecdfHT version 0.1.1 Index]