LVboxplot {lvplot} | R Documentation |
Side-by-side LV boxplots with base graphics
Description
An extension of the standard boxplot that draws k letter-value statistics. Conventional boxplots (Tukey 1977) are useful displays for conveying rough information about the central 50% of the data and the extent of the data.
Usage
LVboxplot(x, ...)
## S3 method for class 'formula'
LVboxplot(
formula,
alpha = 0.95,
k = NULL,
perc = NULL,
horizontal = TRUE,
xlab = NULL,
ylab = NULL,
col = "grey30",
bg = "grey90",
width = 0.9,
width.method = "linear",
median.col = "grey10",
...
)
## S3 method for class 'numeric'
LVboxplot(
x,
alpha = 0.95,
k = NULL,
perc = NULL,
horizontal = TRUE,
xlab = NULL,
ylab = NULL,
col = "grey30",
bg = "grey90",
width = 0.9,
width.method = "linear",
median.col = "grey10",
...
)
Arguments
x |
numeric vector of data |
... |
passed onto |
formula |
a plotting formula of the form |
alpha |
if supplied, depth k is calculated such that (1- |
k |
number of letter value statistics used |
perc |
if supplied, depth k is adjusted such that |
horizontal |
display horizontally (TRUE) or vertically (FALSE) |
xlab |
x axis label |
ylab |
y axis label |
col |
vector of colours to use |
bg |
background colour |
width |
maximum height/width of box |
width.method |
one of 'linear', 'height' or 'area'. Methods 'height' and 'area' ensure that these dimensions are proportional to the number of observations within each box. |
median.col |
colour of the line for the median |
Details
For moderate-sized data sets (n < 1000), detailed estimates of tail
behavior beyond the quartiles may not be trustworthy, so the information
provided by boxplots is appropriately somewhat vague beyond the quartiles,
and the expected number of “outliers” and “far-out” values for a
Gaussian sample of size n is often less than 10 (Hoaglin, Iglewicz,
and Tukey 1986). Large data sets (n ≈ 10,000-100,000) afford
more precise estimates of quantiles in the tails beyond the quartiles and
also can be expected to present a large number of “outliers” (about
0.4 + 0.007 n).
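The 0.4 + 0.007 n approximation for Gaussian samples can be checked informally with a short simulation. The sketch below uses only base R; `expected_outliers` and `count_flagged` are hypothetical helper names introduced here for illustration and are not part of lvplot. It assumes the usual 1.5 x IQR fences that `boxplot.stats` applies by default.

```r
# Approximate expected number of flagged points for a Gaussian sample of
# size n, using the 0.4 + 0.007 n rule quoted in the text.
expected_outliers <- function(n) 0.4 + 0.007 * n

# Hypothetical helper: count the points that a standard boxplot would flag,
# i.e. those beyond the default 1.5 * IQR fences.
count_flagged <- function(x) length(boxplot.stats(x)$out)

# Compare the approximation with one simulated Gaussian sample per size.
set.seed(1)
sizes <- 10^(2:5)
data.frame(
  n = sizes,
  approx = expected_outliers(sizes),
  simulated = vapply(sizes, function(n) count_flagged(rnorm(n)), integer(1))
)
```

The simulated counts vary from run to run, so they should only be expected to track the approximation roughly, with the gap between small and large n being the point of interest.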
The letter-value box plot addresses both these shortcomings: it conveys
more detailed information in the tails using letter values, only out to the
depths where the letter values are reliable estimates of their
corresponding quantiles (corresponding to tail areas of roughly 2^(-i));
“outliers” are defined as a function of the most extreme
letter value shown. All aspects shown on the letter-value boxplot are
actual observations, thus remaining faithful to the principles that
governed Tukey's original boxplot.
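To make the idea of letter values and their tail areas concrete, here is a minimal base R sketch of the standard Tukey-style depth recursion (median depth (1 + n) / 2, then each further depth computed from the floor of the previous one). `letter_values` is a hypothetical helper written for this illustration; it is an assumption that this matches the textbook definition and it is not claimed to be how lvplot computes its statistics internally.

```r
# Hypothetical helper, not part of lvplot: the first k letter values of x.
letter_values <- function(x, k) {
  x <- sort(x)
  n <- length(x)
  depth <- numeric(k)
  depth[1] <- (1 + n) / 2                      # depth of the median
  for (i in seq_len(k)[-1]) {
    depth[i] <- (1 + floor(depth[i - 1])) / 2  # fourths, eighths, ...
  }
  # A half-integer depth averages the two neighbouring order statistics.
  at_depth <- function(d) mean(x[c(floor(d), ceiling(d))])
  data.frame(
    i = seq_len(k),
    depth = depth,
    lower = vapply(depth, at_depth, numeric(1)),
    upper = vapply(n + 1 - depth, at_depth, numeric(1)),
    tail_area = 2^(-seq_len(k))                # rough tail area per side
  )
}

# Median, fourths and eighths of the integers 1 to 15.
letter_values(1:15, k = 3)
```

Each row pairs a lower and an upper letter value with the approximate tail area 2^(-i) beyond it, which is why deeper rows rest on fewer observations and become less reliable in small samples.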
Examples
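library(lvplot)

# Compare standard and letter-value boxplots as the sample size grows.
# par() returns the previous values of the settings it changes.
oldpar <- par(mfrow = c(4, 2), mar = c(3, 3, 3, 3))
for (i in 1:4) {
  x <- rexp(10^(i + 1))  # n = 100, 1000, 10000, 100000
  boxplot(x, col = "grey", horizontal = TRUE)
  title(paste("Exponential, n =", length(x)))
  LVboxplot(x, col = "grey", xlab = "")
}
par(oldpar)  # restore the previous layout and margins

# Formula interface: one vertical LV boxplot per carrier in the ontime data.
with(ontime, LVboxplot(sqrt(TaxiIn + TaxiOut) ~ UniqueCarrier, horizontal = FALSE))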