bv.boxplot {asbio} R Documentation

## Bivariate boxplots

### Description

Creates diagnostic bivariate quelplot ellipses (bivariate boxplots) using the method of Goldberg and Iglewicz (1992). The output can be used to check assumptions of bivariate normality and to identify multivariate outliers. The default robust = TRUE option relies on a biweight correlation estimator function written by Everitt (2006). Quelplots are potentially asymmetric, although the method currently employed here uses a single "fence" definition and creates symmetric ellipses.

### Usage

bv.boxplot(X, Y, robust = TRUE, D = 7, xlab = "X", ylab = "Y", pch = 21,
           pch.out = NULL, bg = "gray", bg.out = NULL, hinge.col = 1, fence.col = 1,
           hinge.lty = 2, fence.lty = 3, xlim = NULL, ylim = NULL, names = 1:length(X),
           ID.out = FALSE, cex.ID.out = 0.7, uni.CI = FALSE, uni.conf = 0.95,
           uni.CI.col = 1, uni.CI.lty = 1, uni.CI.lwd = 2, show.points = TRUE, ...)


### Arguments

| Argument | Description |
|---|---|
| X | First of two quantitative variables making up the bivariate distribution. |
| Y | Second of two quantitative variables making up the bivariate distribution. |
| robust | Logical. Robust estimators, i.e., robust = TRUE, are recommended. |
| D | The default D = 7 lets the fence be equal to a 99 percent confidence interval for an individual observation. |
| xlab | Caption for X axis. |
| ylab | Caption for Y axis. |
| pch | Plotting character(s) for scatterplot. |
| pch.out | Plotting character for outliers. |
| hinge.col | Hinge color. |
| fence.col | Fence color. |
| hinge.lty | Hinge line type. |
| fence.lty | Fence line type. |
| xlim | A two-element vector defining the X-limits of the plot. |
| ylim | A two-element vector defining the Y-limits of the plot. |
| bg | Background color for points in scatterplot; defaults to black if pch is not in the range 21:26. |
| bg.out | Background color for outlying points in scatterplot; defaults to black if pch is not in the range 21:26. |
| names | An optional vector of names for X, Y coordinates. |
| ID.out | Logical. Whether or not outlying points should be given labels (from the argument names) in the plot. |
| cex.ID.out | Character expansion for outlying ID labels. |
| uni.CI | Logical. If TRUE, univariate confidence intervals for the true median at confidence level uni.conf are shown. |
| uni.conf | Univariate confidence level, only used if uni.CI = TRUE. |
| uni.CI.col | Univariate confidence bound line color, only used if uni.CI = TRUE. |
| uni.CI.lty | Univariate confidence bound line type, only used if uni.CI = TRUE. |
| uni.CI.lwd | Univariate confidence bound line width, only used if uni.CI = TRUE. |
| show.points | Logical. Whether points should be shown in the graph. |
| ... | Additional arguments passed to points. |

### Details

Two ellipses are drawn. The inner ellipse is the "hinge", which contains 50 percent of the data. The outer ellipse is the "fence". Observations outside the "fence" are potentially troublesome outliers. The function bivariate from Everitt (2006) is used to calculate robust biweight measures of correlation, scale, and location if robust = TRUE (the default). The quelplot model has the following form:

E_i = \sqrt{\frac{X^2_{si} + Y^2_{si} - 2R^*X_{si}Y_{si}}{1-R^{*2}}},

where X_{si} = (X_i - T^*_X)/S^*_X and Y_{si} = (Y_i - T^*_Y)/S^*_Y are standardized values for X_i and Y_i, respectively, T^*_X and T^*_Y are location estimators for X and Y, S^*_X and S^*_Y are scale estimators for X and Y, and R^* is a correlation estimator for X and Y. We have:

E_m = median\{E_i:i=1,2,...,n\},

and

E_{max} = max\{E_i: E_i^2 < DE^2_m\},

where D is a constant that regulates the distance between the "fence" and the "hinge".
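The quantities above can be computed directly. The following is a minimal sketch, not the internal code of bv.boxplot: it assumes classical estimators (mean, standard deviation, and Pearson correlation) as stand-ins for T^*, S^*, and R^*, so its results need not match the robust biweight estimates used when robust = TRUE. All object names are illustrative.

```r
# Simulated data, as in the first example of this help page
set.seed(1)
X <- rnorm(100, 17, 3)
Y <- rnorm(100, 13, 2)

# Assumption: classical estimators stand in for T*, S*, and R*
Tx <- mean(X); Ty <- mean(Y)   # location
Sx <- sd(X);   Sy <- sd(Y)     # scale
R  <- cor(X, Y)                # correlation

# Standardized values X_si and Y_si
Xs <- (X - Tx) / Sx
Ys <- (Y - Ty) / Sy

# E_i for every observation
E <- sqrt((Xs^2 + Ys^2 - 2 * R * Xs * Ys) / (1 - R^2))

# E_m and E_max with the default D = 7
D    <- 7
Em   <- median(E)
Emax <- max(E[E^2 < D * Em^2])
```

Here Em is the radius used for the "hinge" and Emax the radius used for the "fence".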

To draw the "hinge" we have:

R_1 = E_m\sqrt{\frac{1 + R^*}{2}},

R_2 = E_m\sqrt{\frac{1 - R^*}{2}}.

To draw the "fence" we have:

R_1 = E_{max}\sqrt{\frac{1 + R^*}{2}},

R_2 = E_{max}\sqrt{\frac{1 - R^*}{2}}.

For θ from 0 to 360 degrees, let:

Θ_1 = R_1 \cos(θ),

Θ_2 = R_2 \sin(θ).

The Cartesian coordinates of the "hinge" and "fence" are:

X = T^*_X + (Θ_1 + Θ_2)S^*_X,

Y = T^*_Y + (Θ_1 - Θ_2)S^*_Y.
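The ellipse construction can be sketched as a small function. The helper quel.ellipse below is hypothetical (it is not part of asbio), and the location, scale, correlation, and radius values passed to it are arbitrary illustrative numbers rather than estimates from real data. The coordinates are offset from the location estimates, i.e., X = T^*_X + (Θ_1 + Θ_2)S^*_X and Y = T^*_Y + (Θ_1 - Θ_2)S^*_Y.

```r
# Hypothetical helper: ellipse coordinates for a given radius E (E_m or E_max)
quel.ellipse <- function(E, Tx, Ty, Sx, Sy, R, n = 361) {
  theta  <- seq(0, 2 * pi, length.out = n)  # 0 to 360 degrees, in radians
  R1     <- E * sqrt((1 + R) / 2)
  R2     <- E * sqrt((1 - R) / 2)
  Theta1 <- R1 * cos(theta)
  Theta2 <- R2 * sin(theta)
  data.frame(x = Tx + (Theta1 + Theta2) * Sx,
             y = Ty + (Theta1 - Theta2) * Sy)
}

# Illustrative values only: hinge radius 1.2, fence radius 3.0
hinge <- quel.ellipse(E = 1.2, Tx = 17, Ty = 13, Sx = 3, Sy = 2, R = 0.4)
fence <- quel.ellipse(E = 3.0, Tx = 17, Ty = 13, Sx = 3, Sy = 2, R = 0.4)

# Draw both ellipses using the default line types of bv.boxplot
plot(fence$x, fence$y, type = "l", lty = 3, xlab = "X", ylab = "Y")
lines(hinge$x, hinge$y, lty = 2)
```

Substituting any point of a returned ellipse into the formula for E_i gives back the radius E that was supplied, which is a quick way to check the construction.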

Quelplots are potentially asymmetric, although the current (and only) method used here defines a single value for E_{max} and hence creates symmetric ellipses. Under this implementation at least one point will define E_{max} and lie on the "fence".
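The relationship between E_{max}, the "fence", and the outliers can be illustrated as follows. This sketch again assumes classical estimators in place of the robust ones, so the observations it flags may differ from those reported by bv.boxplot with robust = TRUE. The data are the X and Y vectors from the Examples section.

```r
X <- c(-0.24, 2.53, -0.3, -0.26, 0.021, 0.81, -0.85, -0.95, 1.0, 0.89, 0.59,
       0.61, -1.79, 0.60, -0.05, 0.39, -0.94, -0.89, -0.37, 0.18)
Y <- c(-0.83, -1.44, 0.33, -0.41, -1.0, 0.53, -0.72, 0.33, 0.27, -0.99, 0.15,
       -1.17, -0.61, 0.37, -0.96, 0.21, -1.29, 1.40, -0.21, 0.39)

# Assumption: mean, sd, and Pearson correlation replace the robust estimators
Xs <- (X - mean(X)) / sd(X)
Ys <- (Y - mean(Y)) / sd(Y)
R  <- cor(X, Y)

E    <- sqrt((Xs^2 + Ys^2 - 2 * R * Xs * Ys) / (1 - R^2))
Em   <- median(E)
Emax <- max(E[E^2 < 7 * Em^2])   # default D = 7

on.fence <- which(E == Emax)     # observation(s) that define E_max
outliers <- which(E > Emax)      # observations beyond the fence
```

Because E_{max} is taken from the observed E_i, on.fence always contains at least one index, and every index in outliers satisfies E_i^2 >= D E_m^2.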

### Value

A diagnostic plot is returned. Invisible objects from the function include location, scale and correlation estimates for X and Y, estimates for E_m and E_{max}, and a list of outliers (that exceed E_{max}).

### Author(s)

Ken Aho. The function relies on an Everitt (2006) function for robust M-estimation.

### References

Everitt, B. (2006) An R and S-PLUS Companion to Multivariate Analysis. Springer.

Goldberg, K. M., and B. Iglewicz (1992) Bivariate extensions of the boxplot. Technometrics 34: 307-320.

### See Also

boxplot

### Examples

Y1<-rnorm(100, 17, 3)
Y2<-rnorm(100, 13, 2)
bv.boxplot(Y1, Y2)

X <- c(-0.24, 2.53, -0.3, -0.26, 0.021, 0.81, -0.85, -0.95, 1.0, 0.89, 0.59,
0.61, -1.79, 0.60, -0.05, 0.39, -0.94, -0.89, -0.37, 0.18)
Y <- c(-0.83, -1.44, 0.33, -0.41, -1.0, 0.53, -0.72, 0.33,  0.27, -0.99, 0.15,
-1.17, -0.61, 0.37, -0.96, 0.21, -1.29, 1.40, -0.21, 0.39)
b <- bv.boxplot(X, Y, ID.out = TRUE, bg.out = "red")
b


[Package asbio version 1.7 Index]