compBagplot {mrfDepth} | R Documentation |
Computations for drawing a bagplot
Description
Computes all elements of the bagplot, a generalisation
of the univariate boxplot to bivariate data. The bagplot can be
computed based on halfspace depth, projection depth, skewness-adjusted projection
depth and directional projection depth. To draw the actual plot, the function
bagplot needs to be called on the result of compBagplot.
Usage
compBagplot(x, type = "hdepth", sizesubset = 500,
extra.directions = FALSE, options = NULL)
Arguments
x |
An |
type |
Determines the depth function used to construct the bagplot:
|
sizesubset |
When computing the bagplot based on halfspace depth,
the size of the subset used to perform the main
computations. See Details for more information. |
extra.directions |
Logical indicating whether additional directions should
be considered in the computation of the fence for the
bagplot based on projection depth or skewness-adjusted
projection depth. If set to |
options |
A list of options to pass to the
|
Details
The bagplot was proposed by Rousseeuw et al. (1999) as a generalisation of the boxplot to bivariate data. It is constructed based on halfspace depth. In the original format the deepest point is marked by a "+" and lies inside the bag, which is defined as the depth region containing the 50% of observations with largest depth. The fence is obtained by inflating the bag (relative to the deepest point) by a factor of three. The loop is the convex hull of the observations of x inside the fence. Observations outside the fence are flagged as outliers and plotted with a red star. This function only computes the components of the bagplot. The bagplot itself can be drawn using the bagplot function.
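The inflation step described above can be illustrated in base R. This is a minimal sketch, not the code used by compBagplot: the helper inflate_bag, the square "bag" and the center are all made up for illustration, and the only thing taken from the text is that each bag vertex is moved away from the deepest point by a factor of three.

```r
# Hypothetical helper (not part of mrfDepth): scale the bag vertices away
# from the center by a given factor, i.e. center + factor * (vertex - center).
inflate_bag <- function(bag, center, factor = 3) {
  shifted <- sweep(bag, 2, center, "-")        # vertices relative to the center
  sweep(shifted * factor, 2, center, "+")      # inflate and shift back
}

# Toy bag: a square with vertices given row-wise, centered at (1, 1).
bag    <- rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2))
center <- c(1, 1)

fence <- inflate_bag(bag, center)
print(fence)
```

For a real bagplot the bag is a depth region rather than a square, but the same vertex-wise scaling applies.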
The bagplot may also be defined using other depth functions. When using projection depth, skewness-adjusted projection depth or directional projection depth, the bagplot is built as follows. The center corresponds to the observation with largest depth. The bag is constructed as the convex hull of the fifty percent of points with largest depth. Outliers are identified as points with a depth smaller than a cutoff value; see projdepth, sprojdepth and dprojdepth for the precise definition.
The loop is computed as the convex hull of the non-outlying points. The fence is approximated by the convex hull of those points that lie on rays from the center through the vertices of the bag and have a depth equal to the cutoff depth. For a better approximation the user can set the input parameter extra.directions to TRUE, so that an additional 250 equally spaced directions on the circle are considered.
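The construction of the center, bag and loop from depth values can be sketched in base R as follows. This is only an illustration under loud assumptions: the depth used here is a simple stand-in (a decreasing function of the distance to the coordinatewise median), not projection depth, and the cutoff is an arbitrary made-up value rather than the one defined in projdepth, sprojdepth or dprojdepth. The last lines show one way to generate 250 equally spaced directions on the circle.

```r
set.seed(1)
x <- cbind(rnorm(200), rnorm(200))
x <- rbind(x, c(6, 6))                       # one planted far-away point

# Stand-in depth (assumption): larger for points closer to the median.
med   <- apply(x, 2, median)
depth <- 1 / (1 + sqrt(rowSums(sweep(x, 2, med)^2)))
cutoff <- 1 / (1 + 4)                        # hypothetical cutoff value

# Center: the observation with largest depth.
center <- x[which.max(depth), ]

# Bag: convex hull of the half of the points with largest depth.
deepest_half <- order(depth, decreasing = TRUE)[seq_len(ceiling(nrow(x) / 2))]
bag_pts <- x[deepest_half, , drop = FALSE]
bag     <- bag_pts[grDevices::chull(bag_pts), , drop = FALSE]

# Outliers: depth below the cutoff. Loop: convex hull of the remaining points.
is_regular <- depth >= cutoff
regular    <- x[is_regular, , drop = FALSE]
loop       <- regular[grDevices::chull(regular), , drop = FALSE]

# 250 equally spaced unit directions on the circle.
theta <- 2 * pi * (0:249) / 250
dirs  <- cbind(cos(theta), sin(theta))
```

The fence itself is not computed here, since it requires evaluating the actual depth function along each ray from the center.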
The computation of the bagplot based on halfspace depth can be
time-consuming. It is therefore possible to limit the bulk of the computations
to a random subset of the data. The halfspace median and the bag are then
computed from this random subset. The number of points in this
subset is controlled by the optional argument sizesubset.
The routine first checks whether the data lie on a line. If so, it issues a warning and returns the dimension of the subspace (which is 1) together with a normal vector to that line.
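One way to picture such a check is through the singular values of the centered data. The sketch below is an assumption about how collinearity could be detected, not the method implemented in the package; the function on_line and its tolerance are hypothetical, and the list element names merely mirror the dimension and hyperplane components described under Value.

```r
# Hypothetical helper (not part of mrfDepth): detect whether bivariate data
# lie on a line and, if so, return a unit vector orthogonal to that line.
on_line <- function(x, tol = 1e-8) {
  xc <- scale(x, center = TRUE, scale = FALSE)   # center the columns
  s  <- svd(xc)
  # A (near) zero second singular value means no spread orthogonal to the
  # main direction, so the points are collinear.
  degenerate <- s$d[2] <= tol * max(s$d[1], 1)
  list(dimension  = if (degenerate) 1L else 2L,
       hyperplane = if (degenerate) s$v[, 2] else NULL)
}

# Points on the line y = 2x + 1, whose direction is (1, 2).
x_line <- cbind(1:10, 2 * (1:10) + 1)
res <- on_line(x_line)
res$dimension
res$hyperplane   # orthogonal to (1, 2), up to sign
```

The sign of the returned normal vector is arbitrary, and the sketch does not handle the case in which all points coincide.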
Value
A list with components:
center |
Center of the data. |
chull |
When |
bag |
The coordinates of the vertices of the bag. |
fence |
The coordinates of the vertices of the fence. |
datatype |
An |
flag |
A vector of length |
depth |
The depth of the observations of |
dimension |
If the data are lying in a lower dimensional subspace, the dimension of this subspace. |
hyperplane |
If the data are lying in a lower dimensional subspace, a direction orthogonal to this subspace. |
type |
Same as the input parameter |
Author(s)
P. Segaert based on Fortran code by P.J. Rousseeuw, I. Ruts and A. Struyf.
References
Rousseeuw P.J., Ruts I., Tukey J.W. (1999). The bagplot: A bivariate boxplot. The American Statistician, 53, 382–387.
Hubert M., Van der Veeken S. (2008). Outlier detection for skewed data. Journal of Chemometrics, 22, 235–246.
Hubert M., Rousseeuw P.J., Segaert P. (2015). Rejoinder to 'Multivariate functional outlier detection'. Statistical Methods & Applications, 24, 269–277.
See Also
bagplot, hdepth, projdepth, sprojdepth, dprojdepth.
Examples
data(bloodfat)
# Result <- compBagplot(bloodfat)
# bagplot(Result)
# The sizesubset argument may be used to control the
# computation time when computing the bagplot based on
# halfspace depth. However results may be unreliable when
# choosing a small subset for the main computations.
# system.time(Result1 <- compBagplot(bloodfat))
# system.time(Result2 <- compBagplot(bloodfat, sizesubset = 100))
# bagplot(Result1)
# bagplot(Result2)
# When using any of the projection depth functions,
# a list of options may be passed down to the corresponding
# outlyingness routines.
options <- list(type = "Rotation",
                ndir = 50,
                stand = "unimcd",
                h = floor(nrow(bloodfat)*3/4))
Result <- compBagplot(bloodfat,
                      type = "projdepth", options = options)
bagplot(Result)
# The fence is computed using the depthContour function.
# To get a smoother fence, one may opt to consider extra
# directions.
options <- list(ndir = 500,
                seed = 36)
Result <- compBagplot(bloodfat,
                      type = "dprojdepth", options = options)
bagplot(Result, plot.fence = TRUE)
# Same settings, now with extra directions for the fence.
options <- list(ndir = 500,
                seed = 36)
Result <- compBagplot(bloodfat,
                      type = "dprojdepth", options = options,
                      extra.directions = TRUE)
bagplot(Result, plot.fence = TRUE)