hypervolume_gaussian {hypervolume} | R Documentation |
Hypervolume construction via Gaussian kernel density estimation
Description
Constructs a hypervolume by building a Gaussian kernel density estimate on an adaptive grid of random points wrapping around the original data points. The bandwidth vector reflects the axis-aligned standard deviations of a hyperelliptical kernel.
Because Gaussian kernel density estimates do not decay to zero in a finite distance, the algorithm evaluates the kernel density in hyperelliptical regions out to a distance set by sd.count.
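As a rough illustration of what this truncation costs, the sketch below computes the share of a single Gaussian kernel's mass that falls inside such a region. It assumes the region is the hyperellipsoid whose semi-axes are sd.count times the bandwidth, in which case the squared scaled distance follows a chi-squared distribution with as many degrees of freedom as there are dimensions. The helper name kernel_mass_within is hypothetical and is not part of the hypervolume package.

```r
# Hypothetical helper (not from the hypervolume package).
# Assumption: the evaluation region for one kernel is the hyperellipsoid
# with semi-axes sd.count * kde.bandwidth. For an axis-aligned Gaussian,
# the squared scaled distance is chi-squared with n_dim degrees of freedom.
kernel_mass_within <- function(sd.count, n_dim) {
  pchisq(sd.count^2, df = n_dim)
}

# Mass captured with sd.count = 3 as dimensionality grows
dims <- 1:6
mass_sd3 <- kernel_mass_within(3, dims)
print(round(mass_sd3, 4))

# The same dimensions with a larger cutoff
mass_sd5 <- kernel_mass_within(5, dims)
print(round(mass_sd5, 4))
```

Under this assumption the captured mass falls as the number of dimensions rises for a fixed sd.count, which is one way to see why a larger value may be needed for high-dimensional data.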
After delineating the probability density, the function calls hypervolume_threshold to determine a boundary. The default behavior ensures that 95 percent of the estimated probability density is enclosed by the chosen boundary. Note, however, that the accuracy of the total probability density depends on having set a large value of sd.count.
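The idea of choosing a boundary that encloses a requested share of the probability density can be sketched in base R. This is not the implementation of hypervolume_threshold; it is a minimal illustration that assumes the density has been evaluated at uniformly sampled points, so that every point stands for the same volume. The helper name density_threshold and the toy density are hypothetical.

```r
# Hypothetical helper (not the package's hypervolume_threshold).
# Assumption: points are sampled uniformly, so each one represents an
# equal volume and the density values can be summed directly as mass.
density_threshold <- function(density, quantile = 0.95) {
  ord <- sort(density, decreasing = TRUE)
  cum_mass <- cumsum(ord) / sum(ord)
  # smallest density value still needed to reach the requested mass
  ord[which(cum_mass >= quantile)[1]]
}

set.seed(42)
# Uniform random points in a square, with a toy 2-D Gaussian density
pts <- matrix(runif(2 * 20000, min = -4, max = 4), ncol = 2)
dens <- dnorm(pts[, 1]) * dnorm(pts[, 2])

thr <- density_threshold(dens, quantile = 0.95)
inside <- dens >= thr

# Share of the estimated mass enclosed by the boundary
enclosed <- sum(dens[inside]) / sum(dens)
print(enclosed)
```

Points with density at or above the threshold form the enclosed region; everything below it is discarded.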
Most use cases should not require modification of any parameters except kde.bandwidth.
Optionally, the data can be weighted (e.g. for abundance weighting). By default, the function estimates the probability density of the observations via Gaussian kernel functions, assuming each data point contributes equally. By setting the weight parameter, the algorithm can instead take a weighted average of the kernel functions centered on each observation. Code for weighting data written by Yuanzhi Li (Yuanzhi.Li@usherbrooke.ca).
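A weighted average of axis-aligned Gaussian kernels can be written out in a few lines of base R. The sketch below is only an illustration of that idea, not the package's internal code; the helper name weighted_gaussian_kde, the toy observations, and the normalization of the weights to sum to one are all assumptions.

```r
# Hypothetical helper (not from the hypervolume package).
# Each observation contributes one axis-aligned Gaussian kernel whose
# per-axis standard deviations are given by `bandwidth`.
weighted_gaussian_kde <- function(query, data, bandwidth, weight = NULL) {
  query <- as.matrix(query)
  data <- as.matrix(data)
  if (is.null(weight)) weight <- rep(1, nrow(data))  # even weighting
  weight <- weight / sum(weight)                     # assumed normalization
  dens <- numeric(nrow(query))
  for (i in seq_len(nrow(data))) {
    k <- rep(1, nrow(query))
    for (j in seq_len(ncol(data))) {
      # product of 1-D normal densities = axis-aligned Gaussian kernel
      k <- k * dnorm(query[, j], mean = data[i, j], sd = bandwidth[j])
    }
    dens <- dens + weight[i] * k
  }
  dens
}

obs <- cbind(x = c(0, 1, 4), y = c(0, 2, 1))   # three toy observations
bw <- c(0.5, 0.5)                              # toy bandwidth vector
query <- rbind(c(0, 0), c(4, 1))               # evaluate at obs 1 and obs 3

d_equal <- weighted_gaussian_kde(query, obs, bw)
d_abund <- weighted_gaussian_kde(query, obs, bw, weight = c(1, 1, 8))
print(rbind(equal = d_equal, weighted = d_abund))
```

Giving the third observation eight times the weight of the others raises the estimated density near it and lowers it near the first observation.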
Usage
hypervolume_gaussian(data, name = NULL,
weight = NULL,
samples.per.point = ceiling((10^(3 + sqrt(ncol(data))))/nrow(data)),
kde.bandwidth = estimate_bandwidth(data),
sd.count = 3,
quantile.requested = 0.95,
quantile.requested.type = "probability",
chunk.size = 1000,
verbose = TRUE,
...)
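The default for samples.per.point in the usage above is a formula rather than a constant. The sketch below simply evaluates that formula on placeholder matrices to show how it scales; the helper name default_samples_per_point and the toy matrices are hypothetical.

```r
# Hypothetical wrapper around the default expression shown in Usage.
default_samples_per_point <- function(data) {
  ceiling((10^(3 + sqrt(ncol(data)))) / nrow(data))
}

# 100 observations in 4 dimensions: 10^(3 + 2) / 100
toy_4d <- matrix(0, nrow = 100, ncol = 4)
spp_4d <- default_samples_per_point(toy_4d)
print(spp_4d)

# Same number of observations in 9 dimensions: 10^(3 + 3) / 100
toy_9d <- matrix(0, nrow = 100, ncol = 9)
spp_9d <- default_samples_per_point(toy_9d)
print(spp_9d)

# Approximate total number of random points implied by the default
print(spp_4d * nrow(toy_4d))
```

Read this way, the default keeps the total number of random points near 10^(3 + sqrt(n)) for n dimensions, spreading them evenly over the observations.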
Arguments
data |
An m x n matrix or data frame, where m is the number of observations and n is the dimensionality. |
name |
A string to assign to the hypervolume for later output and plotting. Defaults to the name of the variable if NULL. |
weight |
An optional vector of weights for the kernel density estimation. Defaults to even weighting ( |
samples.per.point |
Number of random points to be evaluated per data point in |
kde.bandwidth |
A bandwidth vector obtained by running |
sd.count |
The number of standard deviations (converted to actual units by multiplying by |
quantile.requested |
The quantile value used to delineate the boundary of the kernel density estimate. See |
quantile.requested.type |
The type of quantile (volume or probability) used for the boundary delineation. See |
chunk.size |
Number of random points to process per internal step. Larger values may have better performance on machines with large amounts of free memory. Changing this parameter does not change the output of the function; only how this output is internally assembled. |
verbose |
Logical value; print diagnostic output if |
... |
Other arguments to pass to |
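The statement that chunk.size affects only how the output is assembled, not the output itself, can be illustrated with a generic chunked evaluation in base R. This is a sketch of the general pattern and not the package's internal loop; the helper name evaluate_in_chunks and the toy density function are hypothetical.

```r
# Hypothetical helper (not from the hypervolume package): evaluate a
# function over the rows of `points` in blocks of `chunk.size` rows.
evaluate_in_chunks <- function(points, f, chunk.size = 1000) {
  n <- nrow(points)
  starts <- seq(1, n, by = chunk.size)
  pieces <- lapply(starts, function(s) {
    idx <- s:min(s + chunk.size - 1, n)
    f(points[idx, , drop = FALSE])
  })
  unlist(pieces)
}

set.seed(1)
pts <- matrix(rnorm(5000), ncol = 2)                     # 2500 toy points
toy_density <- function(p) dnorm(p[, 1]) * dnorm(p[, 2])

small_chunks <- evaluate_in_chunks(pts, toy_density, chunk.size = 100)
large_chunks <- evaluate_in_chunks(pts, toy_density, chunk.size = 2500)

# The chunk size changes memory use per step, not the values returned
print(identical(small_chunks, large_chunks))
```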
Value
A Hypervolume-class object corresponding to the inferred hypervolume.
See Also
Examples
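# load the Palmer penguins data and drop rows with missing values
data(penguins,package='palmerpenguins')
penguins_no_na = as.data.frame(na.omit(penguins))
# keep three morphological measurements for the Adelie species
penguins_adelie = penguins_no_na[penguins_no_na$species=="Adelie",
                    c("bill_length_mm","bill_depth_mm","flipper_length_mm")]
# low samples per point for CRAN demo
hv = hypervolume_gaussian(penguins_adelie,name='Adelie',samples.per.point=100)
# print a summary of the resulting hypervolume
summary(hv)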