R: Computing the highest density regions of a 1D density

get_hdr_1d {ggdensity}

R Documentation

Computing the highest density regions of a 1D density

Description

get_hdr_1d is used to estimate a 1-dimensional density and compute corresponding HDRs. The estimated density and HDRs are represented in a discrete form as a grid, defined by arguments range and n. get_hdr_1d is used internally by layer functions stat_hdr_rug() and stat_hdr_rug_fun().

Usage

get_hdr_1d(
  x = NULL,
  method = "kde",
  probs = c(0.99, 0.95, 0.8, 0.5),
  n = 512,
  range = NULL,
  hdr_membership = TRUE,
  fun,
  args = list()
)

Arguments

`x`	A vector of data
`method`	Either a character (`"kde"`, `"norm"`, `"histogram"`, `"freqpoly"`, or `"fun"`) or `method_*_1d()` function. See the "The `method` argument" section below for details.
`probs`	Probabilities to compute HDRs for.
`n`	Resolution of grid representing estimated density and HDRs.
`range`	Range of grid representing estimated density and HDRs.
`hdr_membership`	Should HDR membership of data points (`x`) be computed?
`fun`	Optional, a probability density function, must be vectorized in its first argument. See the "The `fun` argument" section below for details.
`args`	Optional, a list of arguments to be provided to `fun`.

Value

get_hdr_1d returns a list with elements df_est (data.frame), breaks (named numeric), and data (data.frame).

df_est: the estimated HDRs and density evaluated on the grid defined by range and n. The column of estimated HDRs (df_est$hdr) is a numeric vector with values from probs. The columns df_est$fhat and df_est$fhat_discretized correspond to the estimated density on the original scale and rescaled to sum to 1, respectively.
breaks: the heights of the estimated density (df_est$fhat) corresponding to the HDRs specified by probs. It always has an additional element, Inf, representing the cutoff for the 100% HDR.
data: the original data provided in the x argument. If hdr_membership is set to TRUE, this includes a column (data$hdr_membership) with the HDR corresponding to each data point.
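
To make the relationship between fhat, fhat_discretized, and breaks more concrete, the following is a minimal base R sketch of a grid-based HDR computation. It is an illustration under stated assumptions, not the code ggdensity runs internally: it uses stats::density() as a stand-in estimator, the "99%"-style names on breaks are a hypothetical naming choice, and labeling grid points outside every requested HDR with 1 is a choice made for this sketch.

```r
# Sketch only: a generic grid-based HDR computation, not ggdensity's internals.
set.seed(1)
x <- rnorm(1e3)
probs <- c(0.99, 0.95, 0.8, 0.5)

# Estimate the density on a grid of n = 512 points (stand-in estimator)
d <- stats::density(x, n = 512)
fhat <- d$y

# Rescale the grid heights so that they sum to 1
fhat_discretized <- fhat / sum(fhat)

# Accumulate mass from the highest grid point downwards
ord <- order(fhat, decreasing = TRUE)
cum_mass <- cumsum(fhat_discretized[ord])

# For each probability, take the density height at which the accumulated
# mass first reaches that probability
breaks <- vapply(
  probs,
  function(p) fhat[ord][which(cum_mass >= p)[1]],
  numeric(1)
)
names(breaks) <- paste0(probs * 100, "%")  # hypothetical naming scheme

# Label each grid point with the smallest HDR containing it
# (points outside every requested HDR are labeled 1 in this sketch)
hdr <- vapply(
  fhat,
  function(h) {
    inside <- probs[h >= breaks]
    if (length(inside) > 0) min(inside) else 1
  },
  numeric(1)
)

df_sketch <- data.frame(x = d$x, fhat = fhat,
                        fhat_discretized = fhat_discretized, hdr = hdr)
head(df_sketch)
```

Because larger probabilities correspond to lower density cutoffs, the heights in breaks increase as the values in probs decrease.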

The `method` argument

The density estimator used to estimate the HDRs is specified with the method argument. The simplest way to specify an estimator is to provide a character value to method; for example, method = "kde" specifies a kernel density estimator. However, this specification is limited to the default behavior of the estimator.

Instead, it is possible to provide a function call, for example: method = method_kde_1d(). This is slightly different from the function calls provided in get_hdr(); note the _1d suffix. In many cases, these functions accept parameters governing the density estimation procedure. Here, method_kde_1d() accepts several parameters related to the choice of kernel. For details, see ?method_kde_1d. Every implemented method of univariate density estimation has a corresponding method_*_1d() function, each with an associated help page.

Note: geom_hdr_rug() and other layer functions also have method arguments which behave in the same way. For more details on the use and implementation of the method_*_1d() functions, see vignette("method", "ggdensity").

The `fun` argument

If method is set to "fun", get_hdr_1d() expects a univariate probability density function to be specified with the fun argument. It is required that fun be a function of at least one argument (x). Beyond this first argument, fun can have arbitrarily many arguments; these can be set in get_hdr_1d() as a named list via the args parameter.

Note: get_hdr_1d() requires that fun be vectorized in x. For an example of an appropriate choice of fun, see the final example below.
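
As an illustration of the vectorization requirement, the sketch below starts from a density written for a single value of x and makes it vectorized in two ways. The triangular density and the names f_scalar, f_vec, f_ifelse, and half_width are hypothetical and introduced only for this example. The get_hdr_1d() call follows the calling pattern used for method = "fun" in the Examples section and is guarded so that it only runs if ggdensity is installed.

```r
# Hypothetical scalar pdf: triangular density on [-half_width, half_width].
# `if` only inspects a single value, so this is NOT vectorized in x.
f_scalar <- function(x, half_width = 1) {
  if (abs(x) < half_width) (1 - abs(x) / half_width) / half_width else 0
}

# Option 1: wrap the scalar function, vectorizing over x only
f_vec <- Vectorize(f_scalar, vectorize.args = "x")

# Option 2: rewrite the body with vectorized operations
f_ifelse <- function(x, half_width = 1) {
  ifelse(abs(x) < half_width, (1 - abs(x) / half_width) / half_width, 0)
}

# Both versions accept a whole vector of x values
grid <- seq(-2, 2, length.out = 9)
f_vec(grid)
f_ifelse(grid, half_width = 2)

# Additional parameters are passed through `args` as a named list
if (requireNamespace("ggdensity", quietly = TRUE)) {
  res <- ggdensity::get_hdr_1d(
    method = "fun", fun = f_vec,
    range = c(-2, 2), args = list(half_width = 1)
  )
  head(res$df_est)
}
```

Vectorize() is convenient for an existing scalar function, while rewriting with vectorized operations such as ifelse() typically evaluates faster on a large grid.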

Examples

x <- rnorm(1e3)

# Two ways to specify `method`
get_hdr_1d(x, method = "kde")
get_hdr_1d(x, method = method_kde_1d())

## Not run: 

# If parentheses are omitted, `get_hdr_1d()` errors
get_hdr_1d(x, method = method_kde_1d)

# If the `_1d` suffix is omitted, `get_hdr_1d()` errors
get_hdr_1d(x, method = method_kde())

## End(Not run)

# Adjust estimator parameters with arguments to `method_kde_1d()`
get_hdr_1d(x, method = method_kde_1d(kernel = "triangular"))

# Estimate different HDRs with `probs`
get_hdr_1d(x, method = method_kde_1d(), probs = c(.975, .6, .2))

# Compute "population" HDRs of specified univariate pdf with `method = "fun"`
f <- function(x, sd = 1) dnorm(x, sd = sd)
get_hdr_1d(method = "fun", fun = f, range = c(-5, 5))
get_hdr_1d(method = "fun", fun = f, range = c(-5, 5), args = list(sd = .5))

[Package ggdensity version 1.0.0 Index]