
univariate_grid {hstats}

R Documentation

Univariate Grid

Description

Creates an evaluation grid for any numeric or non-numeric vector z.

For discrete z (non-numeric, or numeric with at most grid_size unique values), this is simply sort(unique(z)).
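The discrete rule above can be sketched in base R. This is an illustration of the stated rule, not the package source; `is_discrete()` is a hypothetical helper introduced only for this example.

```r
# Sketch of the discrete rule described above (not the package source).
# `is_discrete()` is a hypothetical helper, not part of hstats.
is_discrete <- function(z, grid_size = 49L) {
  !is.numeric(z) || length(unique(z)) <= grid_size
}

z_fac <- factor(c("b", "a", "c", "a"))
z_num <- c(3, 1, 2, 3, 1)

is_discrete(z_fac)                 # non-numeric, so discrete
is_discrete(z_num)                 # only 3 unique values, so discrete

# For discrete z, the grid is just the sorted unique values
grid_fac <- sort(unique(z_fac))
grid_num <- sort(unique(z_num))
```

Because the grid depends only on the set of unique values, reordering z does not change the result.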

Otherwise, if strategy = "uniform" (default), the evaluation points form a regular grid over the trimmed range of z. By trimmed range we mean the range of z after removing values below the trim[1] quantile or above the trim[2] quantile. Set trim = 0:1 for no trimming.
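As a rough base-R sketch of the idea, the trimmed range can be computed with stats::quantile() and then filled with equally spaced points. Using seq() is an assumption made for illustration; the package may place the points differently, which would be consistent with grid_size being only approximate.

```r
# Sketch only: regular grid over the trimmed range of z.
# seq() is an assumption; hstats may choose the points differently.
set.seed(1)
z <- c(rnorm(1000), 50)            # 1000 standard normal values plus one outlier
trim <- c(0.01, 0.99)
grid_size <- 5L

# Trimmed range: 1% and 99% quantiles (inverse ECDF, type = 1)
r <- stats::quantile(z, probs = trim, type = 1, names = FALSE)
grid_uniform <- seq(r[1], r[2], length.out = grid_size)

# With trim = 0:1, the "trimmed" range is the full range of z
r_full <- stats::quantile(z, probs = 0:1, type = 1, names = FALSE)
```

Here the outlier 50 lies above the 99% quantile, so it does not stretch the grid, whereas `r_full` still reaches it.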

If strategy = "quantile", the evaluation points are quantiles over a regular grid of probabilities from trim[1] to trim[2].
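The quantile strategy can be sketched in the same way: build a regular grid of probabilities from trim[1] to trim[2] and evaluate the quantiles there. This is a minimal illustration under the description above, not the package source.

```r
# Sketch only: quantiles over a regular probability grid.
z <- c(1, 2, 2, 3, 5, 8, 13, 21, 34, 55, 89)
trim <- c(0.01, 0.99)
grid_size <- 5L

p <- seq(trim[1], trim[2], length.out = grid_size)   # regular grid of probabilities
grid_quantile <- stats::quantile(z, probs = p, type = 1, names = FALSE)
grid_quantile
# 1 2 8 34 89
```

The points are observed values of z and sit closer together where the data are dense. With heavily tied data the quantiles can repeat, so a sketch like this would typically wrap the result in unique().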

Quantiles are calculated via the inverse of the ECDF, i.e., via stats::quantile(..., type = 1).
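To illustrate what type = 1 means, the following small example compares it with the default of stats::quantile() (type = 7), which interpolates linearly between observations. The toy vector is chosen only for illustration.

```r
z <- c(10, 20, 30, 40)

# Inverse ECDF: smallest observed value whose ECDF is at least p
q1 <- stats::quantile(z, probs = 0.4, type = 1, names = FALSE)
q1_manual <- min(z[stats::ecdf(z)(z) >= 0.4])

# Default (type = 7) interpolates between neighbouring observations
q7 <- stats::quantile(z, probs = 0.4, names = FALSE)

q1   # 20, an observed value
q7   # about 22, not an observed value
```

With type = 1, every quantile is an actual element of z, so grid points never fall between observed values.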

Usage

univariate_grid(
  z,
  grid_size = 49L,
  trim = c(0.01, 0.99),
  strategy = c("uniform", "quantile"),
  na.rm = TRUE
)

Arguments

`z`	A vector or factor.
`grid_size`	Approximate grid size. A numeric `z` with at most `grid_size` unique values is treated as discrete.
`trim`	The default `c(0.01, 0.99)` means that values outside the 1% and 99% quantiles of a non-discrete numeric `z` are removed before the grid values are calculated. Set to `0:1` for no trimming.
`strategy`	How to find grid values of a non-discrete numeric `z`: either "uniform" or "quantile", see Description.
`na.rm`	Should missing values be dropped from the grid? Default is `TRUE`.

Value

A vector or factor of evaluation points.

Examples

univariate_grid(iris$Species)
univariate_grid(rev(iris$Species))                       # Same

x <- iris$Sepal.Width
univariate_grid(x, grid_size = 5)                        # Uniform binning
univariate_grid(x, grid_size = 5, strategy = "quantile")  # Quantile