discretizeVariableToRanges {mmb}    R Documentation

Discretize a continuous random variable to ranges/buckets.

Description

Discretizes a continuous random variable into buckets (ranges). Each range is delimited by an exclusive minimum value and an inclusive maximum value.
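The interval convention can be illustrated with a minimal base R sketch. The helper name inRange and the sample bounds are hypothetical and not part of mmb; the sketch only shows what "exclusive minimum, inclusive maximum" means for membership in a range.

```r
# Hypothetical helper (not part of mmb): TRUE if x lies in the
# half-open interval (exclMin, inclMax].
inRange <- function(x, exclMin, inclMax) {
  x > exclMin & x <= inclMax
}

# Illustrative range (4, 5]
inRange(4, 4, 5)    # FALSE: the lower bound is excluded
inRange(4.5, 4, 5)  # TRUE
inRange(5, 4, 5)    # TRUE: the upper bound is included
```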

Usage

discretizeVariableToRanges(
  data,
  openEndRanges = TRUE,
  numRanges = NA,
  exclMinVal = NULL,
  inclMaxVal = NULL
)

Arguments

data

a vector with numeric data

openEndRanges

logical, default TRUE. If TRUE, the minimum value of the first range is set to .Machine$double.xmin and the maximum value of the last range is set to .Machine$double.xmax, so that all values are covered.

numRanges

integer, default NA. If NA, the number of ranges (buckets) depends on the amount of data given: at least two buckets are used, and at most ceiling(log2(length(data))).

exclMinVal

numeric default NULL. Used to delimit the lower bound of the given data. If not given, then no value is excluded, as the exclusive lower bound becomes the minimum of the given data minus an epsilon of 1e-15.

inclMaxVal

numeric, default NULL. Used to delimit the upper bound of the given data. If not given, the inclusive upper bound is the maximum of the given data.
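The documented defaults can be reproduced with base R. This is a sketch of the values described in the argument list, not the package's implementation; the variable names are illustrative only.

```r
data <- iris$Sepal.Length

# Upper limit on the number of buckets when numRanges is not given,
# as stated in the numRanges description.
maxBuckets <- ceiling(log2(length(data)))  # 150 values -> 8

# Default bounds when exclMinVal and inclMaxVal are not given.
exclMin <- min(data) - 1e-15  # just below the smallest value
inclMax <- max(data)

# Every value lies in (exclMin, inclMax], so none is excluded.
all(data > exclMin & data <= inclMax)  # TRUE
```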

Value

A list of vectors, where each vector has two values: the first is the exclusive minimum value of the range, and the second is the inclusive maximum value of the range. The list is as long as the number of buckets requested.
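To show how a result of this shape might be consumed, the following sketch builds a list by hand in the documented format (two-element vectors holding the exclusive minimum and the inclusive maximum) and looks up the bucket for a value. The bounds are made up for illustration and are not output of the function; findBucket is a hypothetical helper, not part of mmb.

```r
# Hand-made ranges in the documented shape: c(exclMin, inclMax).
ranges <- list(c(4, 5), c(5, 6), c(6, 8))

# Hypothetical helper: index of the first range containing x, or NA.
findBucket <- function(x, ranges) {
  for (i in seq_along(ranges)) {
    r <- ranges[[i]]
    if (x > r[1] && x <= r[2]) return(i)
  }
  NA_integer_
}

findBucket(5, ranges)    # 1, since 5 is the inclusive maximum of (4, 5]
findBucket(5.1, ranges)  # 2
findBucket(9, ranges)    # NA, outside all ranges
```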

Author(s)

Sebastian Hönel sebastian.honel@lnu.se

Examples

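# Discretize sepal length; numRanges keeps its default, so the
# number of buckets is derived from the amount of data.
buckets <- mmb::discretizeVariableToRanges(
  data = iris$Sepal.Length, openEndRanges = TRUE)

# Number of buckets created
length(buckets)
# A single bucket: c(exclusive minimum, inclusive maximum)
buckets[[5]]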

[Package mmb version 0.13.3 Index]