binarize.numeric {bsnsing}    R Documentation

Create Binary Features based on a Numeric Vector

Description

Discretize a continuous variable x by splitting its range at a sequence of cutpoints. The cutpoints are determined so as to effectively split the binary target y. This function is used internally by binarize.

Usage

binarize.numeric(
  x,
  name,
  y,
  target = stop("Must provide a target, 0 or 1"),
  segments = 10,
  bin.size = 5,
  node.size = 10
)

Arguments

x

a numeric vector.

name

a character string, the variable name of x.

y

a numeric or integer vector of the same length as x, consisting of two unique values: 0 and 1.

target

a scalar, valued 0 or 1, indicating the target level of y.

segments

a positive integer; any value below 3 is set to 3. It is the maximum number of segments into which the range of x is divided.

bin.size

a positive integer. It is the minimum number of observations required to fall into each bin.

node.size

a positive integer. If a perfect split would leave either child node with fewer than node.size observations, the perfect rule is not returned.

Value

a data frame with binary (0 and 1) entries, or a character string describing the rule that perfectly splits y. If a data frame is returned, the column names are indicative of the conditions used to form the corresponding columns.
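
Examples

The following is a minimal base R sketch of the two return shapes described above; it is not the bsnsing implementation. The helper name sketch_binarize, the equal-width cutpoints, the "<=" column naming and the format of the returned rule string are all assumptions made for illustration. In particular, the sketch places cutpoints without looking at y, whereas binarize.numeric chooses them to split y effectively.

```r
# Illustrative sketch only: NOT the bsnsing implementation.
# Assumptions: equal-width cutpoints, "<=" conditions, rule string = column name.
sketch_binarize <- function(x, name, y, target, segments = 10, node.size = 10) {
  segments <- max(3L, as.integer(segments))        # values below 3 are raised to 3
  grid <- seq(min(x), max(x), length.out = segments + 1)
  cuts <- grid[-c(1, length(grid))]                # keep interior cutpoints only

  # One 0/1 column per cutpoint, named after the condition it encodes
  m <- outer(x, cuts, "<=") * 1L
  colnames(m) <- paste0(name, "<=", format(cuts))

  is_target <- as.integer(y == target)
  for (j in seq_along(cuts)) {
    n_left <- sum(m[, j])
    n_right <- length(x) - n_left
    # A perfect rule is returned only if both child nodes are large enough
    if (all(m[, j] == is_target) && min(n_left, n_right) >= node.size) {
      return(colnames(m)[j])
    }
  }
  data.frame(m, check.names = FALSE)
}

x <- 0:9

# No cutpoint separates y perfectly, so a data frame of binary features is returned
y_mixed <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0)
bx <- sketch_binarize(x, "x", y_mixed, target = 1, segments = 2, node.size = 2)

# y is 1 exactly when x <= 3, so a rule string is returned instead
y_perfect <- as.integer(x <= 3)
rule <- sketch_binarize(x, "x", y_perfect, target = 1, segments = 2, node.size = 2)

# With the default node.size = 10 both child nodes are too small, so the
# perfect rule is suppressed and a data frame is returned
bx_small <- sketch_binarize(x, "x", y_perfect, target = 1, segments = 2)
```

Here segments = 2 is raised to 3, so the range 0 to 9 is cut at 3 and 6 and the data frame has two columns.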


[Package bsnsing version 1.0.1 Index]