catto_label {cattonum}    R Documentation

Label encoding

Description

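Encodes categorical columns as integer labels: each level of an encoded column is replaced by an integer determined by the chosen ordering.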

Usage

catto_label(train, ..., test, ordering = "increasing", verbose = TRUE)

Arguments

train

The training data, in a data.frame or tibble.

...

The columns to be encoded. If none are specified, then all character and factor columns are encoded.

test

The test data, in a data.frame or tibble.

ordering

How should labels be assigned to levels? This argument can be passed in three ways. First, a length-one character vector with value "increasing", "decreasing", "observed", or "random" applies that ordering to every column being encoded. Second, a character vector of length greater than one specifies one of these four options for each column being encoded. Finally, a list specifies a user-defined ordering for each column being encoded, with one character vector of levels per column.

verbose

Should informative messages be printed? Defaults to TRUE. This argument is not yet used.

Value

A cattonum_df containing the encoded dataset if no test dataset was provided; otherwise, a cattonum_df2 containing the encoded training and test datasets.

Examples

# Encode every character and factor column of iris (here, Species)
catto_label(iris)

# Small data frame with two categorical columns; x1 contains a missing value
y <- 2^(0:5)
x1 <- c("a", "b", NA, "b", "a", "a")
x2 <- c("c", "c", "c", "d", "d", "c")
df_fact <- data.frame(y, x1, x2)

# User-defined ordering: one vector of levels per encoded column (x1, then x2)
catto_label(df_fact,
  ordering = list(c("b", "a"), c("c", "d"))
)
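
# A rough base-R sketch of the idea behind a user-defined ordering, assuming
# each level's label is its position in the supplied vector. This is an
# illustration only, not cattonum's implementation; user_order and
# labels_sketch are hypothetical names, and the NA result below is what
# match() does, which is not necessarily how catto_label() treats NA.
x1_sketch <- c("a", "b", NA, "b", "a", "a")
user_order <- c("b", "a")
labels_sketch <- match(x1_sketch, user_order)
labels_sketch # 2 1 NA 1 2 2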

# A different built-in ordering for each encoded column
catto_label(df_fact, ordering = c("increasing", "decreasing"))
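
# A hedged sketch of supplying a test set through the documented test
# argument. df_new is a hypothetical data frame invented for illustration;
# its levels all occur in df_fact, because the handling of levels unseen in
# the training data is not described on this page.
df_new <- data.frame(
  y = c(64, 128),
  x1 = c("b", "a"),
  x2 = c("d", "c")
)

# With test supplied, the result should be a cattonum_df2 (see Value)
catto_label(df_fact, test = df_new)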

[Package cattonum version 0.0.5 Index]