R: Build target encoding

build_target_encoding {dataPreparation}

R Documentation

Build target encoding

Description

Target encoding replaces each categorical value with an aggregate of the target variable (for example its mean) computed over the rows that share that value. build_target_encoding computes these aggregations.
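
To make the idea concrete, here is a minimal sketch of mean target encoding written directly with data.table. It is not the package's implementation, and the output column name `grades_mean` is a hypothetical choice made for illustration.

```r
library(data.table)

# Toy data: one categorical column and one numeric target
data_set <- data.table(
  student = c("Marie", "Marie", "Pierre", "Louis", "Louis"),
  grades  = c(1, 1, 2, 3, 4)
)

# Aggregate the target per category (here: the mean grade per student).
# "grades_mean" is a hypothetical name, not one produced by the package.
encoding <- data_set[, .(grades_mean = mean(grades)), by = student]

# Replace each categorical value with its aggregate by joining it back
encoded <- merge(data_set, encoding, by = "student", sort = FALSE)
```

The aggregation table (`encoding`) is kept separate from the joined result so that the same table can be reused on new data, which is the reason for building the encoding as a distinct step.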

Usage

build_target_encoding(
  data_set,
  cols_to_encode,
  target_col,
  functions = "mean",
  verbose = TRUE
)

Arguments

`data_set`	Matrix, data.frame or data.table
`cols_to_encode`	Columns to group by when aggregating (list)
`target_col`	Column to aggregate (character)
`functions`	Aggregation functions (list or character, defaults to "mean"). `compute_probability_ratio` and `compute_weight_of_evidence` are commonly used choices
`verbose`	Should the function print progress messages? (Logical, defaults to TRUE)

Value

A list of data.tables, one for each column in cols_to_encode. Each data.table contains one row per unique value of that column and length(functions) + 1 columns.
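
The shape described above can be sketched by hand with data.table. This is an illustration under assumptions, not the package's code: the `class` column, the `grades_<function>` column names, and the naming of the list elements are all hypothetical.

```r
library(data.table)

data_set <- data.table(
  student = c("Marie", "Marie", "Pierre", "Louis", "Louis"),
  class   = c("A", "A", "A", "B", "B"),  # hypothetical second column
  grades  = c(1, 1, 2, 3, 4)
)
cols_to_encode <- c("student", "class")
functions <- c("mean", "sum")

# Build one data.table per column to encode. Each has one row per
# unique value and one aggregate column per function.
result <- lapply(cols_to_encode, function(col) {
  data_set[
    ,
    setNames(
      lapply(functions, function(f) match.fun(f)(grades)),
      paste("grades", functions, sep = "_")  # hypothetical column names
    ),
    by = col
  ]
})
names(result) <- cols_to_encode  # naming by column is an assumption
```

With two functions, each element has 2 + 1 columns: the grouping column plus one aggregate per function.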

Examples

# Build a data set
require(data.table)
data_set <- data.table(student = c("Marie", "Marie", "Pierre", "Louis", "Louis"),
                       grades = c(1, 1, 2, 3, 4))

# Perform target_encoding construction
build_target_encoding(data_set, cols_to_encode = "student", target_col = "grades",
                      functions = c("mean", "sum"))
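
The construction above only computes the aggregation tables. A possible follow-up is to apply them to the data. The sketch below assumes that the package's `target_encode` function accepts the data set and the result of build_target_encoding through a `target_encoding` argument; this page does not document that signature, so check it against the `target_encode` help page before relying on it.

```r
library(data.table)
library(dataPreparation)

data_set <- data.table(
  student = c("Marie", "Marie", "Pierre", "Louis", "Louis"),
  grades  = c(1, 1, 2, 3, 4)
)

# Step 1: compute the aggregations (documented on this page)
target_encoding <- build_target_encoding(
  data_set,
  cols_to_encode = "student",
  target_col = "grades",
  functions = c("mean", "sum"),
  verbose = FALSE
)

# Step 2: apply them to the data set.
# Assumption: target_encode(data_set, target_encoding = ...) is the
# signature; it is not confirmed by this page.
encoded <- target_encode(data_set, target_encoding = target_encoding)
```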

[Package dataPreparation version 1.1.1 Index]