preprocess {citrus} | R Documentation |
Preprocess Function
Description
Transforms a transactional table into an id-aggregated table (one row per id), with configurable aggregation methods for numeric and categorical columns.
Usage
preprocess(
df,
samplesize = NA,
numeric_operation_list = c("mean"),
categories = NULL,
target = NA,
target_agg = "mean",
verbose = TRUE
)
Arguments
df |
data.frame, the data to preprocess |
samplesize |
numeric, the fraction of ids used to create a sub-sample of the input df |
numeric_operation_list |
list, a list of the aggregation functions to apply to numeric columns |
categories |
list, a list of the categorical columns to aggregate |
target |
character, the column to use as a response variable for supervised learning |
target_agg |
character, the aggregation function to use to aggregate the target column |
verbose |
logical, whether information about the preprocessing should be printed |
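The samplesize argument is described as a fraction of ids, which suggests sampling whole ids rather than individual rows. The following base R sketch illustrates that idea only. It is not the package's implementation: the helper sample_ids, the example table, and its column names are hypothetical.

```r
# Hypothetical transactional table: several rows per id.
transactions <- data.frame(
  id     = c("a", "a", "b", "c", "c", "d"),
  amount = c(10, 20, 5, 7, 3, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of citrus): keep a fraction of the
# unique ids, then retain every row that belongs to a kept id.
sample_ids <- function(df, fraction, id_col = "id") {
  ids  <- unique(df[[id_col]])
  n    <- max(1, floor(length(ids) * fraction))
  # Index with sample.int to avoid sample()'s special case for a
  # single numeric value.
  keep <- ids[sample.int(length(ids), n)]
  df[df[[id_col]] %in% keep, , drop = FALSE]
}

set.seed(1)
sub <- sample_ids(transactions, fraction = 0.5)
```

Sampling by id keeps each retained id's full transaction history intact, so later per-id aggregates are computed on complete data.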
Value
An id-attributes data frame with a single row per unique id, e.g. customer attributes if the id represents customer IDs.
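To make the shape of the result concrete, here is a minimal base R sketch of turning a transactional table into one row per id. It does not call preprocess and does not claim to reproduce its output. The table, the column names, and the choice of "most frequent value" for the categorical column are assumptions made for illustration.

```r
# Hypothetical transactional table: several rows per id.
transactions <- data.frame(
  id      = c("a", "a", "b", "c", "c"),
  amount  = c(10, 20, 5, 7, 3),
  channel = c("web", "web", "store", "store", "store"),
  stringsAsFactors = FALSE
)

# Numeric column: one aggregate (here the mean) per id.
amount_mean <- aggregate(amount ~ id, data = transactions, FUN = mean)
names(amount_mean)[2] <- "amount_mean"

# Categorical column: most frequent value per id.
# This aggregation rule is an assumption, not taken from the package.
most_frequent <- function(x) names(which.max(table(x)))
channel_mode <- aggregate(channel ~ id, data = transactions,
                          FUN = most_frequent)
names(channel_mode)[2] <- "channel_mode"

# Combine into a single row per unique id.
id_table <- merge(amount_mean, channel_mode, by = "id")
```

Here id_table has one row for each of the ids "a", "b" and "c", with one column per aggregated attribute.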
[Package citrus version 1.0.2 Index]