Automated Data Preparation

Documentation for package ‘dataPreparation’ version 1.1.1

Help Pages

adult Adult data set from the UCI repository
aggregate_by_key Automatic data_set aggregation by key
as.POSIXct_fast Faster date transformation
build_bins Compute bins
build_date_factor Date Factor
build_encoding Compute encoding
build_scales Compute scales
build_target_encoding Build target encoding
compute_probability_ratio Compute probability ratio
compute_weight_of_evidence Compute weight of evidence
data_preparation_news Show the NEWS file
date_format_unifier Unify date formats
description Describe data set
fast_discretization Discretization
fast_filter_variables Filtering useless variables
fast_handle_na Handle NA values
fast_is_equal Fast checks of equality
fast_round Fast round
fast_scale Scale
find_and_transform_dates Identify date columns
find_and_transform_numerics Identify numeric columns in a data set
generate_date_diffs Date difference
generate_factor_from_date Generate factor from dates
generate_from_character Recode character
generate_from_factor Recode factor
get_most_frequent_element Get most frequent element
identify_dates Identify date columns
messy_adult Adult with some ugly columns added
one_hot_encoder One hot encoder
prepare_set Preparation pipeline
remove_percentile_outlier Percentile outlier filtering
remove_rare_categorical Filter rare categories
remove_sd_outlier Standard deviation outlier filtering
same_shape Give same shape
set_as_numeric_matrix Numeric matrix preparation for Machine Learning
set_col_as_character Set columns as character
set_col_as_date Set columns as POSIXct
set_col_as_factor Set columns as factor
set_col_as_numeric Set columns as numeric
shape_set Final preparation before ML algorithm
target_encode Target encode
tiny_messy_adult First 500 rows of 'messy_adult'
un_factor Unfactor factor with too many values
which_are_bijection Identify bijections
which_are_constant Identify constant columns
which_are_included Identify columns that are included in others
which_are_in_double Identify duplicated columns