fast_scale {dataPreparation} | R Documentation |
Perform efficient scaling on a data set.
fast_scale(data_set, scales = NULL, way = "scale", verbose = TRUE)
data_set |
Matrix, data.frame or data.table |
scales |
Result of function build_scales, as computed in the example below (default to NULL) |
way |
Should scaling or unscaling be performed? (Character, either "scale" or "unscale", default to "scale") |
verbose |
Should the algorithm talk? (Logical, default to TRUE) |
Scaling numeric values is useful for some machine learning algorithms, such as
logistic regression or neural networks.
Unscaling numeric values can be very useful for most post-model analysis; to do so, set way to "unscale".
This implementation of scaling will be faster than scale
for large data sets.
data_set
with columns scaled (or unscaled) by reference. Scaled means that each
column mean will be 0 and each column standard deviation will be 1.
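The arithmetic behind scaling and unscaling can be sketched in base R. This is only an illustration of the definition above, not the package's implementation: the toy data, the column names, and the `toy_scales` list layout are assumptions made for the example, and unlike fast_scale it works on copies rather than by reference.

```r
# Toy data; the column names are hypothetical.
df <- data.frame(age = c(25, 38, 52, 41), hours = c(40, 35, 60, 20))

# Store per-column mean and sd so the transformation can be reversed later.
# (This list layout is an assumption for the sketch, not the package's format.)
toy_scales <- lapply(df, function(x) list(mean = mean(x), sd = sd(x)))

# Scale: subtract the mean, then divide by the standard deviation.
scaled <- df
for (col in names(toy_scales)) {
  scaled[[col]] <- (df[[col]] - toy_scales[[col]]$mean) / toy_scales[[col]]$sd
}

# Unscale: multiply by the standard deviation, then add the mean back.
unscaled <- scaled
for (col in names(toy_scales)) {
  unscaled[[col]] <- scaled[[col]] * toy_scales[[col]]$sd + toy_scales[[col]]$mean
}

print(sapply(scaled, mean))     # each close to 0, up to floating-point error
print(sapply(scaled, sd))       # each close to 1
print(all.equal(unscaled, df))  # compares the round trip with the original
```

Keeping the means and standard deviations around is what makes unscaling possible: without them, the scaled columns cannot be mapped back to their original units.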
# Load data
data(adult)
# Compute scales
scales <- build_scales(adult, cols = "auto", verbose = TRUE)
# Scale data set
adult <- fast_scale(adult, scales = scales, verbose = TRUE)
# Check the result
print(mean(adult$age)) # Almost 0
print(sd(adult$age)) # 1
# To unscale it:
adult <- fast_scale(adult, scales = scales, way = "unscale", verbose = TRUE)
# Check the result
print(mean(adult$age)) # About 38.6
print(sd(adult$age)) # About 13.6