RobustNormalization {DatabionicSwarm}R Documentation

RobustNormalization

Description

RobustNormalization as described in [Milligan/Cooper, 1988].

Usage

RobustNormalization(Data,Centered=FALSE,Capped=FALSE,

na.rm=TRUE,WithBackTransformation=FALSE,

pmin=0.01,pmax=0.99) 

Arguments

Data

[1:n,1:d] data matrix of n cases and d features

Centered

centered data around zero by median if TRUE

Capped

TRUE: outliers are capped above 1 or below -1 and set to 1 or -1.

na.rm

If TRUE, infinite vlaues are disregarded

WithBackTransformation

If in the case for forecasting with neural networks a backtransformation is required, this parameter can be set to 'TRUE'.

pmin

defines outliers on the lower end of scale

pmax

defines outliers on the higher end of scale

Details

Normalizes features either between -1 to 1 (Centered=TRUE) or 0-1 (Centered=TRUE) without changing the distribution of a feature itself. For a more precise description please read [Thrun, 2018, p.17].

"[The] scaling of the inputs determines the effective scaling of the weights in the last layer of a MLP with BP neural netowrk, it can have a large effect on the quality of the final solution. At the outset it is besto to standardize all inputs to have mean zero and standard deviation 1 [(or at least the range under 1)]. This ensures all inputs are treated equally in the regularization prozess, and allows to choose a meaningful range for the random starting weights."[Friedman et al., 2012]

Value

if WithBackTransformation=FALSE: TransformedData[1:n,1:d] i.e., normalized data matrix of n cases and d features

if WithBackTransformation=TRUE: List with

TransformedData

[1:n,1:d] normalized data matrix of n cases and d features

MinX

[1:d] numerical vector used for manual back-transformation of each feature

MaxX

[1:d] numerical vector used for manual back-transformation of each feature

Denom

[1:d] numerical vector used for manual back-transformation of each feature

Center

[1:d] numerical vector used for manual back-transformation of each feature

Author(s)

Michael Thrun

References

[Milligan/Cooper, 1988] Milligan, G. W., & Cooper, M. C.: A study of standardization of variables in cluster analysis, Journal of Classification, Vol. 5(2), pp. 181-204. 1988.

[Friedman et al., 2012] Friedman, J., Hastie, T., & Tibshirani, R.: The Elements of Statistical Learning, (Second ed. Vol. 1), Springer series in statistics New York, NY, USA:, ISBN, 2012.

[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi:10.1007/978-3-658-20540-9, 2018.

See Also

RobustNorm_BackTrafo

Examples

Scaled = RobustNormalization(rnorm(1000, 2, 100), Capped = TRUE)
hist(Scaled)

m = cbind(c(1, 2, 3), c(2, 6, 4))
List = RobustNormalization(m, FALSE, FALSE, FALSE, TRUE)
TransformedData = List$TransformedData

mback = RobustNorm_BackTrafo(TransformedData, List$MinX, List$Denom, List$Center)

sum(m - mback)

[Package DatabionicSwarm version 2.0.0 Index]