interval_normalization {clusterSim} | R Documentation |
Types of normalization formulas for interval-valued symbolic variables
Description
Types of normalization formulas for interval-valued symbolic variables
Usage
interval_normalization(x,dataType="simple",type="n0",y=NULL,...)
Arguments
x |
matrix dataset or symbolic table object |
dataType |
Type of symbolic data table passed to function, 'sda' - full symbolicDA format object; 'simple' - three dimensional array with lower and upper bound of intervals in third dimension; 'separate_tables' - lower bounds of intervals in 'rows' - lower and upper bound of intervals in neighbouring rows; 'columns' - lower and upper bound of intervals in neighbouring columns |
type |
type of normalization: |
n0 - without normalization
n1 - standardization ((x-mean)/sd)
n2 - positional standardization ((x-median)/mad)
n3 - unitization ((x-mean)/range)
n3a - positional unitization ((x-median)/range)
n4 - unitization with zero minimum ((x-min)/range)
n5 - normalization in range <-1,1> ((x-mean)/max(abs(x-mean)))
n5a - positional normalization in range <-1,1> ((x-median)/max(abs(x-median)))
n6 - quotient transformation (x/sd)
n6a - positional quotient transformation (x/mad)
n7 - quotient transformation (x/range)
n8 - quotient transformation (x/max)
n9 - quotient transformation (x/mean)
n9a - positional quotient transformation (x/median)
n10 - quotient transformation (x/sum)
n11 - quotient transformation (x/sqrt(SSQ))
n12 - normalization ((x-mean)/sqrt(sum((x-mean)^2)))
n12a - positional normalization ((x-median)/sqrt(sum((x-median)^2)))
n13 - normalization with zero being the central point ((x-midrange)/(range/2))
y |
matrix or dataset with upper bounds of intervals if argument |
... |
arguments passed to |
Value
Normalized data
Author(s)
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, University of Economics, Wroclaw, Poland
References
Jajuga, K., Walesiak, M. (2000), Standardisation of data set under different measurement scales, In: R. Decker, W. Gaul (Eds.), Classification and information processing at the turn of the millennium, Springer-Verlag, Berlin, Heidelberg, 105-112. Available at: doi:10.1007/978-3-642-57280-7_11.
Milligan, G.W., Cooper, M.C. (1988), A study of standardization of variables in cluster analysis, "Journal of Classification", vol. 5, 181-204. Available at: doi:10.1007/BF01897163.
Walesiak, M. (2014), Przeglad formul normalizacji wartosci zmiennych oraz ich wlasnosci w statystycznej analizie wielowymiarowej [Data normalization in multivariate data analysis. An overview and properties], "Przeglad Statystyczny" ("Statistical Review"), vol. 61, no. 4, 363-372.
Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, „STATISTICS IN TRANSITION new series”, September, Vol. 18, No. 3, pp. 521-540. Available at: https://stat.gov.pl/en/sit-en/issues-and-articles-sit/.
See Also
Examples
library(clusterSim)
data(data_symbolic_interval_polish_voivodships)
n<-interval_normalization(data_symbolic_interval_polish_voivodships,dataType="simple",type="n2")
plotInterval(n$simple)