WH_2d_Adaptive_Kohonen_maps {HistDAWass} | R Documentation |
Batch Kohonen self-organizing 2d maps using adaptive distances for histogram-valued data
Description
The function implements a batch Kohonen self-organizing 2D map algorithm with adaptive distances for histogram-valued data.
Usage
WH_2d_Adaptive_Kohonen_maps(
x,
net = list(xdim = 4, ydim = 3, topo = c("rectangular")),
kern.param = 2,
TMAX = -9999,
Tmin = -9999,
niter = 30,
repetitions,
simplify = FALSE,
qua = 10,
standardize = FALSE,
schema = 6,
init.weights = "EQUAL",
weight.sys = "PROD",
theta = 2,
Wfix = FALSE,
verbose = FALSE,
atleast = 2
)
Arguments
x |
A MatH object (a matrix of distributionH). |
net |
a list describing the topology of the net |
kern.param |
(default =2) the kernel parameter for the RBF kernel used in the algorithm |
TMAX |
a parameter useful for the iterations (default=2) |
Tmin |
a parameter useful for the iterations (default=0.2) |
niter |
maximum number of iterations (default=30) |
repetitions |
number of repetitions of the algorithm (default=5), because each run may end in a local optimum |
simplify |
a logical parameter for speeding up computations (default=FALSE). If TRUE, data are recoded to make computations faster |
qua |
if |
standardize |
A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable, using the Wasserstein-based standard deviation. Use it when each variable should have a standard deviation equal to one. |
schema |
a number from 1 to 4 |
init.weights |
a string specifying how to initialize the weights: 'EQUAL' (default), all weights are the same,
weight.sys |
a string. Weights may add to one ('SUM') or their product is equal to 1 ('PROD', default). |
theta |
a number. A parameter if |
Wfix |
a logical parameter (default=FALSE). If TRUE the algorithm does not use adaptive distances. |
verbose |
a logical parameter (default=FALSE). If TRUE, details of the computation are shown during execution. |
atleast |
integer (default 2). Guards against degeneration of the map into a very small number of Voronoi sets: 2 means that at least 2 neurons must attract data instances into their Voronoi sets. |
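The two constraints named for weight.sys can be illustrated with a minimal base-R sketch. The vector w_raw and the rescaling rules below are assumptions chosen for illustration; they show what each constraint means, not necessarily how the package computes its weights.

```r
# Hypothetical positive raw weights for three variables (illustrative values)
w_raw <- c(0.5, 2, 4)

# 'SUM' constraint: rescale so that the weights add up to one
w_sum <- w_raw / sum(w_raw)

# 'PROD' constraint: divide by the geometric mean so that the product is one
w_prod <- w_raw / prod(w_raw)^(1 / length(w_raw))

sum(w_sum)    # sums to 1 (up to floating-point error)
prod(w_prod)  # product is 1 (up to floating-point error)
```

Under both rescalings the ratios between weights are preserved; only the overall scale changes.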
Details
An extension of the Batch Self-Organised Map (BSOM) is proposed here for histogram data. This kind of data has been defined in the context of symbolic data analysis. The BSOM cost function is based on a distance function: the L2 Wasserstein distance. This distance has been widely used in several analysis techniques (clustering, regression) when input data are expressed as distributions (empirical, by histograms, or theoretical, by probability distributions). The peculiarity of this distance is that it is a Euclidean distance between quantile functions, so all the properties proved for L2 distances still hold. An adaptive version of BSOM is also introduced, which uses an automatic system of weights in the cost function to take into account the different effects of the variables on the Self-Organised Map grid.
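Because the L2 Wasserstein distance is a Euclidean distance between quantile functions, its square can be approximated by averaging squared quantile differences over a grid on (0, 1). The following is a minimal base-R sketch on raw samples; the helper name wass2_sq and the grid size are assumptions made for illustration, and the package itself works on MatH objects rather than on raw samples.

```r
# Approximate the squared L2 Wasserstein distance between two samples by
# averaging squared differences of their empirical quantile functions.
# wass2_sq is a hypothetical helper, not part of HistDAWass.
wass2_sq <- function(a, b, n_grid = 1000) {
  t <- (seq_len(n_grid) - 0.5) / n_grid        # midpoints of a grid on (0, 1)
  qa <- quantile(a, probs = t, names = FALSE)  # empirical quantiles of a
  qb <- quantile(b, probs = t, names = FALSE)  # empirical quantiles of b
  mean((qa - qb)^2)
}

set.seed(1)
a <- rnorm(500)
b <- a + 2      # pure location shift of 2

wass2_sq(a, a)  # 0: identical samples
wass2_sq(a, b)  # 4 up to floating-point error: the squared shift
```

For a pure shift every quantile moves by the same amount, so the squared distance reduces to the squared shift, which gives a quick sanity check of the approximation.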
Value
a list with the results of the Batch Kohonen map
Slots
solution
A list. Returns the best solution among the repetitions, i.e. the one having the minimum sum of squares criterion.
solution$MAP
The map topology.
solution$IDX
A vector. The clusters to which the objects are assigned.
solution$cardinality
A vector. The cardinality of each final cluster.
solution$proto
A MatH object with the description of centers.
solution$Crit
A number. The criterion (sum of squared deviations from the centers) value at the end of the run.
solution$Weights.comp
the final weights assigned to each component of the histogram variables
solution$Weight.sys
a string: the type of weighting system ('SUM' or 'PRODUCT')
quality
A number. The percentage of the sum of squared deviations explained by the model (the higher the better).
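The relation between an assignment vector such as solution$IDX, the cluster cardinalities, and the atleast check can be sketched in base R. The vector idx below is made up for illustration, and the sketch assumes that clusters are labelled with neuron indices from 1 to xdim * ydim; it does not call the package.

```r
# Hypothetical assignment of 10 objects to the neurons of a 2 x 3 map
idx <- c(1, 1, 2, 2, 2, 4, 4, 6, 6, 6)
n_neurons <- 2 * 3

# Number of objects attracted by each neuron (empty neurons get 0)
cardinality <- tabulate(idx, nbins = n_neurons)
cardinality            # 2 3 0 2 0 3

# Degeneration check in the spirit of 'atleast': count non-empty neurons
atleast <- 2
non_empty <- sum(cardinality > 0)
non_empty >= atleast   # TRUE: 4 neurons attract at least one object
```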
References
Irpino A, Verde R, De Carvalho FAT (2012). Batch self organizing maps for interval and histogram data. In: Proceedings of COMPSTAT 2012. p. 143-154, ISI/IASC, ISBN: 978-90-73592-32-2
Examples
## Not run:
results <- WH_2d_Adaptive_Kohonen_maps(
x = BLOOD,
net = list(xdim = 2, ydim = 3, topo = c("rectangular")),
repetitions = 2, simplify = TRUE,
qua = 10, standardize = TRUE
)
## End(Not run)