WH_adaptive_fcmeans {HistDAWass}    R Documentation

Fuzzy c-means with adaptive distances for histogram-valued data

Description

Fuzzy c-means of a dataset of histogram-valued data using different adaptive distances based on the L2 Wasserstein metric.

Usage

WH_adaptive_fcmeans(
  x,
  k = 5,
  schema,
  m = 1.6,
  rep,
  simplify = FALSE,
  qua = 10,
  standardize = FALSE,
  init.weights = "EQUAL",
  weight.sys = "PROD",
  theta = 2,
  verbose = FALSE
)

Arguments

x

A MatH object (a matrix of distributionH).

k

An integer, the number of groups.

schema

An integer. 1 = one weight per variable; 2 = two weights per variable (one for each component: the mean and the variability component); 3 = one weight per variable and per cluster; 4 = two weights per variable and per cluster.

m

A number greater than 0, the fuzziness coefficient (default m=1.6).

rep

An integer, maximum number of repetitions of the algorithm (default rep=5).

simplify

A logical value (default is FALSE). If TRUE, histograms are recomputed in order to speed up the algorithm.

qua

An integer. If simplify=TRUE, the number of quantiles used to recode the histograms.

standardize

A logical value (default is FALSE). If TRUE, histogram-valued data are standardized, variable by variable, using the Wasserstein-based standard deviation. Use it if you want each variable to have a standard deviation equal to one.

init.weights

A string (default='EQUAL'). 'EQUAL', all variables or components have the same weight; 'RANDOM', the weights are assigned at random.

weight.sys

A string (default='PROD'). 'PROD', the product of the weights is equal to one; 'SUM', the weights sum up to one.

theta

A number (default=2). A parameter for the system of weights summing up to one.

verbose

A logical value (default is FALSE). If TRUE, some details are provided.
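
The two constraints named under weight.sys can be illustrated with a small base R sketch. It only shows what "product equal to one" and "sum equal to one" mean for a vector of positive weights. The toy values and the variable names w, w_prod and w_sum are assumptions made for illustration, and the rescaling shown here is not necessarily how the package updates its weights.

```r
# Toy positive weights for three variables (made-up values).
w <- c(0.5, 2.0, 4.0)

# 'PROD'-style constraint: divide by the geometric mean,
# so that the product of the rescaled weights is 1.
w_prod <- w / prod(w)^(1 / length(w))

# 'SUM'-style constraint: divide by the total,
# so that the rescaled weights sum to 1.
w_sum <- w / sum(w)

prod(w_prod)  # approximately 1, up to floating-point error
sum(w_sum)    # approximately 1, up to floating-point error
```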

Value

The results of the fuzzy c-means of the set of histogram-valued data x into k clusters.

solution

A list. Returns the best solution among the repetitions, i.e. the one having the minimum sum of squared deviations.

solution$membership

A matrix. The membership degree of each unit to each cluster.

solution$IDX

A vector. The crisp assignment to a cluster.

solution$cardinality

A vector. The cardinality of each final cluster (after the crisp assignment).

solution$Crit

A number. The criterion (sum of squared deviations from the prototypes) value at the end of the run.

quality

A number. The percentage of the sum of squared deviations explained by the model (the higher the better).
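
The relation between the membership matrix, the crisp assignment and the cardinalities can be sketched in base R on a toy matrix. The sketch assumes that rows are units and columns are clusters, and that the crisp assignment picks the cluster with the highest membership degree. That is the usual convention for fuzzy c-means, but this page does not state the rule the package applies. The names U, idx and cardinality are made up for the example.

```r
# Toy membership matrix: 4 units (rows) by 2 clusters (columns).
# Each row sums to one, as in a fuzzy partition.
U <- matrix(
  c(0.8, 0.2,
    0.3, 0.7,
    0.6, 0.4,
    0.1, 0.9),
  ncol = 2, byrow = TRUE
)

# Assumed crisp rule: assign each unit to its highest-membership cluster.
idx <- apply(U, 1, which.max)

# Number of units per cluster after the crisp assignment.
cardinality <- tabulate(idx, nbins = ncol(U))

idx
cardinality
```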

Examples

results <- WH_adaptive_fcmeans(
  x = BLOOD, k = 2, schema = 4, m = 1.5, rep = 3, simplify = TRUE,
  qua = 10, standardize = TRUE, init.weights = "EQUAL", weight.sys = "PROD"
)
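
A possible way to inspect the returned object, based only on the components listed in the Value section, is sketched below. The helper summarize_fcm is hypothetical (it is not part of HistDAWass), and it assumes that the membership matrix has one row per unit and one column per cluster. The call is guarded so that the block still runs when the package is not installed. No output is shown, because the result can differ between runs.

```r
# Hypothetical helper, not part of HistDAWass: gathers the documented
# components of the result into one compact list.
summarize_fcm <- function(res) {
  sol <- res$solution
  list(
    n_units       = nrow(sol$membership),  # assumes rows are units
    n_clusters    = ncol(sol$membership),  # assumes columns are clusters
    cluster_sizes = sol$cardinality,
    criterion     = sol$Crit,
    quality       = res$quality
  )
}

# Run the example from this page only if the package is available.
if (requireNamespace("HistDAWass", quietly = TRUE)) {
  library(HistDAWass)
  results <- WH_adaptive_fcmeans(
    x = BLOOD, k = 2, schema = 4, m = 1.5, rep = 3, simplify = TRUE,
    qua = 10, standardize = TRUE, init.weights = "EQUAL", weight.sys = "PROD"
  )
  str(summarize_fcm(results))
}
```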

[Package HistDAWass version 1.0.8 Index]