pclust {clue} | R Documentation |
Prototype-Based Partitioning
Description
Obtain prototype-based partitions of objects by minimizing the criterion
\sum w_b u_{bj}^m d(x_b, p_j)^e
, the sum of the case-weighted and
membership-weighted e
-th powers of the dissimilarities between
the objects x_b
and the prototypes p_j
, for suitable
dissimilarities d
and exponents e
.
Usage
pclust(x, k, family, m = 1, weights = 1, control = list())
pclust_family(D, C, init = NULL, description = NULL, e = 1,
.modify = NULL, .subset = NULL)
pclust_object(prototypes, membership, cluster, family, m = 1,
value, ..., classes = NULL, attributes = NULL)
Arguments
x |
the object to be partitioned. |
k |
an integer giving the number of classes to be used in the partition. |
family |
an object of class |
m |
a number not less than 1 controlling the softness of the
partition (as the “fuzzification parameter” of the fuzzy
|
weights |
a numeric vector of non-negative case weights.
Recycled to the number of elements given by |
control |
a list of control parameters. See Details. |
D |
a function for computing dissimilarities |
C |
a ‘consensus’ function with formals |
init |
a function with formals |
description |
a character string describing the family. |
e |
a number giving the exponent |
.modify |
a function with formals |
.subset |
a function with formals |
prototypes |
an object representing the prototypes of the partition. |
membership |
an object of class |
cluster |
the class ids of the nearest hard partition. |
value |
the value of the criterion to be minimized. |
... |
further elements to be included in the generated pclust object. |
classes |
a character vector giving further classes to be given
to the generated pclust object in addition to |
attributes |
a list of attributes, or |
Details
For m = 1
, a generalization of the Lloyd-Forgy variant of the
k
-means algorithm is used, which iterates between reclassifying
objects to their closest prototypes (according to the dissimilarities
given by D
), and computing new prototypes as the consensus for
the classes (using C
).
For m > 1
, a generalization of the fuzzy c
-means recipe
(e.g., Bezdek (1981)) is used, which alternates between computing
optimal memberships for fixed prototypes, and computing new prototypes
as the suitably weighted consensus clusterings for the classes.
This procedure is repeated until convergence occurs, or the maximal number of iterations is reached.
Currently, no local improvement heuristics are provided.
It is possible to perform several runs of the procedure via control
arguments nruns
or start
(the default is to perform a
single run), in which case the first partition with the smallest
value of the criterion is returned.
The dissimilarity and consensus functions as well as the exponent
e
are specified via family
. In principle, arbitrary
representations of objects to be partitioned and prototypes (which do
not necessarily have to be “of the same kind”) can be employed.
In addition to D
and C
, what is needed are means to
obtain an initial collection of k
prototypes (init
), to
modify a single prototype (.modify
), and subset the prototypes
(.subset
). By default, list and (currently, only dense) matrix
(with the usual convention that the rows correspond to the objects)
are supported. Otherwise, the family has to provide the functions
needed.
Available control parameters are as follows.
maxiter
an integer giving the maximal number of iterations to be performed. Defaults to 100.
nruns
an integer giving the number of runs to be performed. Defaults to 1.
reltol
the relative convergence tolerance. Defaults to
sqrt(.Machine$double.eps)
.start
a list of prototype objects to be used as starting values.
verbose
a logical indicating whether to provide some output on minimization progress. Defaults to
getOption("verbose")
.control
control parameters to be used in the consensus function.
The fixed point approach employed is a heuristic which cannot be
guaranteed to find the global minimum, in particular if C
is
not an exact minimizer. Standard practice would recommend to use the
best solution found in “sufficiently many” replications of the
base algorithm.
Value
pclust
returns the partition found as an object of class
"pclust"
(as obtained by calling pclust_object
) which in
addition to the default components contains call
(the
matched call) and a converged
attribute indicating convergence
status (i.e., whether the maximal number of iterations was reached).
pclust_family
returns an object of class
"pclust_family"
, which is a list with components corresponding
to the formals of pclust_family
.
pclust_object
returns an object inheriting from class
"pclust"
, which is a list with components corresponding
to the formals (up to and including ...
) of
pclust_object
, and additional classes and attributes specified
by classes
and attributes
, respectively.
References
J. C. Bezdek (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum.