dbs {pdfCluster} | R Documentation |
Density-based silhouette information methods
Description
Computes the density-based silhouette information of clustered data. Two methods are associated with this function. The first
method applies to two arguments: the matrix of data and the vector of cluster labels; the second method applies to objects of
pdfCluster-class.
Usage
## S4 method for signature 'matrix'
dbs(x, clusters, h.funct="h.norm", hmult=1, prior, ...)
## S4 method for signature 'pdfCluster'
dbs(x, h.funct="h.norm", hmult = 1, prior =
as.vector(table(x@cluster.cores)/sum(table(x@cluster.cores))),
stage=NULL, ...)
Arguments
x |
A matrix of data points partitioned by any density-based clustering method or an object of pdfCluster-class. |
clusters |
Cluster labels of grouped data. This argument must not be set when x is an object of pdfCluster-class. |
h.funct |
Function to estimate the smoothing parameters. Default is h.norm. |
hmult |
Shrink factor to be multiplied by the smoothing parameters. Default value is 1. |
prior |
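Vector of prior probabilities of belonging to the groups. When x is an object of pdfCluster-class, the default is proportional to the sizes of the cluster cores (see Usage). |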
stage |
When |
... |
Further arguments to be passed to methods (see |
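The following is a minimal sketch of how the optional arguments of the matrix method might be set. It assumes the pdfCluster package is installed; the particular values of hmult and prior are arbitrary and chosen only for illustration.

```r
library(pdfCluster)

set.seed(54321)
x <- as.matrix(rnorm(50))                  # one-column data matrix
groups <- sample(1:2, 50, replace = TRUE)  # arbitrary labels, illustration only

# Default settings: bandwidths from h.norm, no shrinkage
d_default <- dbs(x = x, clusters = groups)

# Shrink the bandwidths by 25% and supply unequal prior probabilities
# (both values are illustrative assumptions, not recommendations)
d_custom <- dbs(x = x, clusters = groups, hmult = 0.75, prior = c(0.3, 0.7))

summary(d_default)
summary(d_custom)
```

Comparing the two summaries gives a rough idea of how sensitive the diagnostic is to the smoothing and prior choices.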
Details
This function provides diagnostics for a clustering produced by any density-based clustering method. The dbs
information is a modification of the silhouette information, aimed at evaluating cluster quality in a
density-based framework. It is based on estimates of the posterior probabilities that the data points belong to the
clusters, and may be used to measure the quality of the allocation of data to the clusters. High values of the dbs are
evidence of a good-quality clustering.
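Define

$$\hat{\tau}_m(x_i) = \frac{\pi_m \, \hat{f}(x_i \mid x \in m)}{\sum_{k=1}^{M} \pi_k \, \hat{f}(x_i \mid x \in k)}, \qquad m = 1, \ldots, M,$$

where $M$ is the number of groups, $\pi_m$ is a prior probability of group $m$, and $\hat{f}(x_i \mid x \in m)$ is a density estimate at $x_i$, evaluated with function kepdf by using only the data points in $m$. Density estimation is performed with fixed bandwidths h, as evaluated by function h.funct, possibly multiplied by the shrink factor hmult.

The density-based silhouette information of $x_i$, the $i$-th row of the data matrix x, is defined as follows:

$$dbs_i = \frac{\log\left( \hat{\tau}_{m_0}(x_i) / \hat{\tau}_{m_1}(x_i) \right)}{\max_{x_i} \left| \log\left( \hat{\tau}_{m_0}(x_i) / \hat{\tau}_{m_1}(x_i) \right) \right|},$$

where $m_0$ is the group where $x_i$ has been allocated and $m_1$ is the group for which $\hat{\tau}_m(x_i)$ is maximum, $m \neq m_0$.

Note: when there exists $x_j$ such that $\hat{\tau}_{m_1}(x_j)$ is zero, $dbs_j$ is forced to 1 and $\max_{x_i} \left| \log\left( \hat{\tau}_{m_0}(x_i) / \hat{\tau}_{m_1}(x_i) \right) \right|$ is computed by excluding $x_j$ from the data matrix x.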
See Menardi (2011) for a detailed treatment.
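To make the definition concrete, the sketch below computes a dbs-like quantity for univariate data in base R: posterior probabilities are estimated for each group, the log ratio between the allocated group and the best competing group is taken, and the result is scaled by its largest absolute value. This is not the package's implementation. The helpers kde1d and dbs_sketch are hypothetical names, a plain Gaussian kernel with bandwidth bw.nrd0 stands in for kepdf and h.norm, and equal priors are assumed when none are given.

```r
# Hypothetical helper: Gaussian kernel density estimate at the points `at`,
# built only from the observations `obs`, with a fixed bandwidth `h`.
kde1d <- function(at, obs, h) {
  sapply(at, function(p) mean(dnorm((p - obs) / h)) / h)
}

# Sketch of the dbs computation for a numeric vector `x` (not the package code).
dbs_sketch <- function(x, clusters, prior = NULL) {
  groups <- sort(unique(clusters))
  M <- length(groups)
  if (is.null(prior)) prior <- rep(1 / M, M)   # assumption: equal priors
  # Column m: prior of group m times the density estimated from group m only
  num <- sapply(seq_len(M), function(m) {
    obs <- x[clusters == groups[m]]
    prior[m] * kde1d(x, obs, h = bw.nrd0(obs))
  })
  tau <- num / rowSums(num)                    # estimated posterior probabilities
  idx <- cbind(seq_along(x), match(clusters, groups))
  tau_own <- tau[idx]                          # posterior of the allocated group
  tau_alt <- tau
  tau_alt[idx] <- -Inf                         # mask the allocated group
  tau_best <- apply(tau_alt, 1, max)           # posterior of the best competitor
  lr <- log(tau_own / tau_best)
  zero <- tau_best == 0                        # competitor posterior is zero
  out <- lr / max(abs(lr[!zero]))              # scale, excluding those points
  out[zero] <- 1                               # and force their value to 1
  out
}

set.seed(54321)
x <- c(rnorm(30, mean = -2), rnorm(30, mean = 2))
clusters <- rep(1:2, each = 30)
d <- dbs_sketch(x, clusters)
summary(d)
```

Points lying deep inside their own group receive values near 1, while points closer to the other group receive values near zero or below it.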
Value
An object of class "dbs", with slots:
call |
The matched call. |
x |
The matrix of clustered data points. |
prior |
The vector of prior probabilities of belonging to the groups. |
dbs |
A vector reporting the density-based silhouette information of the clustered data. |
clusters |
Cluster labels of grouped data. |
noc |
Number of clusters |
stage |
If argument |
See dbs-class for more details.
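As an illustration of how the returned object might be inspected, the sketch below reads some of the slots listed above with the S4 `@` operator. It assumes the pdfCluster package is installed; the data and labels are arbitrary.

```r
library(pdfCluster)

set.seed(54321)
x <- as.matrix(rnorm(50))
groups <- sample(1:2, 50, replace = TRUE)
dsil <- dbs(x = x, clusters = groups)

dsil@noc                                # number of clusters
dsil@prior                              # prior probabilities used
head(dsil@dbs)                          # dbs values of the first data points
tapply(dsil@dbs, dsil@clusters, mean)   # average dbs within each cluster
```

The within-cluster averages give a quick per-group summary that complements the plot method.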
Methods
signature(x = "matrix", clusters = "numeric") -
Computes the density-based silhouette information for objects partitioned according to any density-based clustering method.
signature(x = "pdfCluster", clusters = "missing") -
Computes the density-based silhouette information for objects of class "pdfCluster".
References
Menardi, G. (2011) Density-based Silhouette diagnostics for clustering methods. Statistics and Computing, 21, 295-308.
See Also
dbs-class, plot,dbs-method, silhouette.
Examples
library(pdfCluster)
# Example 1: no group structure in the data;
# group labels are generated at random
set.seed(54321)
x <- rnorm(50)
groups <- sample(1:2, 50, replace = TRUE)
groups
dsil <- dbs(x = as.matrix(x), clusters=groups)
dsil
summary(dsil)
plot(dsil, labels=TRUE, lwd=6)
# Example 2: wine data
# load the data
data(wine)
# select a subset of variables
x <- wine[, c(2,5,8)]
#clustering
cl <- pdfCluster(x)
dsil <- dbs(cl)
plot(dsil)