sil.score {bios2mds} | R Documentation |

Computes silhouette scores for multiple runs of K-means clustering.

```
sil.score(mat, nb.clus = c(2:13), nb.run = 100, iter.max = 1000,
method = "euclidean")
```

`mat` |
a numeric matrix representing the coordinates of the elements. |

`nb.clus` |
a numeric vector indicating the range of the numbers of clusters. Default is c(2:13). |

`nb.run` |
a numeric value indicating the number of runs. Default is 100. |

`iter.max` |
a numeric value indicating the maximum number of iterations for K-means clustering. Default is 1000. |

`method` |
a string of characters to determine the distance measure. This should be one of "euclidean", "maximum", "manhattan", "canberra" or "binary". Default is "euclidean". |
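As a small illustration of how the `method` argument changes the distances, the sketch below calls the base R `dist` function on a toy matrix. The matrix `mat` and the variable names are made up for this example and are not part of bios2mds.

```r
# Three points in the plane; rows are elements, columns are coordinates.
# (Toy data, assumed for illustration only.)
mat <- matrix(c(0, 0,
                3, 4,
                6, 8), ncol = 2, byrow = TRUE)

d.euclidean <- dist(mat, method = "euclidean")
d.manhattan <- dist(mat, method = "manhattan")
d.maximum   <- dist(mat, method = "maximum")

# Distance between the first two points under each measure.
as.matrix(d.euclidean)[1, 2]  # sqrt(3^2 + 4^2) = 5
as.matrix(d.manhattan)[1, 2]  # |3| + |4| = 7
as.matrix(d.maximum)[1, 2]    # max(|3|, |4|) = 4
```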

Silhouettes are a general graphical aid for interpretation and validation of cluster analysis. This technique is available through the `silhouette` function (`cluster` package). In order to calculate silhouettes, two types of data are needed:

- the collection of all distances between objects. These distances are obtained by applying the `dist` function to the coordinates of the elements in `mat`, with argument `method`.
- the partition obtained by the application of a clustering technique. In the `sil.score` context, the partition is obtained from the `Kmeans` function (`amap` package) with argument `method`, and indicates the cluster to which each element is assigned.
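The two inputs described above can be sketched as follows. This is a minimal illustration on simulated coordinates, not the bios2mds implementation: `stats::kmeans` is used as a stand-in for `amap::Kmeans` so that the example only needs the `cluster` package, and the data in `mat` are invented.

```r
library(cluster)  # provides silhouette()

set.seed(1)
# Toy coordinates (assumed): two well-separated groups of 10 points in 2 dimensions.
mat <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))

# 1. All pairwise distances between elements.
d <- dist(mat, method = "euclidean")

# 2. A partition: one cluster label per element.
#    stats::kmeans is a stand-in here for amap::Kmeans.
km <- kmeans(mat, centers = 2, iter.max = 100, nstart = 5)

# Silhouette values: one row per element, with columns
# "cluster", "neighbor" and "sil_width".
sil <- silhouette(km$cluster, d)
head(sil[, "sil_width"])
```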

For each element, a silhouette value is calculated; it evaluates the degree of confidence in the assignment of the element:

- well-clustered elements have a score near 1,
- poorly-clustered elements have a score near -1.

Thus, silhouettes indicate which objects are well or poorly clustered. To summarize the results, for each cluster, the silhouette values can be displayed as an **average silhouette width**, which is the mean of the silhouettes of all the elements assigned to this cluster. Finally, the **overall average silhouette width** is the mean of the average silhouette widths of the different clusters.
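To make these summaries concrete, here is a base R sketch that computes silhouette values by hand, following the definition in Rousseeuw (1987): for element i, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the silhouette is (b - a) / max(a, b). The helper `manual.sil`, the data `x` and the labels `cl` are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper (not part of bios2mds or cluster):
# silhouette value of each element, given a dist object and cluster labels.
manual.sil <- function(d, cl) {
  dm <- as.matrix(d)
  sapply(seq_along(cl), function(i) {
    own <- cl == cl[i]
    own[i] <- FALSE                  # exclude the element itself
    if (!any(own)) return(0)         # singleton cluster: silhouette set to 0
    a <- mean(dm[i, own])            # mean distance within its own cluster
    b <- min(sapply(setdiff(unique(cl), cl[i]),
                    function(k) mean(dm[i, cl == k])))  # nearest other cluster
    (b - a) / max(a, b)
  })
}

# Toy one-dimensional data (assumed): two clusters of unequal size.
x  <- c(0, 1, 10, 11, 12)
cl <- c(1, 1, 2, 2, 2)
s  <- manual.sil(dist(x), cl)

# Average silhouette width of each cluster.
clus.avg <- tapply(s, cl, mean)

# Overall average silhouette width, as defined above:
# the mean of the per-cluster averages.
overall <- mean(clus.avg)
```

Because the clusters have different sizes here, `mean(clus.avg)` and `mean(s)` are not equal; the text above describes the former.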

Silhouette values offer the advantage that they depend only on the partition of the elements. As a consequence, silhouettes can be used to compare the output of the same clustering algorithm applied to the same data but for different numbers of clusters. A range of numbers of clusters can be tested with the `nb.clus` argument. The optimal number of clusters is the one that maximizes the overall average silhouette width, which indicates that the clustering algorithm has found a strong clustering structure. However, for a given number of clusters, the cluster assignments obtained by different K-means runs can differ because the K-means procedure assigns random initial centroids for each run. It may be necessary to run the K-means procedure several times, with the `nb.run` argument, to evaluate the uncertainty of the results. In that case, for each number of clusters, the mean of the overall average silhouettes over the `nb.run` K-means runs is calculated. The maximum of this score gives the optimal number of clusters.
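The procedure described above can be sketched roughly as follows. This is an illustrative reimplementation under stated assumptions, not the bios2mds source: the function name `sketch.sil.score` is hypothetical, `stats::kmeans` replaces `amap::Kmeans` (so `method` only affects the distances passed to `silhouette`, since `stats::kmeans` always clusters with Euclidean distance), and the defaults are reduced to keep the example fast.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper, not the bios2mds source:
# mean overall average silhouette width for each number of clusters.
sketch.sil.score <- function(mat, nb.clus = 2:4, nb.run = 10,
                             iter.max = 100, method = "euclidean") {
  d <- dist(mat, method = method)
  scores <- sapply(nb.clus, function(k) {
    runs <- replicate(nb.run, {
      # Each call starts from new random centroids.
      km  <- kmeans(mat, centers = k, iter.max = iter.max)
      sil <- silhouette(km$cluster, d)
      # Overall score of this run: mean of the per-cluster average widths.
      mean(tapply(sil[, "sil_width"], sil[, "cluster"], mean))
    })
    mean(runs)  # average over the nb.run runs
  })
  names(scores) <- nb.clus
  scores
}

set.seed(42)
# Toy coordinates (assumed): two well-separated groups of 20 points.
mat <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2))

scores <- sketch.sil.score(mat)
names(which.max(scores))  # number of clusters with the highest mean score
```

Averaging over runs matters most when individual K-means runs can settle on different partitions; with only two clearly separated groups, as in this toy data, the runs for two clusters agree and the averaging mainly smooths the scores for larger numbers of clusters.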

A named numeric vector representing the silhouette scores for each number of clusters.

`sil.score` requires the `Kmeans` and `silhouette` functions from the `amap` and `cluster` packages, respectively.

Julien Pele

Rousseeuw PJ (1987) Silhouettes: A Graphical Aid to the Interpretation and Validation of
Cluster Analysis. *Journal of Computational and Applied Mathematics*, **20**:53-65.

Lovmar L, Ahlford A, Jonsson M and Syvanen AC (2005) Silhouette scores for assessment
of SNP genotype clusters. *BMC Genomics*, **6**:35.

Brock G, Pihur V, Datta S and Datta S (2008) clValid: An R Package for Cluster Validation. *Journal of Statistical Software*, **25**(4).

`connectivity` and `dunn` functions from the `clValid` package.

`silhouette` function from the `cluster` package.

```
# calculating silhouette scores for K-means clustering of human GPCRs:
library(bios2mds)
data(gpcr)
active <- gpcr$dif$sapiens.sapiens
mds <- mmds(active)
sil.score1 <- sil.score(mds$coord, nb.clus = c(2:10),
                        nb.run = 100, iter.max = 100)
# one bar per number of clusters tested
barplot(sil.score1)
```

[Package *bios2mds* version 1.2.3 Index]