clusterCons-package {clusterCons}    R Documentation

Calculate consensus clustering results from re-sampled clustering experiments with the option of using multiple algorithms and parameters

Description

clusterCons is a package of functions that generate robustness measures for clusters and for cluster membership. The measures are derived from consensus matrices built from bootstrapped clustering experiments, in which a random proportion of the rows of the data set is used in each individual clustering. This allows the user to prioritise clusters, and the members of clusters, according to their consistency under this re-sampling regime. The user can select several algorithms to use in the re-sampling scheme and pass any of the parameters that each algorithm would normally take.
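
The re-sampling idea can be illustrated with a minimal base R sketch. It is not the clusterCons implementation: it uses only stats::kmeans, and the toy data, the 0.8 sampling proportion and all variable names are assumptions made for illustration.

```r
# Minimal sketch of a consensus matrix built from re-sampled k-means runs.
# Base R only (stats::kmeans); not the clusterCons implementation.
set.seed(1)

# Hypothetical toy data: 20 entities (rows) x 4 conditions (columns), two groups
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 4),
           matrix(rnorm(40, mean = 5), ncol = 4))
rownames(x) <- paste0("id", seq_len(nrow(x)))

n    <- nrow(x)
k    <- 2     # number of clusters
reps <- 50    # number of re-sampling iterations
prop <- 0.8   # proportion of rows drawn in each iteration (assumed value)

# together[i, j]: runs in which i and j were placed in the same cluster
# sampled[i, j]:  runs in which i and j were both drawn
together <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
sampled  <- together

for (r in seq_len(reps)) {
  idx  <- sample(n, size = floor(prop * n))              # rows used in this run
  cl   <- kmeans(x[idx, , drop = FALSE], centers = k)$cluster
  same <- outer(cl, cl, "==") * 1                        # 1 if pair co-clustered
  together[idx, idx] <- together[idx, idx] + same
  sampled[idx, idx]  <- sampled[idx, idx] + 1
}

# Consensus index: fraction of co-sampled runs in which a pair co-clustered
consensus <- ifelse(sampled > 0, together / sampled, 0)
```

Pairs with a consensus index close to 1 are clustered together consistently across runs, and pairs close to 0 are consistently separated.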

Details

Package: clusterCons
Type: Package
Version: 1.0
Date: 2010-10-12
License: GPL
LazyLoad: yes
Depends: methods,cluster,lattice,RColorBrewer,grid,apcluster
Extends: cluster
Suggests: latticeExtra

The user should first prepare an entirely numeric data.frame in which the conditions to be clustered are the column names and the unique ids of the entities are the row names. Compatibility of the resulting data.frame can be checked by using the data_check function.
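
As an illustration of the expected layout, the sketch below builds a small data.frame and applies base R checks that mirror the stated requirements. The data values and the names expr, all_numeric and unique_ids are hypothetical, and the checks are not a substitute for data_check.

```r
# Hypothetical toy data set: entities in rows, conditions in columns
expr <- data.frame(
  cond1 = c(0.1, 1.2, 3.4),
  cond2 = c(0.3, 1.0, 3.1),
  cond3 = c(0.2, 1.5, 2.9),
  row.names = c("gene_a", "gene_b", "gene_c")
)

# Base R checks mirroring the requirements stated above
all_numeric <- all(vapply(expr, is.numeric, logical(1)))  # every column numeric
unique_ids  <- !anyDuplicated(rownames(expr))             # row names are unique ids
stopifnot(all_numeric, unique_ids)

# With clusterCons loaded, the package's own check would be data_check
# (the exact call shape is an assumption, so it is not run here):
# data_check(expr)
```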

Functions to run the consensus clustering and retrieve robustness information

cluscomp - generate consensus matrices from re-sampled clustering experiments with the option of multiple algorithms and parameters
clrob - calculate the robustness of the clusters from the consensus matrix
memrob - calculate the cluster membership robustness from the consensus matrix

Internal functions to call the individual clustering algorithms

agnes_clmem - wrapper for the agnes function of package cluster
diana_clmem - wrapper for the diana function of package cluster
hclust_clmem - wrapper for the hclust function of package stats
kmeans_clmem - wrapper for the kmeans function of package stats
pam_clmem - wrapper for the pam function of package cluster
apcluster_clmem - wrapper for the apclusterK function of package apcluster

Functions to calculate AUC related metrics

auc - calculates the area under the curve for a series of clustering experiments with the same cluster number
aucs - calculates the areas under the curves of a series of clustering experiments over a range of cluster numbers
deltak - calculates the change in the area under the curve
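
For orientation, the sketch below shows one common formulation of these metrics, following the definitions in Monti et al. (2003): the AUC is the area under the empirical cumulative distribution of the pairwise consensus values, and delta-K is the relative change in AUC between successive cluster numbers. The function names auc_from_consensus and delta_k are hypothetical, indexing conventions for delta-K differ between implementations, and the package's own auc, aucs and deltak functions may differ in detail.

```r
# Area under the empirical CDF of the pairwise consensus values,
# computed as sum_i (x_i - x_{i-1}) * CDF(x_i) over the sorted values
auc_from_consensus <- function(cm) {
  v   <- sort(cm[upper.tri(cm)])   # consensus values for pairs i < j
  cdf <- ecdf(v)
  sum(diff(v) * cdf(v[-1]))
}

# Relative change in AUC between successive cluster numbers; the first
# entry is the AUC itself (indexing convention assumed here)
delta_k <- function(a) {
  c(a[1], diff(a) / head(a, -1))
}

# Hypothetical toy consensus matrix: two perfectly stable clusters of size 2
cm <- matrix(c(1, 1, 0, 0,
               1, 1, 0, 0,
               0, 0, 1, 1,
               0, 0, 1, 1), nrow = 4, byrow = TRUE)
toy_auc <- auc_from_consensus(cm)

# Hypothetical AUC values for k = 2, 3, 4
toy_dk <- delta_k(c(0.5, 0.75, 0.9))
```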

Functions to check data and object validity

data_check - check that the provided data.frame is formatted correctly
expSetProcess - extracts the data set from an object of class ExpressionSet
validConsMatrixObject - check the validity of a consmatrix object
validMergeMatrixObject - check the validity of a mergematrix object
validMemRobListObject - check the validity of a membership robustness list object
validMemRobMatrixObject - check the validity of a membership robustness matrix object
validAUCObject - check the validity of an "auc" class object
validDkObject - check the validity of a "dk" class object

Functions to plot out performance curves

aucplot - plot area under the curve (AUC) plots from consensus clustering results
dkplot - plot change in AUC by cluster number (delta-K plot)
expressionPlot - plot the original data partitioned by cluster membership
membBoxPlot - plot a box and whisker plot of the membership robustness for each cluster

Keywords

cluster

See Also

cluster,lattice,apcluster

Examples

#load data
data(sim_profile)

#perform consensus clustering
cmr <- cluscomp(sim_profile,algo=list('agnes','pam','kmeans'),clmin=2,clmax=7,rep=10,merge=1)

#see the consensus and merge matrices
summary(cmr)

#fetch the cluster robustness for agnes consensus clustering with k=3
clrob(cmr$e1_agnes_k3)

#show the membership robustness for cluster 1
memrob(cmr$e1_agnes_k3)$cluster1

#show the same, but for the merge against the k=3 agnes clustering structure
#note we provide the reference matrix (which is the original cluster membership for agnes where k=3)
clrob(cmr$merge_k3,cmr$e1_agnes_k3@rm)
memrob(cmr$merge_k3,cmr$e1_agnes_k3@rm)$cluster1

#calculate the AUCs
acs <- aucs(cmr)

#plot the AUC curves
aucplot(acs)

#calculate the delta-Ks
dks <- deltak(acs)

#plot the delta-K curves
dkplot(dks)

#plot the expression profiles
expressionPlot(sim_profile,cmr$e1_agnes_k3)

#plot the bwplot of membership robustness for the same
membBoxPlot(memrob(cmr$e1_agnes_k3))

Author(s)

Dr. T. Ian Simpson <ian.simpson@ed.ac.uk>

References

Merged consensus clustering to assess and improve class discovery with microarray data. Simpson TI, Armstrong JD and Jarman AP. BMC Bioinformatics 2010, 11:590.

Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Monti, S., Tamayo, P., Mesirov, J. and Golub, T. Machine Learning, 52, July 2003.


[Package clusterCons version 1.2 Index]