clusterCons-package {clusterCons} | R Documentation |
Calculate consensus clustering results from re-sampled clustering experiments with the option of using multiple algorithms and parameters
Description
clusterCons provides functions that compute robustness measures for clusters and for cluster membership. The measures are derived from consensus matrices built from bootstrapped clustering experiments, in which each individual clustering uses a random proportion of the rows of the data set. This allows the user to prioritise clusters, and the members of clusters, by how consistently they are recovered under re-sampling. Several clustering algorithms can be used in the re-sampling scheme, each with any of the parameters that the algorithm would normally take.
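As a rough illustration of the idea described above, the following base R sketch builds a consensus matrix by repeatedly clustering a random proportion of the rows with kmeans and recording how often each pair of rows lands in the same cluster. The function name consensus_sketch and its arguments reps and prop are hypothetical and are not part of the clusterCons API; the package's own implementation may differ in its details.

```r
# Sketch only: consensus matrix from re-sampled k-means runs.
# consensus_sketch, reps and prop are illustrative names, not package API.
consensus_sketch <- function(x, k, reps = 20, prop = 0.8) {
  n <- nrow(x)
  together <- matrix(0, n, n)  # times a pair was placed in the same cluster
  sampled  <- matrix(0, n, n)  # times a pair was drawn in the same sub-sample
  for (r in seq_len(reps)) {
    idx  <- sample(n, size = floor(prop * n))       # rows used in this run
    cl   <- stats::kmeans(x[idx, , drop = FALSE], centers = k)$cluster
    same <- outer(cl, cl, "==") * 1                 # 1 if the pair co-clusters
    together[idx, idx] <- together[idx, idx] + same
    sampled[idx, idx]  <- sampled[idx, idx] + 1
  }
  cm <- together / pmax(sampled, 1)  # pairs never co-sampled stay at 0
  dimnames(cm) <- list(rownames(x), rownames(x))
  cm
}

# Made-up data: two groups of 10 rows with different means
set.seed(42)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 4),
           matrix(rnorm(40, mean = 5), ncol = 4))
rownames(x) <- paste0("id", seq_len(nrow(x)))
cm <- consensus_sketch(x, k = 2)
```

Entries close to 1 mark pairs that cluster together whenever they are sampled together; entries close to 0 mark pairs that rarely do.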
Details
Package: | clusterCons |
Type: | Package |
Version: | 1.0 |
Date: | 2010-10-12 |
License: | GPL |
LazyLoad: | yes |
Depends: | methods,cluster,lattice,RColorBrewer,grid,apcluster |
Extends: | cluster |
Suggests: | latticeExtra |
The user should first prepare an entirely numeric data.frame in which the conditions to be clustered are the column names and the unique ids of the entities are the row names. Compatibility of the resulting data.frame can be checked using the data_check function.
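A minimal sketch of the expected layout, using made-up data and base R checks only. The object name toy and the two checks are illustrative assumptions; they mirror the requirements stated above but are not the implementation of data_check.

```r
# Hypothetical toy data: 6 entities (rows) measured under 4 conditions (columns)
set.seed(1)
toy <- data.frame(
  cond1 = rnorm(6),
  cond2 = rnorm(6),
  cond3 = rnorm(6),
  cond4 = rnorm(6),
  row.names = paste0("gene", 1:6)  # unique ids of the entities
)

# Illustrative checks (assumptions, not the package's own validation)
is_all_numeric <- all(vapply(toy, is.numeric, logical(1)))
has_unique_ids <- anyDuplicated(rownames(toy)) == 0
```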
Functions to run the consensus clustering and retrieve robustness information
cluscomp
- generate consensus matrices from re-sampled clustering experiments with the option of multiple algorithms and parameters
clrob
- calculate the robustness of the clusters from the consensus matrix
memrob
- calculate the cluster membership robustness from the consensus matrix
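To make the two robustness measures more concrete, here is a small sketch. It assumes that cluster robustness is the mean consensus index over pairs of cluster members, and that membership robustness is the mean consensus index between one item and the other members of a cluster. These definitions, the helper names cluster_rob and member_rob, and the toy matrix are assumptions for illustration; clrob and memrob may compute the measures differently.

```r
# Toy consensus matrix for 4 items; the values are made up for illustration
cm <- matrix(c(1.0, 0.9, 0.1, 0.2,
               0.9, 1.0, 0.0, 0.1,
               0.1, 0.0, 1.0, 0.8,
               0.2, 0.1, 0.8, 1.0),
             nrow = 4, byrow = TRUE,
             dimnames = list(letters[1:4], letters[1:4]))
membership <- c(a = 1, b = 1, c = 2, d = 2)  # reference cluster assignment

# Assumed definition: mean consensus index over distinct pairs within cluster k
cluster_rob <- function(cm, membership, k) {
  m <- cm[membership == k, membership == k, drop = FALSE]
  mean(m[upper.tri(m)])
}

# Assumed definition: mean consensus index between item id and the
# other members of cluster k
member_rob <- function(cm, membership, k, id) {
  others <- setdiff(names(membership)[membership == k], id)
  mean(cm[id, others])
}

rob1    <- cluster_rob(cm, membership, 1)
c_in_2  <- member_rob(cm, membership, 2, "c")
c_in_1  <- member_rob(cm, membership, 1, "c")
```

In this toy case item c scores much higher against its own cluster than against cluster 1, which is the kind of contrast the robustness measures are meant to expose.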
Internal functions to call the individual clustering algorithms
agnes_clmem
- wrapper for the agnes
function of package cluster
diana_clmem
- wrapper for the diana
function of package cluster
hclust_clmem
- wrapper for the hclust
function of package stats
kmeans_clmem
- wrapper for the kmeans
function of package stats
pam_clmem
- wrapper for the pam
function of package cluster
apcluster_clmem
- wrapper for the apclusterK
function of package apcluster
Functions to calculate AUC related metrics
auc
- calculates the area under the curve for a series of clustering experiments with the same cluster number
aucs
- calculates the areas under the curves of a series of clustering experiments over a range of cluster numbers
deltak
- calculates the change in the area under the curve
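The following sketch shows one way these quantities can be computed, following the formulas given by Monti et al. (listed under References): the area under the empirical CDF of the consensus indices, and the relative change in that area between successive cluster numbers. The helper names auc_sketch and deltak_sketch and the example values are hypothetical, and the package functions auc, aucs and deltak may differ in detail.

```r
# Area under the empirical CDF of the consensus indices (upper triangle only),
# summed as (x[i] - x[i-1]) * CDF(x[i]) over the sorted indices
auc_sketch <- function(cm) {
  x   <- sort(cm[upper.tri(cm)])
  cdf <- stats::ecdf(x)
  sum(diff(x) * cdf(x[-1]))
}

# Relative change in AUC between successive cluster numbers.
# areas must be ordered by increasing k; the first value is returned unchanged.
deltak_sketch <- function(areas) {
  c(areas[1], diff(areas) / areas[-length(areas)])
}

# Made-up 3-item consensus matrix with pairwise indices 0, 0.5 and 1
cm3 <- matrix(c(1,   0,   0.5,
                0,   1,   1,
                0.5, 1,   1), nrow = 3, byrow = TRUE)
a3 <- auc_sketch(cm3)

# Made-up AUC values for k = 2, 3, 4
dk <- deltak_sketch(c(0.5, 0.75, 0.9))
```

A sharp drop in the delta-K values after some k is usually read as a sign that adding further clusters no longer improves the consensus much.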
Functions to check data and object validity
data_check
- check that the provided data.frame
is formatted correctly
expSetProcess
- extracts the data set from an object of class expressionSet
validConsMatrixObject
- check the validity of a consmatrix
object
validMergeMatrixObject
- check the validity of a mergematrix
object
validMemRobListObject
- check the validity of a membership robustness list object
validMemRobMatrixObject
- check the validity of a membership robustness matrix object
validAUCObject
- check the validity of an "auc"
class object
validDkObject
- check the validity of a "dk"
class object
Functions to plot out performance curves
aucplot
- plot area under the curve (AUC) plots from consensus clustering results
dkplot
- plot change in AUC by cluster number (delta-K plot)
expressionPlot
- plot the original data partitioned by cluster membership
membBoxPlot
- plot a box and whisker plot of the membership robustness for each cluster
Keywords
cluster
See Also
Examples
#load data
data(sim_profile);
#perform consensus clustering
cmr <- cluscomp(sim_profile,algo=list('agnes','pam','kmeans'),clmin=2,clmax=7,rep=10,merge=1);
#see the consensus and merge matrices
summary(cmr);
#fetch the cluster robustness for agnes consensus clustering with k=3
clrob(cmr$e1_agnes_k3);
#show the membership robustness for cluster 1
memrob(cmr$e1_agnes_k3)$cluster1
#show the same, but for the merge against the k=3 agnes clustering structure
#note we provide the reference matrix (which is the original cluster membership for agnes where k=3)
clrob(cmr$merge_k3,cmr$e1_agnes_k3@rm);
memrob(cmr$merge_k3,cmr$e1_agnes_k3@rm)$cluster1;
#calculate the AUCs
acs <- aucs(cmr);
#plot the AUC curves
aucplot(acs);
#calculate the delta-Ks
dks <- deltak(acs);
#plot the delta-K curves
dkplot(dks);
#plot the expression profiles
expressionPlot(sim_profile,cmr$e1_agnes_k3);
#plot the bwplot of membership robustness for the same
membBoxPlot(memrob(cmr$e1_agnes_k3));
Author(s)
Dr. T. Ian Simpson ian.simpson@ed.ac.uk
References
Simpson TI, Armstrong JD and Jarman AP. Merged consensus clustering to assess and improve class discovery with microarray data. BMC Bioinformatics 2010, 11:590.
Monti S, Tamayo P, Mesirov J and Golub T. Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Machine Learning, 52, July 2003.