plot.progenyClust {progenyClust} | R Documentation |
Plot Progeny Clustering Results
Description
Plots the cluster number selection results and visualizes the clustering results.
Usage
## S3 method for class 'progenyClust'
plot(x,data=NULL,k=NULL,errorbar=FALSE,xlab='',ylab='',...)
Arguments
x |
a progenyClust object generated by function progenyClust. |
data |
the full or a subset of the original data matrix that was used for clustering. If unspecified, the function plots the stability scores for cluster number selection; if specified, the function plots the data in scatter plots colored by cluster membership (see Details below). |
k |
integer specifying the cluster number for visualizing the clustering results of the original data; only takes effect when argument data is specified. |
errorbar |
logical flag: specifies whether the error bars should be drawn. The error bars can only be drawn when progeny clustering is repeated multiple times, i.e. input argument "repeats" in function progenyClust is greater than 1. |
xlab |
character string specifying the name of the x axis. |
ylab |
character string specifying the name of the y axis. |
... |
additional graphical arguments in |
Details
The plot function provides two types of visualization: (1) visualizing stability scores, and (2) visualizing clustering results.
To visualize the stability scores that are output from the progenyClust function, run the plot function without specifying the input argument data. The resulting plot shows the stability score at each cluster number. This plot provides an overview of clustering stability and can facilitate selecting the optimal cluster number.
The plot function can also visualize the clustering results in scatter plots when the input argument data is specified. Since the goal is to view how the original data is clustered at a given cluster number, data needs to contain exactly the same number of samples as the original data that was used to run the progenyClust function. If data contains more than two features, a table of scatter plots is created to show the clustering results within each pair of dimensions. data with more than 20 features/columns is not accepted, but a subset of data with selected features can be used in this case. The input argument k specifies the cluster number at which the clustering result is shown. Note that k needs to be a cluster number that was previously examined by progenyClust when generating the progenyClust object x. If k is not provided, the function uses the optimal cluster number determined by the Gap criterion only if method='gap', and uses the optimal number determined by the Score criterion if method='score' or method='both' when running progenyClust.
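As a minimal sketch of the second visualization type, the call below plots the clustering result at an explicitly chosen cluster number instead of the automatically selected one. It assumes the 'test' dataset shipped with the package and reuses the ncluster range from the Examples section; the choice k = 3 is only illustrative.

```r
library(progenyClust)

# example dataset from the package (3 clusters, 2 dimensions)
data('test')

# evaluate cluster numbers 2 to 5 with the default settings
pc <- progenyClust(test, ncluster = 2:5)

# scatter plot of the data colored by cluster membership at k = 3;
# k must be one of the cluster numbers examined above (2 to 5)
plot(pc, test, k = 3)
```

For data with more than 20 columns, the same call can be made on a column subset such as data[, 1:5], provided the subset keeps every sample (row) of the original data.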
Value
returns plots as described in Details.
Author(s)
C.W. Hu, Rice University
References
Hu, C.W., et al. "Progeny Clustering: A Method to Identify Biological Phenotypes." Scientific Reports 5 (2015).
http://www.nature.com/articles/srep12894
Examples
# a 3-cluster 2-dimensional example dataset
data('test')
# default progeny clustering
pc <- progenyClust(test, ncluster=2:5)
# plot the scores to select the optimal cluster number
plot(pc)
# plot the clustering results with the optimal cluster number
plot(pc,test)
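The errorbar argument can be illustrated in the same way. The sketch below assumes that progenyClust accepts the repeats argument named in the Arguments section and that repeats = 3 is enough to produce error bars; the axis labels are arbitrary examples.

```r
library(progenyClust)

# example dataset from the package
data('test')

# repeat progeny clustering several times so that error bars can be drawn
# (error bars require repeats greater than 1)
pc_rep <- progenyClust(test, ncluster = 2:5, repeats = 3)

# plot the stability scores with error bars and custom axis labels
plot(pc_rep, errorbar = TRUE, xlab = 'cluster number', ylab = 'stability score')
```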