R: Plot Progeny Clustering Results

plot.progenyClust {progenyClust}

R Documentation

Plot Progeny Clustering Results

Description

Plots the cluster number selection results and visualizes the clustering results.

Usage

## S3 method for class 'progenyClust'
plot(x,data=NULL,k=NULL,errorbar=FALSE,xlab='',ylab='',...)

Arguments

`x`	a `progenyClust` object.
`data`	the original data matrix that was used for clustering, or a subset of its features (columns). If unspecified, the function plots the stability scores used for cluster number selection; if specified, the function plots the data in scatter plots, colored by cluster membership (see Details below).
`k`	integer specifying the cluster number at which to visualize the clustering results of the original data. It only takes effect when the argument `data` is provided, and must be a cluster number that was previously investigated by `progenyClust` when generating the `progenyClust` object `x`. The default is the optimal number of clusters.
`errorbar`	logical flag specifying whether error bars should be drawn. Error bars can only be drawn when progeny clustering was repeated multiple times, i.e. when the argument `repeats` of the function `progenyClust` is greater than 1.
`xlab`	character string specifying the name of the x axis.
`ylab`	character string specifying the name of the y axis.
`...`	additional graphical arguments passed to `plot`.

Details

The plot function provides two types of visualization: (1) visualizing stability scores, and (2) visualizing clustering results. To visualize the stability scores output by the `progenyClust` function, run the plot function without specifying the input argument `data`. The resulting plot shows the stability score at each cluster number. It gives an overview of clustering stability and helps in selecting the optimal cluster number.
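As a minimal sketch of this first type of visualization, the following assumes the `test` dataset shipped with the package and uses the `repeats` argument of `progenyClust` so that error bars can be drawn. The seed, the number of repeats, and the axis labels are illustrative choices, not recommendations.

```r
library(progenyClust)

# 3-cluster, 2-dimensional example dataset from the package
data('test')

# fix the seed (arbitrary value) so the sketch is reproducible
set.seed(1)

# repeat progeny clustering 3 times so that error bars are available
pc <- progenyClust(test, ncluster = 2:5, repeats = 3)

# no `data` argument: plots the stability score at each cluster number
plot(pc, errorbar = TRUE, xlab = 'number of clusters', ylab = 'stability score')
```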

The plot function can also visualize the clustering results in scatter plots when the input argument `data` is specified. Since the goal is to view how the original data is clustered at a certain cluster number, `data` needs to contain exactly the same number of samples as the original data that was used to run the `progenyClust` function. If `data` contains more than two features, a table of scatter plots is created to show the clustering results within each pair of dimensions. A `data` matrix with more than 20 features (columns) is not accepted; in that case, pass a subset of the data with selected features. The input argument `k` specifies the cluster number at which the clustering result is shown. Note that `k` needs to be a cluster number that was previously examined by `progenyClust` when generating the `progenyClust` object `x`. If `k` is not provided, the function uses the optimal cluster number determined by the Gap criterion only if `method='gap'` was used when running `progenyClust`, and uses the optimal number determined by the Score criterion if `method='score'` or `method='both'` was used.
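The following sketch illustrates this second type of visualization for data with more than two features. It assumes the four numeric columns of R's built-in `iris` dataset as a stand-in (this dataset is not part of the package). The seed, `k = 3`, and the two selected columns are illustrative choices.

```r
library(progenyClust)

# stand-in data: 4 numeric features from the built-in iris dataset
x <- as.matrix(iris[, 1:4])

# fix the seed (arbitrary value) so the sketch is reproducible
set.seed(1)
pc <- progenyClust(x, ncluster = 2:5)

# all 4 features: a table of pairwise scatter plots, colored by the
# cluster memberships at the optimal cluster number
plot(pc, x)

# a subset of features (all samples kept), shown at a cluster number
# that was examined above (k must lie within 2:5 here)
sub <- x[, c(1, 3)]
plot(pc, sub, k = 3)
```

Selecting columns rather than rows keeps the number of samples equal to that of the data passed to `progenyClust`, which is what the plot function requires.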

Value

returns plots as described in Details.

Author(s)

C.W. Hu, Rice University

References

Hu, C.W., et al. "Progeny Clustering: A Method to Identify Biological Phenotypes." Scientific Reports 5 (2015).
http://www.nature.com/articles/srep12894

Examples

# a 3-cluster 2-dimensional example dataset
data('test')

# default progeny clustering
pc <- progenyClust(test, ncluster=2:5)

# plot the scores to select the optimal cluster number
plot(pc)

# plot the clustering results with the optimal cluster number
plot(pc,test)

[Package progenyClust version 1.2 Index]