R: Dendrogram with according sample annotations

hca.plot {swamp}

R Documentation

Dendrogram with according sample annotations

Description

The function plots the dendrogram from a hierarchical cluster analysis with color-coded sample annotations below it.

Usage

hca.plot(g, o, method = "correlation", link = "ward", colored = palette(), 
         border = NA, code = colnames(o), cex.code = 1, 
         breaks = round(nrow(oreihe)/4), 
         cutcolors = colorpanel(breaks, low = "green", mid = "black", high = "red"))

Arguments

`g`	the input data in form of a matrix with features as rows and samples as columns.
`o`	the corresponding sample annotations in the form of a data.frame. A single sample annotation variable given as a vector is allowed and will be converted to a data.frame. rownames(o) must be identical to colnames(g). o can contain factors and numeric variables; character variables are not allowed. NAs are allowed and are plotted as blank spaces.
`method`	the distance method for the clustering. default="correlation". hcluster from the package amap is used and method must be one of "euclidean", "maximum", "manhattan", "canberra", "binary", "pearson", "correlation", "spearman" or "kendall".
`link`	the agglomeration principle for the clustering. default="ward". hcluster from the package amap is used and link must be one of "ward", "single", "complete", "average", "mcquitty", "median" or "centroid".
`colored`	a vector of colors used to color-code the factor variables of o. The default is the 8 colors of palette(). The first level is plotted in the first color, the second level in the second color, and so on. For annotations with more than 8 levels, additional colors should be supplied here.
`border`	a color for the borders of the annotation rectangles drawn with rect(). default=NA.
`code`	vector containing names of the sample annotations. default=colnames(o).
`cex.code`	font size of code.
`breaks`	a number that determines in how many bins a numeric annotation is cut using the cut() function.
`cutcolors`	a vector of colors in which numeric variables will be colored. length(cutcolors) must equal the number of breaks. The default is a colorpanel that plots the numeric values as a color gradient, with low values in green and high values in red.
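
The requirements on `o` and `colored` above can be checked before plotting. The following is a minimal sketch in base R, not part of the package: the toy data, the variable names (`is_chr`, `n_levels`, `my_colors`) and the choice of rainbow() for extra colors are all assumptions made for illustration.

```r
# Hypothetical toy data: 4 features x 6 samples
set.seed(1)
g <- matrix(rnorm(24), nrow = 4,
            dimnames = list(paste("Feature", 1:4), paste("Sample", 1:6)))

# Annotations read from a file often arrive as character columns
o <- data.frame(Group = c("A", "B", "C", "A", "B", "C"),
                Age   = c(31, 45, NA, 52, 60, 38),
                stringsAsFactors = FALSE)
rownames(o) <- colnames(g)

# Character variables are not allowed: convert them to factors
is_chr <- vapply(o, is.character, logical(1))
o[is_chr] <- lapply(o[is_chr], factor)

# rownames(o) must be identical to colnames(g)
stopifnot(identical(rownames(o), colnames(g)))

# Supply more colors only if some factor has more levels than palette() offers
n_levels  <- max(vapply(o, nlevels, integer(1)))
my_colors <- if (n_levels > length(palette())) rainbow(n_levels) else palette()

# With swamp installed, the call would then be:
# hca.plot(g, o, colored = my_colors)
```

The NA in `Age` is left in place on purpose, since NAs are allowed in the annotations.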

Details

The data are clustered using the amap package. The plot works for sample annotations given as a data.frame or as a single vector. NAs are allowed in both the data matrix and the sample annotation data.frame. If the annotation is a factor, its levels are colored in the order specified by colored. If the annotation is numeric, breaks and cutcolors are used; by default cutcolors is a colorpanel().
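
How a numeric annotation might be turned into colors can be illustrated with cut(). This is only a sketch of the binning idea described above, not the package's internal code. It uses base R's colorRampPalette() as a stand-in for gplots' colorpanel(), and the names `x`, `bins` and `cols` are hypothetical.

```r
# Hypothetical numeric annotation with one missing value
set.seed(1)
x <- c(rnorm(9), NA)

breaks <- 4
# Base-R stand-in for colorpanel(breaks, low = "green", mid = "black", high = "red")
cutcolors <- colorRampPalette(c("green", "black", "red"))(breaks)

# cut() assigns each value to one of `breaks` equal-width bins (NA stays NA)
bins <- cut(x, breaks = breaks, labels = FALSE)

# One color per sample; an NA color leaves a rectangle unfilled in rect()
cols <- cutcolors[bins]
```

Because each bin index selects one entry of `cutcolors`, the vector needs exactly as many colors as there are bins, which is why length(cutcolors) has to match breaks.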

Note

Requires the packages amap and gplots.
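
The clustering step behind the dendrogram can be sketched on its own. The code below assumes that samples (the columns of g) are clustered, so the matrix is transposed before calling amap's hcluster(); that transposition is an inference from the argument descriptions, not confirmed package internals. The base-R fallback is an approximation for machines without amap and is not expected to reproduce amap's results exactly.

```r
# Hypothetical toy data: 20 features x 10 samples
set.seed(100)
g <- matrix(rnorm(20 * 10), nrow = 20,
            dimnames = list(paste("Feature", 1:20), paste("Sample", 1:10)))

if (requireNamespace("amap", quietly = TRUE)) {
  # hcluster() clusters rows, so transpose to cluster the samples
  hc <- amap::hcluster(t(g), method = "correlation", link = "ward")
} else {
  # Approximate fallback: correlation distance between samples, Ward linkage
  hc <- hclust(as.dist(1 - cor(g)), method = "ward.D")
}

# Left-to-right order of the samples in the dendrogram
sample_order <- hc$order
```

The annotation rows would have to be arranged in this order so that they line up under the dendrogram leaves.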

Author(s)

Martin Lauss

Examples

# data as a matrix
set.seed(100)
g<-matrix(nrow=1000,ncol=50,rnorm(1000*50),dimnames=list(paste("Feature",1:1000),
   paste("Sample",1:50)))
g[1:100,26:50]<-g[1:100,26:50]+1 # the first 100 features show higher values in the samples 26:50
# patient annotations as a data.frame; annotations should be numeric or factors, not characters.
# rownames have to be the same as colnames of the data matrix 
set.seed(200)
o<-data.frame(Factor1=factor(c(rep("A",25),rep("B",25))),
              Factor2=factor(rep(c("A","B"),25)),
              Numeric1=rnorm(50),row.names=colnames(g))

## hca plot
hca.plot(g,o)

[Package swamp version 1.5.1 Index]