heat_tree {metacoder} | R Documentation |
Plot a taxonomic tree
Description
Plots the distribution of values associated with a taxonomic classification/hierarchy. Taxonomic classifications can have multiple roots, resulting in multiple trees on the same plot. A tree consists of elements, element properties, conditions, and mapping properties, which are represented as parameters in the heat_tree object. The elements (e.g. nodes, edges, labels, and individual trees) are the infrastructure of the heat tree. The element properties (e.g. size and color) are characteristics that are manipulated by various data conditions and mapping properties. The element properties can be explicitly defined or automatically generated. The conditions are data (e.g. taxon statistics, such as abundance) represented in the taxmap/metacoder object. The mapping properties are parameters (e.g. transformations, range, interval, and layout) used to change the elements/element properties and how they are used to represent (or not represent) the various conditions.
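The default method describes the classification with two parallel vectors, taxon_id and supertaxon_id. The following is a minimal base R sketch of how such vectors can encode several trees at once. It assumes that root taxa are marked with an NA supertaxon (an assumption, not something stated on this page), and the ids are made up for illustration.

```r
# Hypothetical ids. Assumption: a root taxon has NA as its supertaxon.
taxon_id      <- c("a", "b", "c", "d", "e")
supertaxon_id <- c(NA,  "a", "a", NA,  "d")

# Each root would start its own tree on the plot.
roots <- taxon_id[is.na(supertaxon_id)]

# Walk up the parent pointers to find the root that a taxon belongs to.
find_root <- function(id) {
  parent <- supertaxon_id[match(id, taxon_id)]
  while (!is.na(parent)) {
    id <- parent
    parent <- supertaxon_id[match(id, taxon_id)]
  }
  id
}

# The tree (identified by its root) that each taxon belongs to.
tree_of <- vapply(taxon_id, find_root, character(1), USE.NAMES = FALSE)
```

Here two roots are present, so a plot built from these vectors would contain two separate trees.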
Usage
heat_tree(...)
## S3 method for class 'Taxmap'
heat_tree(.input, ...)
## Default S3 method:
heat_tree(
taxon_id,
supertaxon_id,
node_label = NA,
edge_label = NA,
tree_label = NA,
node_size = 1,
edge_size = node_size,
node_label_size = node_size,
edge_label_size = edge_size,
tree_label_size = as.numeric(NA),
node_color = "#999999",
edge_color = node_color,
tree_color = NA,
node_label_color = "#000000",
edge_label_color = "#000000",
tree_label_color = "#000000",
node_size_trans = "area",
edge_size_trans = node_size_trans,
node_label_size_trans = node_size_trans,
edge_label_size_trans = edge_size_trans,
tree_label_size_trans = "area",
node_color_trans = "area",
edge_color_trans = node_color_trans,
tree_color_trans = "area",
node_label_color_trans = "area",
edge_label_color_trans = "area",
tree_label_color_trans = "area",
node_size_range = c(NA, NA),
edge_size_range = c(NA, NA),
node_label_size_range = c(NA, NA),
edge_label_size_range = c(NA, NA),
tree_label_size_range = c(NA, NA),
node_color_range = quantative_palette(),
edge_color_range = node_color_range,
tree_color_range = quantative_palette(),
node_label_color_range = quantative_palette(),
edge_label_color_range = quantative_palette(),
tree_label_color_range = quantative_palette(),
node_size_interval = range(node_size, na.rm = TRUE, finite = TRUE),
node_color_interval = NULL,
edge_size_interval = range(edge_size, na.rm = TRUE, finite = TRUE),
edge_color_interval = NULL,
node_label_max = 500,
edge_label_max = 500,
tree_label_max = 500,
overlap_avoidance = 1,
margin_size = c(0, 0, 0, 0),
layout = "reingold-tilford",
initial_layout = "fruchterman-reingold",
make_node_legend = TRUE,
make_edge_legend = TRUE,
title = NULL,
title_size = 0.08,
node_legend_title = "Nodes",
edge_legend_title = "Edges",
node_color_axis_label = NULL,
node_size_axis_label = NULL,
edge_color_axis_label = NULL,
edge_size_axis_label = NULL,
node_color_digits = 3,
node_size_digits = 3,
edge_color_digits = 3,
edge_size_digits = 3,
background_color = "#FFFFFF00",
output_file = NULL,
aspect_ratio = 1,
repel_labels = TRUE,
repel_force = 1,
repel_iter = 1000,
verbose = FALSE,
...
)
Arguments
... |
(other named arguments)
Passed to the igraph layout function used. |
.input |
An object of type Taxmap |
taxon_id |
The unique ids of taxa. |
supertaxon_id |
The unique id of each taxon's supertaxon. |
node_label |
See details on labels. Default: no labels. |
edge_label |
See details on labels. Default: no labels. |
tree_label |
See details on labels. The label to display above each graph. The value of the root of each graph will be used. Default: None. |
node_size |
See details on size. Default: constant size. |
edge_size |
See details on size. Default: relative to node size. |
node_label_size |
See details on size. Default: relative to node size. |
edge_label_size |
See details on size. Default: relative to edge size. |
tree_label_size |
See details on size. Default: relative to graph size. |
node_color |
See details on colors. Default: grey. |
edge_color |
See details on colors. Default: same as node color. |
tree_color |
See details on colors. The value of the root of each graph will be used. Overwrites the node and edge color if specified. Default: Not used. |
node_label_color |
See details on colors. Default: black. |
edge_label_color |
See details on colors. Default: black. |
tree_label_color |
See details on colors. Default: black. |
node_size_trans |
See details on transformations.
Default: "area" |
edge_size_trans |
See details on transformations.
Default: same as node_size_trans |
node_label_size_trans |
See details on transformations.
Default: same as node_size_trans |
edge_label_size_trans |
See details on transformations.
Default: same as edge_size_trans |
tree_label_size_trans |
See details on transformations.
Default: "area" |
node_color_trans |
See details on transformations.
Default: "area" |
edge_color_trans |
See details on transformations. Default: same as node color transformation. |
tree_color_trans |
See details on transformations.
Default: "area" |
node_label_color_trans |
See details on transformations.
Default: "area" |
edge_label_color_trans |
See details on transformations.
Default: "area" |
tree_label_color_trans |
See details on transformations.
Default: "area" |
node_size_range |
See details on ranges. Default: Optimize to balance overlaps and range size. |
edge_size_range |
See details on ranges. Default: relative to node size range. |
node_label_size_range |
See details on ranges. Default: relative to node size. |
edge_label_size_range |
See details on ranges. Default: relative to edge size. |
tree_label_size_range |
See details on ranges. Default: relative to tree size. |
node_color_range |
See details on ranges. Default: Color-blind friendly palette. |
edge_color_range |
See details on ranges. Default: same as node color. |
tree_color_range |
See details on ranges. Default: Color-blind friendly palette. |
node_label_color_range |
See details on ranges. Default: Color-blind friendly palette. |
edge_label_color_range |
See details on ranges. Default: Color-blind friendly palette. |
tree_label_color_range |
See details on ranges. Default: Color-blind friendly palette. |
node_size_interval |
See details on intervals.
Default: The range of values in node_size |
node_color_interval |
See details on intervals.
Default: The range of values in node_color |
edge_size_interval |
See details on intervals.
Default: The range of values in edge_size |
edge_color_interval |
See details on intervals.
Default: The range of values in edge_color |
node_label_max |
The maximum number of node labels. Default: 500. |
edge_label_max |
The maximum number of edge labels. Default: 500. |
tree_label_max |
The maximum number of tree labels. Default: 500. |
overlap_avoidance |
( |
margin_size |
( |
layout |
The layout algorithm used to position nodes.
See details on layouts.
Default: "reingold-tilford" |
initial_layout |
The layout algorithm used to set the initial position
of nodes, passed as input to the |
make_node_legend |
If TRUE, make a legend for node size/color mappings. |
make_edge_legend |
If TRUE, make a legend for edge size/color mappings. |
title |
Name to print above the graph. |
title_size |
The size of the title relative to the rest of the graph. |
node_legend_title |
The title of the legend for node data. Can be 'NA' or 'NULL' to remove the title. |
edge_legend_title |
The title of the legend for edge data. Can be 'NA' or 'NULL' to remove the title. |
node_color_axis_label |
The label on the scale axis corresponding to node_color |
node_size_axis_label |
The label on the scale axis corresponding to node_size |
edge_color_axis_label |
The label on the scale axis corresponding to edge_color |
edge_size_axis_label |
The label on the scale axis corresponding to edge_size |
node_color_digits |
The number of significant figures used for the numbers on the scale axis corresponding to node_color |
node_size_digits |
The number of significant figures used for the numbers on the scale axis corresponding to node_size |
edge_color_digits |
The number of significant figures used for the numbers on the scale axis corresponding to edge_color |
edge_size_digits |
The number of significant figures used for the numbers on the scale axis corresponding to edge_size |
background_color |
The background color of the plot. Default: Transparent |
output_file |
The path to one or more files to save the plot in using |
aspect_ratio |
The aspect ratio of the plot. |
repel_labels |
If |
repel_force |
The force with which overlapping labels are repelled from each other. |
repel_iter |
The number of iterations used when repelling labels. |
verbose |
If |
labels
Labels can be added to nodes, edges, and trees. Node labels are centered over their node. Edge labels are displayed over edges, in the same orientation. Tree labels are displayed over their tree.
Accepts a vector the same length as taxon_id, or a factor of its length.
sizes
The size of nodes, edges, labels, and trees can be mapped to various conditions. This is useful for displaying statistics for taxa, such as abundance. Only the relative size of the condition is used, not the values themselves. The <element>_size_trans (transformation) parameter can be used to make the size mapping non-linear. The <element>_size_range parameter can be used to proportionately change the size of an element based on the condition mapped to that element. The <element>_size_interval parameter can be used to change the limit at which a condition will be graphically represented as the same size as the minimum/maximum <element>_size_range.
Accepts a numeric vector the same length as taxon_id, or a factor of its length.
colors
The colors of nodes, edges, labels, and trees can be mapped to various conditions. This is useful for visually highlighting/clustering groups of taxa. Only the relative size of the condition is used, not the values themselves. The <element>_color_trans (transformation) parameter can be used to make the color mapping non-linear. The <element>_color_range parameter can be used to proportionately change the color of an element based on the condition mapped to that element. The <element>_color_interval parameter can be used to change the limit at which a condition will be graphically represented as the same color as the minimum/maximum <element>_color_range.
Accepts a vector the same length as taxon_id, or a factor of its length.
If a numeric vector is given, it is mapped to a color scale.
Hex values or color names can be used (e.g. "#000000" or "black").
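The mapping of a numeric vector onto a color scale can be sketched in base R with grDevices::colorRamp. This only illustrates the idea and is not metacoder's internal implementation; the values and the two-color range are made up.

```r
# Made-up condition values and a two-color range (name and hex code mixed).
values <- c(1, 5, 10)
color_range <- c("black", "#FFFFFF")

# Rescale the values to 0..1; only their relative size matters.
scaled <- (values - min(values)) / (max(values) - min(values))

# Interpolate along the color range and convert the RGB matrix to hex codes.
ramp <- grDevices::colorRamp(color_range)
node_colors <- grDevices::rgb(ramp(scaled), maxColorValue = 255)
```

The smallest value receives the first color in the range and the largest value receives the last; everything else falls in between.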
Mapping Properties
transformations
Before any conditions specified are mapped to an element property (color/size), they can be transformed to make the mapping non-linear. Any of the transformations listed below can be used by specifying their name. A customized function can also be supplied to do the transformation.
- "linear": Proportional to radius/diameter of node
- "area": Circular area; better perceptual accuracy than "linear"
- "log10": Log base 10 of radius
- "log2": Log base 2 of radius
- "ln": Log base e of radius
- "log10 area": Log base 10 of circular area
- "log2 area": Log base 2 of circular area
- "ln area": Log base e of circular area
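The difference between "linear" and "area" can be sketched in base R as follows. The sketch assumes that "area" treats the value as the area of a circle, so the drawn radius grows with the square root of the value; the function names are hypothetical and this is not the package's internal code.

```r
# Hypothetical helpers, for illustration only.
trans_linear <- function(x) x             # value taken as the radius
trans_area   <- function(x) sqrt(x / pi)  # value taken as the circle's area

values <- c(1, 4, 16)

# Radii relative to the smallest value under each transformation.
linear_ratio <- trans_linear(values) / trans_linear(values[1])
area_ratio   <- trans_area(values) / trans_area(values[1])

# A custom function can be written in the same shape: one numeric vector
# in, one numeric vector of the same length out.
trans_cube_root <- function(x) x^(1 / 3)
```

Under "linear" a 16-fold larger value gives a 16-fold larger radius, while under "area" it gives a 4-fold larger radius, which keeps the visible area proportional to the value.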
ranges
The displayed range of colors and sizes can be explicitly defined or automatically generated.
When explicitly used, the size range will proportionately increase/decrease the size of a particular element.
Size ranges are specified by supplying a numeric vector with two values: the minimum and maximum.
The units used should be between 0 and 1, representing the proportion of a dimension of the graph.
Since the dimensions of the graph are determined by layout, and not always square, the value that 1 corresponds to is the square root of the graph area (i.e. the side of a square with the same area as the plotted space).
Color ranges can be any number of color values as either HEX codes (e.g. "#000000") or color names (e.g. "black").
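A minimal base R sketch of how a size range can be applied: values are rescaled linearly into the range, and the resulting proportions are converted to plot units using the square root of the plot area. The helper name and the plot dimensions are made up for illustration; the package's own optimization of default ranges is not reproduced here.

```r
# Hypothetical helper: rescale values linearly into c(min, max).
rescale_to_range <- function(x, size_range) {
  prop <- (x - min(x)) / (max(x) - min(x))
  size_range[1] + prop * diff(size_range)
}

# Proportional sizes for three made-up values.
sizes <- rescale_to_range(c(0, 5, 10), c(0.01, 0.1))

# A size of 1 corresponds to the side of a square with the same area
# as the plotted space (here a made-up 4 x 9 plot).
plot_width  <- 4
plot_height <- 9
unit_side   <- sqrt(plot_width * plot_height)

# Sizes in the same units as the plot dimensions.
absolute_sizes <- sizes * unit_side
```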
layout
Layouts determine the position of node elements on the graph.
They are implemented using the igraph package.
Any additional arguments passed to heat_tree are passed to the igraph function used.
The following character values are understood:
- "automatic"
- "reingold-tilford": Use as_tree. A circular tree-like layout.
- "davidson-harel": Use with_dh. A type of simulated annealing.
- "gem": Use with_gem. A force-directed layout.
- "graphopt": Use with_graphopt. A force-directed layout.
- "mds": Use with_mds. Multidimensional scaling.
- "fruchterman-reingold": Use with_fr. A force-directed layout.
- "kamada-kawai": Use with_kk. A layout based on a physical model of springs.
- "large-graph": Use with_lgl. Meant for larger graphs.
- "drl": Use with_drl. A force-directed layout.
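When passing extra arguments through heat_tree, it helps to know which igraph function will receive them. The list above can be written as a lookup table; the sketch below is only a convenience for reading the igraph documentation, and the helper name is hypothetical.

```r
# Layout names and the igraph functions listed above. No function is
# named above for "automatic", so it is left as NA here.
layout_functions <- c(
  "automatic"            = NA_character_,
  "reingold-tilford"     = "as_tree",
  "davidson-harel"       = "with_dh",
  "gem"                  = "with_gem",
  "graphopt"             = "with_graphopt",
  "mds"                  = "with_mds",
  "fruchterman-reingold" = "with_fr",
  "kamada-kawai"         = "with_kk",
  "large-graph"          = "with_lgl",
  "drl"                  = "with_drl"
)

# Hypothetical helper: look up the igraph function for a layout name.
igraph_function_for <- function(layout) {
  if (!layout %in% names(layout_functions)) {
    stop("Unknown layout: ", layout)
  }
  layout_functions[[layout]]
}

igraph_function_for("davidson-harel")
```

The arguments accepted by, for example, with_dh are then the ones that can be forwarded when layout = "davidson-harel".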
intervals
This is the minimum and maximum of values displayed on the legend scales.
Intervals are specified by supplying a numeric vector with two values: the minimum and maximum.
When explicitly used, the <element>_<property>_interval will redefine the way the actual conditional values are represented by setting a limit for the <element>_<property>.
Any condition below the minimum <element>_<property>_interval will be graphically represented the same as a condition AT the minimum value in the full range of conditional values.
Any value above the maximum <element>_<property>_interval will be graphically represented the same as a value AT the maximum value in the full range of conditional values.
By default, the minimum and maximum equal the range of the values used to infer the value of the <element>_<property>.
Setting a custom interval is useful for making <element>_<properties> in multiple graphs correspond to the same conditions, or for setting logical boundaries (such as c(0,1) for proportions).
Note that this is different from the <element>_<property>_range mapping property, which determines the size/color of graphed elements.
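The effect of an interval can be sketched in base R as clamping: values outside the interval are treated like the nearest interval limit before they are mapped to a size or color. The helper name and the values are made up, and this is an illustration rather than the package's internal code.

```r
# Hypothetical helper: clamp values to c(min, max).
apply_interval <- function(x, interval) {
  pmin(pmax(x, interval[1]), interval[2])
}

# Made-up condition values, e.g. the number of observations per taxon.
n_obs <- c(2, 10, 50, 100, 400)

# With an interval of c(10, 100), 2 is drawn like 10 and 400 like 100.
clamped <- apply_interval(n_obs, c(10, 100))
```

Using the same interval in several plots makes a given size or color correspond to the same value in each of them.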
Acknowledgements
This package includes code from the R package ggrepel to handle label overlap avoidance, with permission from the author of ggrepel, Kamil Slowikowski. We included the code instead of depending on ggrepel because we are using functions internal to ggrepel that might change in the future. We thank Kamil Slowikowski for letting us use his code and would like to acknowledge his implementation of the label overlap avoidance used in metacoder.
Examples
## Not run:
# Parse dataset for plotting
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
class_regex = "^(.+)__(.+)$")
# Default appearance:
# No parameters are needed, but the default tree is not too useful
heat_tree(x)
# A good place to start:
# There will always be "taxon_names" and "n_obs" variables, so this is a
# good place to start. This will show the number of OTUs in this case.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs)
# Plotting read depth:
# To plot read depth, you first need to add up the number of reads per taxon.
# The function `calc_taxon_abund` is good for this.
x$data$taxon_counts <- calc_taxon_abund(x, data = "tax_data")
x$data$taxon_counts$total <- rowSums(x$data$taxon_counts[, -1]) # -1 = taxon_id column
heat_tree(x, node_label = taxon_names, node_size = total, node_color = total)
# Plotting multiple variables:
# You can plot up to 4 quantitative variables using node/edge size/color, but it
# is usually best to use 2 or 3. The plot below uses node size for number of
# OTUs, node color for number of reads, and edge color for number of samples
x$data$n_samples <- calc_n_samples(x, data = "taxon_counts")
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = total,
edge_color = n_samples)
# Different layouts:
# You can use any layout implemented by igraph. You can also specify an
# initial layout to seed the main layout with.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
layout = "davidson-harel")
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
layout = "davidson-harel", initial_layout = "reingold-tilford")
# Axis labels:
# You can add custom labels to the legends
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = total,
edge_color = n_samples, node_size_axis_label = "Number of OTUs",
node_color_axis_label = "Number of reads",
edge_color_axis_label = "Number of samples")
# Overlap avoidance:
# You can change how much node overlap avoidance is used.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
overlap_avoidance = .5)
# Label overlap avoidance:
# You can modify how label scattering is handled using the `repel_force` and
# `repel_iter` options. You can turn off label scattering using the `repel_labels` option.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
repel_force = 2, repel_iter = 20000)
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
repel_labels = FALSE)
# Setting the size of graph elements:
# You can force nodes, edges, and labels to be a specific size/color range instead
# of letting the function optimize it. These options end in `_range`.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
node_size_range = c(0.01, .1))
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
edge_color_range = c("black", "#FFFFFF"))
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
node_label_size_range = c(0.02, 0.02))
# Setting the transformation used:
# You can change how raw statistics are converted to color/size using options
# ending in `_trans`.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
node_size_trans = "log10 area")
# Setting the interval displayed:
# By default, the whole range of the statistic provided will be displayed.
# You can set what range of values are displayed using options ending in `_interval`.
heat_tree(x, node_label = taxon_names, node_size = n_obs, node_color = n_obs,
node_size_interval = c(10, 100))
## End(Not run)