R: Main function for computing contributional *beta* diversity

beta_div_contrib {FuncDiv}

R Documentation

Main function for computing contributional beta diversity

Description

Based on joint taxa-function input data (i.e., contributional data), the beta diversity (i.e., inter-sample distance or divergence) will be computed for the subset of taxa encoding each individual function separately. A large List object containing all these tables can be returned, or alternatively these tables will be written to the disk as plain-text files.

Usage

beta_div_contrib(
  metrics = NULL,
  func_tab = NULL,
  abun_tab = NULL,
  contrib_tab = NULL,
  in_tree = NULL,
  func_ids = NULL,
  return_objects = FALSE,
  write_outfiles = FALSE,
  outdir = NULL,
  ncores = 1,
  samp_colname = "sample",
  func_colname = "function.",
  taxon_colname = "taxon",
  abun_colname = "taxon_abun"
)

Arguments

`metrics`	beta diversity metrics to compute. Must be default metric computed by `parallelDist::parDist` or one of "weighted_unifrac", "unweighted_unifrac", or "jensen_shannon_div".
`func_tab`	data.frame object containing function copy numbers, with rows as functions and columns as taxa. Required if `abun_tab` is specified, and is mutually exclusive with `contrib_tab`.
`abun_tab`	data.frame object containing taxonomic abundances across samples, with rows as taxa and columns as samples. Required if `func_tab` is specified, and is mutually exclusive with `contrib_tab`.
`contrib_tab`	data.frame object containing combined taxa abundances and function copy numbers across taxa. Must contain columns corresponding to the sample ids, function ids, taxa ids, and taxa abundances within samples. These column names are specified by the `samp_colname`, `func_colname`, `taxon_colname`, and `abun_colname`, respectively.Mutually exclusive with `abun_tab` and `func_tab`.
`in_tree`	phylo object to use if `weighted_unifrac` or `unweighted_unifrac` are specified.
`func_ids`	character vector specifying subset of function ids to include for analysis. Will analyze all functions present if this is not specified.
`return_objects`	Boolean vector of length one, specifying whether function should return a list of all output distance tables (nested by metric name, and then by function id). Incompatible with `write_outfiles`.
`write_outfiles`	Boolean vector of length one, specifying whether function write all distance tables to plain-text files in the specified `outdir` location. Incompatible with `return_objects`.
`outdir`	character vector of length one, indicating where to save output files if `write_outfiles = TRUE`.
`ncores`	integer indicating number of cores to use for parallelizable steps.
`samp_colname`	sample id column name of `contrib_tab` input data.frame.
`func_colname`	function id column name of `contrib_tab` input data.frame.
`taxon_colname`	taxon id column name of `contrib_tab` input data.frame.
`abun_colname`	taxonomic abundance (within each sample) column name of `contrib_tab` input data.frame.

Details

Input data can be either a separate function copy number and taxonomic abundance table, or a joint contributional table. Metrics must be one of "weighted_unifrac", "unweighted_unifrac", "jensen_shannon_div", or a default metric available through the parallelDist::parDist function. See ?parallelDist::parDist for a description of all default metrics.

The taxonomic abundances will be converted to relative abundances prior to computing inter-sample distances.

Value

differs depending on the return_objects and write_outfiles parameters.

If return_objects = TRUE, then a nested List will be returned. Each specific beta diversity metric will be the first level, and the functions are the second level (e.g., contrib_beta$binary$func2).

If write_outfiles then a character vector will be returned, indicating where the output tables were written.

Examples

# First, simulate some (non-realistic) data.
set.seed(123)
test_tree <- ape::rtree(100)
test_abun <- data.frame(matrix(rnorm(500), nrow = 100, ncol = 5))
rownames(test_abun) <- test_tree$tip.label
colnames(test_abun) <- c("sample1", "sample2", "sample3", "sample4", "sample5")
test_abun[test_abun < 0] <- 0
test_func <- data.frame(matrix(sample(c(0L, 1L), 200, replace = TRUE),
                               nrow = 2, ncol = 100))
colnames(test_func) <- test_tree$tip.label
rownames(test_func) <- c("func1", "func2")

# Compute beta diversity, based on Weighted UniFrac and Jaccard distances
# (i.e., "binary").
contrib_beta <- beta_div_contrib(metrics = c("weighted_unifrac", "binary"),
                                 func_tab = test_func,
                                 abun_tab = test_abun,
                                 in_tree = test_tree,
                                 return_objects = TRUE,
                                 ncores = 1)

# Parse beta diversity distance list value for a specific function (func2) and
# distance metric (Jaccard).
contrib_beta$binary$func2

[Package FuncDiv version 1.0.0 Index]