specificity_scores {POMS}    R Documentation

Compute the shrunken specificity score of each feature, which represents how strongly the presence of the feature is associated with a given sample grouping.

Description

This code replicates the environmental specificity score introduced in phylogenize. The code here is modified from the phylogenize code base (https://bitbucket.org/pbradz/phylogenize/src/master/package/phylogenize/R/; commit 6f1bdba9c5a9ff04e90a8ad77bcee8ec9281730d).

Usage

specificity_scores(
  abun_table,
  meta_table,
  focal_var_level,
  var_colname,
  sample_colname,
  silence_citation = FALSE
)

Arguments

abun_table

abundance table to use for computing specificity. Features must be rows and samples must be columns. All values greater than 0 will be interpreted as present.

meta_table

data frame containing metadata for all samples. Must include at least one column with the sample ids and one column with the metadata variable of interest.

focal_var_level

length-one character vector specifying the variable value to restrict inferences of prevalence to. In other words, prevalence will be computed based on the set of samples that have this value of the variable of interest in the metadata table.

var_colname

length-one character vector specifying the name of the column in the metadata table that contains the metadata of interest (i.e., the column where focal_var_level can be found).

sample_colname

length-one character vector specifying the name of the column in the metadata table that contains the sample ids.

silence_citation

length-one Boolean vector specifying whether to silence the message notifying the user about the phylogenize package and paper.
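
The input layout described by the arguments above can be sketched as follows. This is a minimal illustration on assumed toy data: the column names "sample_id" and "group", the level "gut", and all values are hypothetical, and real inputs would be far larger. The prevalence calculation only mirrors the documented presence/absence rule; it is not the shrunken score and not necessarily how the function computes it internally.

```r
# Toy abundance table: features are rows, samples are columns (hypothetical data).
abun_table <- data.frame(
  S1 = c(5, 0, 2),
  S2 = c(0, 0, 7),
  S3 = c(3, 1, 0),
  S4 = c(0, 4, 0),
  row.names = c("featA", "featB", "featC")
)

# Toy metadata: one column of sample ids and one column with the variable of interest.
meta_table <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4"),
  group = c("gut", "gut", "soil", "soil"),
  stringsAsFactors = FALSE
)

# All values greater than 0 are interpreted as present.
presence <- abun_table > 0

# Samples whose metadata value matches the focal level.
focal_samples <- meta_table$sample_id[meta_table$group == "gut"]

# Raw (unshrunken) prevalence of each feature within the focal samples.
focal_prevalence <- rowMeans(presence[, focal_samples, drop = FALSE])

# With POMS installed, inputs of this shape would be passed as:
# specificity_scores(abun_table = abun_table,
#                    meta_table = meta_table,
#                    focal_var_level = "gut",
#                    var_colname = "group",
#                    sample_colname = "sample_id")
```

Here var_colname and sample_colname name columns of meta_table, while focal_var_level is a value that occurs within the var_colname column.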

Details

This algorithm is described in detail in Bradley et al. 2018. Phylogeny-corrected identification of microbial gene families relevant to human gut colonization. PLOS Computational Biology.

Note that there can be some random fluctuations between re-runs of this function. The differences are usually minor, but users are strongly encouraged to set a random seed before use to ensure that their workflow is reproducible.
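
One way to follow the seeding advice is to fix the seed immediately before each call. The wrapper below is a hypothetical convenience function, not part of POMS, and the seed value 1234 is arbitrary. It is only defined here, not run, since calling it requires POMS and real data. The last lines show the same principle with base R alone.

```r
# Hypothetical helper (not part of POMS): sets the RNG seed right before the
# call so that repeated runs on the same inputs use the same random state.
seeded_specificity <- function(abun_table, meta_table, focal_var_level,
                               var_colname, sample_colname, seed = 1234) {
  set.seed(seed)
  POMS::specificity_scores(
    abun_table = abun_table,
    meta_table = meta_table,
    focal_var_level = focal_var_level,
    var_colname = var_colname,
    sample_colname = sample_colname,
    silence_citation = TRUE
  )
}

# The same principle with base R only: resetting the seed reproduces the
# same random draws.
set.seed(1234)
first_draw <- runif(3)
set.seed(1234)
second_draw <- runif(3)
identical(first_draw, second_draw)
```

Setting the seed inside the wrapper, rather than once at the top of a script, keeps the result independent of any other random operations that run earlier in the workflow.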

Value

Numeric vector with the specificity score for each input feature (i.e., for each row of abun_table).
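
Because the result holds one score per row of abun_table, it can be paired with the feature ids for ranking. The sketch below assumes the scores are returned in the same order as the rows of abun_table. The feature ids and score values are hypothetical placeholders standing in for real output, not results produced by the function.

```r
# Hypothetical placeholders standing in for rownames(abun_table) and for the
# vector returned by specificity_scores (not real results).
feature_ids <- c("featA", "featB", "featC")
scores <- c(0.8, -0.3, 0.1)

# One score per feature row, so the lengths should match.
stopifnot(length(scores) == length(feature_ids))

# Pair each score with its feature id, then rank from highest to lowest.
score_table <- data.frame(feature = feature_ids, specificity = scores)
score_table <- score_table[order(score_table$specificity, decreasing = TRUE), ]
```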


[Package POMS version 1.0.1 Index]