| trans_norm {microeco} | R Documentation |
Feature abundance normalization/transformation.
Description
Feature abundance normalization/transformation for a microtable object or data.frame object.
Methods
Public methods
Method new()
Get a transposed abundance table if the input is a microtable object. In the table, rows are samples and columns are features, which keeps subsequent operations consistent with traditional ecological methods.
Usage
trans_norm$new(dataset = NULL)
Arguments
dataset: the microtable object or data.frame object. If it is a data.frame object, please make sure that rows are samples and columns are features.
Returns
data_table, stored in the object.
Examples
library(microeco)
data(dataset)
t1 <- trans_norm$new(dataset = dataset)
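When the input is a data.frame rather than a microtable object, the orientation requirement above (rows are samples, columns are features) has to be met by the caller. A minimal base-R sketch, assuming a toy table stored the other way round; the object names `abund` and `abund_t` are hypothetical and used only for illustration:

```r
# Hypothetical toy table: 3 features (rows) x 2 samples (columns)
abund <- data.frame(
  S1 = c(10, 0, 5),
  S2 = c(3, 7, 0),
  row.names = c("OTU1", "OTU2", "OTU3")
)

# Transpose so that rows are samples and columns are features
abund_t <- as.data.frame(t(abund))

dim(abund_t)       # samples x features
rownames(abund_t)  # sample names

# With microeco installed, the transposed table could then be passed on:
# t1 <- trans_norm$new(dataset = abund_t)
```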
Method norm()
Normalization/transformation methods.
Usage
trans_norm$norm(
  method = "rarefy",
  sample.size = NULL,
  rngseed = 123,
  replace = TRUE,
  pseudocount = 1,
  intersect.no = 10,
  ct.min = 1,
  condition = NULL,
  MARGIN = NULL,
  logbase = 2,
  ...
)
Arguments
method: default "rarefy"; see the following available options.

Methods for normalization:

- "rarefy": classic rarefaction based on the R sample function.
- "SRS": scaling with ranked subsampling, based on the SRS package provided by Lukas Beule and Petr Karlovsky (2020) <doi:10.7717/peerj.9593>.
- "clr": Centered log-ratio normalization <ISBN:978-0-412-28060-3> <doi:10.3389/fmicb.2017.02224>. It is defined as
    clr_{ki} = \log\frac{x_{ki}}{g(x_i)}
  where x_{ki} is the abundance of the kth feature in sample i, and g(x_i) is the geometric mean of abundances for sample i. A pseudocount needs to be added to deal with zeros. For more information, please see the 'clr' method in the decostand function of the vegan package.
- "rclr": Robust centered log-ratio normalization <doi:10.1128/msystems.00016-19>. It is defined as
    rclr_{ki} = \log\frac{x_{ki}}{g(x_i > 0)}
  where x_{ki} is the abundance of the kth feature in sample i, and g(x_i > 0) is the geometric mean of abundances (> 0) for sample i. In rclr, zero values are kept as zeroes and are not taken into account.
- "GMPR": Geometric mean of pairwise ratios <doi:10.7717/peerj.4600>. For a given sample i, the size factor s_i is defined as
    s_i = \biggl( \prod_{j=1}^{n} \mathrm{Median}_{k|c_{ki}c_{kj} \ne 0} \biggl\lbrace \dfrac{c_{ki}}{c_{kj}} \biggr\rbrace \biggr)^{1/n}
  where k denotes all the features, and n denotes all the samples. For sample i, GMPR = \frac{x_{i}}{s_i}, where x_i is the feature abundances of sample i.
- "CSS": Cumulative sum scaling normalization based on the metagenomeSeq package <doi:10.1038/nmeth.2658>. For a given sample j, the scaling factor s_{j}^{l} is defined as
    s_{j}^{l} = \sum_{i|c_{ij} \leqslant q_{j}^{l}} c_{ij}
  where q_{j}^{l} is the lth quantile of sample j, that is, in sample j there are l features with counts smaller than q_{j}^{l}. c_{ij} denotes the count (abundance) of feature i in sample j. For l = 0.95m (m is the feature number), q_{j}^{l} corresponds to the 95th percentile of the count distribution for sample j. The normalized counts are
    \tilde{c_{ij}} = \Bigl( \frac{c_{ij}}{s_{j}^{l}} \Bigr) N
  where N is an appropriately chosen normalization constant.
- "TSS": Total sum scaling. Abundance is divided by the sequencing depth. For a given sample j, the normalized count is defined as
    \tilde{c_{ij}} = \frac{c_{ij}}{\sum_{i=1}^{N_{j}} c_{ij}}
  where c_{ij} is the count of feature i in sample j, and N_{j} is the number of features in sample j.
- "eBay": Empirical Bayes approach to normalization <doi:10.1186/s12859-020-03552-z>. The implemented method is not tree-related. In the output, the sum of each sample is 1.
- "TMM": Trimmed mean of M-values method based on the normLibSizes function of the edgeR package <doi:10.1186/gb-2010-11-3-r25>.
- "DESeq2": Median ratio of gene counts relative to the geometric mean per gene, based on the DESeq function of the DESeq2 package <doi:10.1186/s13059-014-0550-8>. This option can invoke the trans_diff class and extract the normalized data from the original result. Note that either group or formula should be provided. The scaling factor is defined as
    s_{j} = \mathrm{Median}_{i} \frac{c_{ij}}{\bigl( \prod_{j=1}^{n} c_{ij} \bigr)^{1/n}}
  where c_{ij} is the count of feature i in sample j, and n is the total sample number.
- "Wrench": Group-wise and sample-wise compositional bias factor <doi:10.1186/s12864-018-5160-5>. Note that the condition parameter is required; it is passed to the condition parameter of the wrench function in the Wrench package. Because the input data must be a microtable object, the condition parameter can be a column name of sample_table. The scaling factor is defined as
    s_{j} = \frac{1}{p} \sum_{ij} W_{ij} \frac{X_{ij}}{\overline{X_{i}}}
  where X_{ij} represents the relative abundance (proportion) for feature i in sample j, \overline{X_{i}} is the average proportion of feature i across the dataset, W_{ij} represents a weight specific to each technique, and p is the number of features in the sample.
- "RLE": Relative log expression.

Methods based on the decostand function of the vegan package:

- "total": divide by margin total (default MARGIN = 1, i.e. rows - samples).
- "max": divide by margin maximum (default MARGIN = 2, i.e. columns - features).
- "normalize": make margin sum of squares equal to one (default MARGIN = 1).
- "range": standardize values into range 0...1 (default MARGIN = 2). If all values are constant, they will be transformed to 0.
- "standardize": scale x to zero mean and unit variance (default MARGIN = 2).
- "pa": scale x to presence/absence scale (0/1).
- "log": logarithmic transformation.

Other methods for transformation:

- "AST": Arc sine square root transformation.
sample.size: default NULL; library size for rarefaction when method = "rarefy" or "SRS". If not provided, the minimum number across all samples is used. For the "SRS" method, this parameter is passed to the Cmin parameter of the SRS function of the SRS package.
rngseed: default 123; random seed. Available when method = "rarefy" or "SRS".
replace: default TRUE; see sample for the random sampling. Available when method = "rarefy".
pseudocount: default 1; pseudocount added for those features with 0 abundance when method = "clr".
intersect.no: default 10; the number of intersecting taxa between paired samples for method = "GMPR".
ct.min: default 1; the minimum number of counts required to calculate ratios for method = "GMPR".
condition: default NULL; only available when method = "Wrench". This parameter is passed to the condition parameter of the wrench function in the Wrench package. It must be a column name of sample_table or a vector with the same length as the number of samples.
MARGIN: default NULL; 1 = samples, and 2 = features of the abundance table; only available when the method comes from the decostand function of the vegan package. If MARGIN is NULL, the default value in the decostand function is used.
logbase: default 2; the logarithm base.
...: parameters passed to vegan::decostand, or metagenomeSeq::cumNorm when method = "CSS", or edgeR::normLibSizes when method = "TMM" or "RLE", or the trans_diff class when method = "DESeq2", or the wrench function of the Wrench package when method = "Wrench".
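The "clr" and "rclr" definitions given for the method argument can be sketched directly in base R. This is only an illustration of the two formulas, not the package's implementation: `clr_sketch` and `rclr_sketch` are hypothetical helper names, the toy table is made up, and the natural logarithm is assumed.

```r
# Toy abundance table: rows are samples, columns are features
x <- matrix(c(10, 0, 5,
               3, 7, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("S1", "S2"), c("F1", "F2", "F3")))

# clr: add a pseudocount, then subtract the per-sample mean of the logs,
# which is the same as log(x / geometric mean of the sample)
clr_sketch <- function(x, pseudocount = 1) {
  lx <- log(x + pseudocount)
  lx - rowMeans(lx)
}

# rclr: the geometric mean uses only the non-zero entries of each sample;
# zeros are left as zeros and no pseudocount is added
rclr_sketch <- function(x) {
  t(apply(x, 1, function(v) {
    pos <- v > 0
    out <- v * 0                      # keeps feature names, all zeros
    out[pos] <- log(v[pos]) - mean(log(v[pos]))
    out
  }))
}

clr_values  <- clr_sketch(x)
rclr_values <- rclr_sketch(x)
```

In both sketches the transformed values of a sample sum to zero (for rclr, over its non-zero features), which is a quick sanity check on the geometric-mean centering.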
Returns
new microtable object or data.frame object.
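To make the scaling-factor definitions for "TSS" and "DESeq2" more concrete, here is a minimal base-R sketch of the two formulas on a toy table. It does not call DESeq2 or microeco; `size_factors_sketch` is a hypothetical helper, and dropping features that contain a zero before taking the median is an assumption made here so that the geometric mean stays positive.

```r
# Toy abundance table: rows are samples, columns are features.
# S2 is sequenced twice as deeply as S1 for F1..F3.
x <- matrix(c(10, 20, 30, 0,
              20, 40, 60, 5),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("S1", "S2"), c("F1", "F2", "F3", "F4")))

# TSS: divide each count by the total count of its sample
tss <- x / rowSums(x)

# Median-of-ratios size factor: for each sample, the median over features
# of count / (geometric mean of that feature across samples)
size_factors_sketch <- function(x) {
  log_geo_mean <- colMeans(log(x))    # -Inf when a feature has a zero count
  keep <- is.finite(log_geo_mean)     # assumption: skip such features
  apply(x, 1, function(v) exp(median(log(v[keep]) - log_geo_mean[keep])))
}

sf <- size_factors_sketch(x)
normalized <- x / sf                  # divide each sample by its size factor
```

In this toy case the size factor of S2 is twice that of S1, so the normalized counts of F1 to F3 coincide between the two samples.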
Examples
newdataset <- t1$norm(method = "clr")
newdataset <- t1$norm(method = "log")
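For intuition about method = "rarefy", which is described as classic rarefaction based on the R sample function, the idea can be sketched for a single sample in base R. `rarefy_sketch` is a hypothetical helper written for this illustration; its arguments mirror sample.size, rngseed and replace, but how the package itself draws the reads is not shown in this page, so treat the details as assumptions.

```r
# Subsample one sample's counts down to a fixed library size
rarefy_sketch <- function(counts, sample.size, rngseed = 123, replace = TRUE) {
  set.seed(rngseed)
  # Expand the counts into one entry per read, labelled by feature index
  reads <- rep(seq_along(counts), times = counts)
  # Draw sample.size reads (index-based to avoid sample()'s scalar shortcut)
  picked <- reads[sample.int(length(reads), size = sample.size, replace = replace)]
  # Count how many reads of each feature were drawn
  out <- tabulate(picked, nbins = length(counts))
  names(out) <- names(counts)
  out
}

counts <- c(F1 = 50, F2 = 30, F3 = 0, F4 = 20)

rarefied       <- rarefy_sketch(counts, sample.size = 40)
rarefied_norep <- rarefy_sketch(counts, sample.size = 40, replace = FALSE)
```

With replace = FALSE no feature can end up with more reads than it started with; with replace = TRUE that bound does not hold, which is the practical difference between the two settings.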
Method clone()
The objects of this class are cloneable with this method.
Usage
trans_norm$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
Examples
## ------------------------------------------------
## Method `trans_norm$new`
## ------------------------------------------------
library(microeco)
data(dataset)
t1 <- trans_norm$new(dataset = dataset)
## ------------------------------------------------
## Method `trans_norm$norm`
## ------------------------------------------------
newdataset <- t1$norm(method = "clr")
newdataset <- t1$norm(method = "log")
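## ------------------------------------------------
## Sketch of the "GMPR" size factor formula
## ------------------------------------------------

The size factor defined for method = "GMPR" can also be written out in a few lines of base R. This is a simplified sketch of the formula only: `gmpr_size_factors_sketch` is a hypothetical helper, the toy table is made up, and the intersect.no and ct.min thresholds are deliberately ignored, so it assumes every pair of samples shares at least one non-zero feature.

```r
# Toy abundance table: rows are samples, columns are features
x <- matrix(c(10, 20, 30, 0,
              20, 40, 60, 5,
               5, 10, 15, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("S1", "S2", "S3"), c("F1", "F2", "F3", "F4")))

gmpr_size_factors_sketch <- function(x) {
  n <- nrow(x)
  s <- sapply(seq_len(n), function(i) {
    # Median count ratio of sample i to every sample j,
    # using only features that are non-zero in both
    log_medians <- sapply(seq_len(n), function(j) {
      shared <- x[i, ] > 0 & x[j, ] > 0
      log(median(x[i, shared] / x[j, shared]))
    })
    # Geometric mean of the n medians
    exp(mean(log_medians))
  })
  names(s) <- rownames(x)
  s
}

s <- gmpr_size_factors_sketch(x)
gmpr_normalized <- x / s              # divide each sample by its size factor
```

Working on the log scale turns the product in the formula into a mean, which avoids overflow when there are many samples.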