rv2Transformer {countTransformers}    R Documentation

Root and VOOM Based Count Transformation Minimizing Sum of Sample-Specific Squared Difference

Description

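Applies a root and VOOM based transformation to a count matrix. The transformation parameter is chosen to minimize the sum, across samples, of the squared difference between the sample mean and the sample median.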

Usage

rv2Transformer(mat, low = 1e-04, upp = 1000, lib.size = NULL)

Arguments

mat

G x n data matrix, where G is the number of genes and n is the number of subjects

lib.size

vector of library sizes, one per column of mat. By default (NULL), the column sums of mat are used

low

lower bound for the model parameter

upp

upper bound for the model parameter

Details

Let x_{gi} denote the expression level of the g-th gene for the i-th subject. The root and VOOM transformation is

y_{gi}=\frac{t_{gi}^{(1/\eta)}}{(1/\eta)}

where

t_{gi}=\frac{\left(x_{gi}+0.5\right)}{X_i+1}\times 10^6

and X_i=\sum_{g=1}^{G} x_{gi} is the column sum for the i-th column of the matrix mat. The optimal value of the parameter \eta is the one that minimizes the sum, across the n subjects, of the squared difference between the sample mean and the sample median

\sum_{i=1}^{n}\left(\bar{y}_i - \tilde{y}_i\right)^2

where \bar{y}_i=\sum_{g=1}^{G}y_{gi}/G is the mean and \tilde{y}_i is the median of y_{1i}, \ldots, y_{Gi}, G is the number of genes, and n is the number of subjects.
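The formulas above can be sketched in base R as follows. This is a minimal illustration on simulated counts, not the package's implementation: the helper names (voom_scale, root_transform, objective) are hypothetical, and the search interval c(0.1, 100) is an assumption chosen to keep the toy computation numerically stable, not the function's default bounds.

```r
# Toy count matrix: G = 200 genes (rows), n = 4 subjects (columns).
set.seed(1)
mat <- matrix(rnbinom(200 * 4, mu = 50, size = 2), nrow = 200, ncol = 4)

# VOOM-style scaling: t_gi = (x_gi + 0.5) / (X_i + 1) * 1e6,
# where X_i is the column sum (library size) of subject i.
voom_scale <- function(mat, lib.size = colSums(mat)) {
  t(t(mat + 0.5) / (lib.size + 1)) * 1e6
}

# Root transformation: y_gi = t_gi^(1/eta) / (1/eta).
root_transform <- function(tmat, eta) {
  tmat^(1 / eta) / (1 / eta)
}

# Objective: sum over subjects of (column mean - column median)^2.
objective <- function(eta, tmat) {
  y <- root_transform(tmat, eta)
  sum((colMeans(y) - apply(y, 2, median))^2)
}

tmat <- voom_scale(mat)

# One-dimensional search for eta (interval is an assumption, see above).
fit <- optimize(objective, interval = c(0.1, 100), tmat = tmat)
eta_hat <- fit$minimum
mat2 <- root_transform(tmat, eta_hat)
```

Here eta_hat plays the role of the estimated parameter and mat2 has the same dimension as mat.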

Value

A list with 3 elements:

res.delta

The object returned by the optimize function

eta

estimated model parameter \eta

mat2

transformed data matrix having the same dimension as mat
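For orientation, the shape of an object returned by base R's optimize can be seen with a toy objective. The function toy_objective below is hypothetical and unrelated to the package's objective. How rv2Transformer derives eta from res.delta is not stated here, so this only shows the generic structure.

```r
# Toy objective with a known minimum at x = 2 (purely illustrative).
toy_objective <- function(x) (x - 2)^2

opt <- optimize(toy_objective, interval = c(1e-04, 1000))

# optimize() returns a list with two elements:
#   minimum   - the location of the minimum
#   objective - the value of the function at that location
print(names(opt))
print(opt$minimum)
print(opt$objective)
```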

Author(s)

Zeyu Zhang, Danyang Yu, Minseok Seo, Craig P. Hersh, Scott T. Weiss, Weiliang Qiu

References

Zhang Z, Yu D, Seo M, Hersh CP, Weiss ST, Qiu W. Novel Data Transformations for RNA-seq Differential Expression Analysis. Scientific Reports (2019) 9:4820 https://rdcu.be/brDe5

Examples

library(Biobase)
library(countTransformers)

data(es)
print(es)

# expression set
ex = exprs(es)
print(dim(ex))
print(ex[1:3,1:2])

# mean minus median over all entries, before transformation
vec = c(ex)
m = mean(vec)
md = median(vec)
diff = m - md
cat("m=", m, ", md=", md, ", diff=", diff, "\n")

res = rv2Transformer(mat = ex)

# estimated model parameter
print(res$eta)

# mean minus median over all entries, after transformation
vec2 = c(res$mat2)
m2 = mean(vec2)
md2 = median(vec2)
diff2 = m2 - md2
cat("m2=", m2, ", md2=", md2, ", diff2=", diff2, "\n")
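The checks above pool all entries of the matrix, whereas the criterion in Details is computed per subject. A per-column version might look like the sketch below. The helper col_mean_minus_median is hypothetical (not part of the package) and is shown on a small toy matrix; it could be applied in the same way to a G x n matrix such as ex or res$mat2.

```r
# Hypothetical helper: mean minus median for each column (subject).
col_mean_minus_median <- function(m) {
  colMeans(m) - apply(m, 2, median)
}

# Toy matrix: column 1 is right-skewed, column 2 is constant.
toy <- cbind(c(1, 2, 9), c(1, 1, 1))

d <- col_mean_minus_median(toy)
print(d)

# Sum of squared per-subject differences, as in the Details criterion.
print(sum(d^2))
```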

[Package countTransformers version 0.0.6 Index]