lv2Transformer {countTransformers}    R Documentation

Log and VOOM Based Count Transformation Minimizing Sum of Sample-Specific Squared Difference

Description

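Applies a log2, VOOM-based transformation to a count matrix. The model parameter is chosen to minimize the sum, over samples, of the squared difference between each sample's mean and median of the transformed values.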

Usage

lv2Transformer(mat, lib.size = NULL, low = 0.001, upp = 1000)

Arguments

mat

G x n data matrix, where G is the number of genes and n is the number of subjects

lib.size

vector of library sizes, one per column of mat. By default (NULL), lib.size is the vector of column sums of mat

low

lower bound for the model parameter \delta

upp

upper bound for the model parameter \delta

Details

Let x_{gi} denote the expression level of the g-th gene for the i-th subject. We perform the log transformation

y_{gi}=\log_2\left(t_{gi} + \frac{1}{\delta}\right),

where

t_{gi}=\frac{\left(x_{gi}+0.5\right)}{X_i+1}\times 10^6

and X_i=\sum_{g=1}^{G} x_{gi} is the column sum for the i-th column of the matrix mat. The optimal value of the parameter \delta is the one that minimizes the sum, over the n subjects, of the squared difference between the sample mean and the sample median:

\sum_{i=1}^{n}\left(\bar{y}_i - \tilde{y}_i\right)^2,

where \bar{y}_i=\sum_{g=1}^{G}y_{gi}/G is the mean and \tilde{y}_i is the median of y_{1i}, \ldots, y_{Gi}, G is the number of genes, and n is the number of subjects.
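The formulas above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's source code: the function name lv2_sketch and the simulated counts are hypothetical, and the sketch assumes that low and upp are used as the search interval for \delta in optimize.

```r
# Sketch of the transformation described above; lv2_sketch is a
# hypothetical name, not a function exported by countTransformers.
lv2_sketch <- function(mat, lib.size = NULL, low = 0.001, upp = 1000) {
  if (is.null(lib.size)) {
    lib.size <- colSums(mat)            # X_i: column sums of mat
  }
  # t_gi = (x_gi + 0.5) / (X_i + 1) * 10^6, scaled column by column
  tmat <- t((t(mat) + 0.5) / (lib.size + 1)) * 1e6

  # Objective: sum over samples of (mean - median)^2 of transformed values
  objective <- function(delta) {
    y <- log2(tmat + 1 / delta)
    sum((colMeans(y) - apply(y, 2, median))^2)
  }

  # One-dimensional search for delta on [low, upp] (assumed usage of bounds)
  res.delta <- optimize(objective, lower = low, upper = upp)
  delta <- res.delta$minimum
  list(res.delta = res.delta, delta = delta, mat2 = log2(tmat + 1 / delta))
}

# Usage on simulated counts (made-up data, for illustration only)
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 2), nrow = 200, ncol = 6)
fit <- lv2_sketch(counts)
print(fit$delta)
print(dim(fit$mat2))
```

The returned list mirrors the three elements documented under Value, so the sketch can be compared against lv2Transformer on the same matrix; the two may differ in details not covered by this page.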

Value

A list with 3 elements:

res.delta

The object returned by the optimize function

delta

estimated value of the model parameter \delta

mat2

transformed data matrix having the same dimension as mat

Author(s)

Zeyu Zhang, Danyang Yu, Minseok Seo, Craig P. Hersh, Scott T. Weiss, Weiliang Qiu

References

Zhang Z, Yu D, Seo M, Hersh CP, Weiss ST, Qiu W. Novel Data Transformations for RNA-seq Differential Expression Analysis. Scientific Reports (2019) 9:4820. https://rdcu.be/brDe5

Examples

library(Biobase)
library(countTransformers)

data(es)
print(es)

# expression set
ex = exprs(es)
print(dim(ex))
print(ex[1:3,1:2])

# mean-median before transformation
vec = c(ex)
m = mean(vec)
md = median(vec)
diff = m - md
cat("m=", m, ", md=", md, ", diff=", diff, "\n")

res = lv2Transformer(mat = ex)

# estimated model parameter
print(res$delta)

# mean-median after transformation
vec2 = c(res$mat2)
m2 = mean(vec2)
md2 = median(vec2)
diff2 = m2 - md2
cat("m2=", m2, ", md2=", md2, ", diff2=", diff2, "\n")

[Package countTransformers version 0.0.6 Index]