normBioCondBySizeFactors {MAnorm2} | R Documentation
Normalize bioCond Objects by Their Size Factors
Description
Given a list of bioCond objects, normBioCondBySizeFactors normalizes the
signal intensities stored in them based on their respective size factors, so
that these bioConds become comparable to each other. Note that the
normalization method implemented in this function is most suited to bioConds
comprised of RNA-seq samples. See normBioCond for a more robust method for
normalizing bioConds consisting of ChIP-seq samples.
Usage
normBioCondBySizeFactors(conds, subset = NULL)
Arguments
conds | A list of bioCond objects to be normalized.
subset | An optional vector specifying the subset of intervals or genes to be used for estimating size factors. Defaults to the intervals/genes occupied by all the bioConds.
Details
Technically, normBioCondBySizeFactors treats each bioCond object as a single
ChIP-seq/RNA-seq sample. It takes the sample.mean variable of each bioCond to
be on the scale of log2 read count, and applies the median ratio strategy to
estimate their respective size factors (see "References"). Finally, each
bioCond object is normalized by subtracting its log2 size factor from each of
its samples.
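The median ratio strategy described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: the matrix log2_means and its values are hypothetical stand-ins for the sample.mean vectors of three bioConds, and the reference profile is assumed to be the per-interval average of the log2 values (the log of the geometric mean).

```r
# Hypothetical log2-scale mean signals: rows are intervals/genes,
# columns are conditions (stand-ins for the sample.mean of each bioCond).
log2_means <- cbind(
  condA = c(4.0, 5.0, 6.0, 7.0),
  condB = c(5.0, 6.0, 7.0, 8.5),
  condC = c(3.5, 4.5, 5.5, 6.5)
)

# Assumed reference profile: per-interval average of the log2 values,
# i.e. the log2 of the geometric mean across conditions.
ref <- rowMeans(log2_means)

# log2 size factor of each condition: the median, across intervals, of the
# log2 ratio between that condition and the reference.
log2_sf <- apply(log2_means - ref, 2, median)
size_factor <- 2^log2_sf

# Normalize by subtracting each condition's log2 size factor from its column.
normalized <- sweep(log2_means, 2, log2_sf, "-")
```

After this step, the median log2 ratio of every column to the reference is zero. Restricting the rows used to compute log2_sf plays the role of the subset argument in this sketch.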
The idea of normBioCondBySizeFactors comes from the principle that the more
similar a set of samples are to each other, the less bias is expected to be
introduced when normalizing them. With this function, instead of performing
an overall normalization on all the samples involved, you may choose to first
normalize the samples within each biological condition, and then perform a
normalization between the resulting bioCond objects (see "Examples" below).
Value
A list of bioCond objects with normalized signal intensities, corresponding
to the argument conds. Note that information about the mean-variance
dependence stored in the original bioCond objects, if any, is removed from
the returned bioConds. You can re-fit a mean-variance curve for them by, for
example, calling fitMeanVarCurve. Note also that the original structure
matrices are retained for each bioCond in the returned list (see setWeight
for a detailed description of structure matrix).
In addition, an attribute named "size.factor" is added to the returned list,
recording the size factor of each bioCond object.
References
Anders, S. and W. Huber, Differential expression analysis for sequence count data. Genome Biol, 2010. 11(10): p. R106.
See Also
normalizeBySizeFactors for normalizing ChIP-seq/RNA-seq samples based on
their size factors; bioCond for creating a bioCond object; normBioCond for
performing an MA normalization on bioCond objects; cmbBioCond for combining a
set of bioCond objects into a single one; MAplot.bioCond for creating an MA
plot on two normalized bioCond objects; fitMeanVarCurve for modeling the
mean-variance dependence across intervals in bioCond objects.
Examples
library(MAnorm2)
data(H3K27Ac, package = "MAnorm2")
attr(H3K27Ac, "metaInfo")
## First perform a normalization within each cell line, and then normalize
## across cell lines.
# Normalize samples separately for each cell line.
norm <- normalizeBySizeFactors(H3K27Ac, 4)
norm <- normalizeBySizeFactors(norm, 5:6,
                               subset = apply(norm[10:11], 1, all))
norm <- normalizeBySizeFactors(norm, 7:8,
                               subset = apply(norm[12:13], 1, all))
# Construct separately a bioCond object for each cell line, and normalize
# the resulting bioConds by their size factors.
conds <- list(GM12890 = bioCond(norm[4], norm[9], name = "GM12890"),
              GM12891 = bioCond(norm[5:6], norm[10:11], name = "GM12891"),
              GM12892 = bioCond(norm[7:8], norm[12:13], name = "GM12892"))
conds <- normBioCondBySizeFactors(conds)
# Inspect the normalization effects.
attr(conds, "size.factor")
MAplot(conds[[1]], conds[[2]], main = "GM12890 vs. GM12891")
abline(h = 0, lwd = 2, lty = 5)