bioCond {MAnorm2} | R Documentation |
Create a bioCond Object to Group ChIP-seq Samples
Description
bioCond creates an object which represents a biological condition, given a set of ChIP-seq samples belonging to the condition. Such objects, once created, can be supplied to fitMeanVarCurve to fit the mean-variance trend, and subsequently to diffTest for calling differential ChIP-seq signals between two conditions.
Usage
bioCond(
norm.signal,
occupancy = NULL,
occupy.num = 1,
name = "NA",
weight = NULL,
strMatrix = NULL,
meta.info = NULL
)
Arguments
norm.signal |
A matrix or data frame of normalized signal intensities, where each row should represent a genomic interval and each column a sample. |
occupancy |
A matrix or data frame of logical values with the same dimensions as norm.signal. |
occupy.num |
For each interval, the minimum number of samples occupying it required for the interval to be considered as occupied by the biological condition (see also "Details"). |
name |
A character scalar specifying the name of the biological condition. Used only for demonstration. |
weight |
A matrix or data frame specifying the relative precisions of signal intensities in norm.signal. |
strMatrix |
An optional list of symmetric matrices specifying directly the structure matrix of each genomic interval. Elements of it are recycled if necessary. This argument, if set, overrides the weight argument. |
meta.info |
Optional extra information (e.g., genomic coordinates of intervals). If set, the supplied argument is stored in the meta.info field of the returned bioCond. |
Details
To call this function, one typically needs to first perform an MA normalization on raw read counts of ChIP-seq samples by using normalize.
The function will assign an indicator to each genomic interval (stored in the occupancy field of the returned object; see also "Value"), marking whether the interval is occupied by this biological condition. The argument occupy.num controls the minimum number of samples that must occupy an interval for it to be determined as occupied by the condition. Notably, the occupancy states of genomic intervals may matter when fitting a mean-variance curve, as one may choose to use only the occupied intervals to fit the curve (see also fitMeanVarCurve).
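The occupancy rule described above can be illustrated in base R. This is only a minimal sketch: it assumes the condition-level indicator is obtained by counting, for each interval, the samples that occupy it and comparing that count with occupy.num. The matrix occ and its values are made up for illustration, and the code is not taken from the package.

```r
# Hypothetical per-sample occupancy matrix:
# 4 genomic intervals (rows) x 3 ChIP-seq samples (columns).
occ <- matrix(c(TRUE,  TRUE,  TRUE,
                TRUE,  FALSE, FALSE,
                FALSE, FALSE, FALSE,
                TRUE,  TRUE,  FALSE),
              ncol = 3, byrow = TRUE)

# Assumed rule: an interval is occupied by the condition when at least
# `occupy.num` samples occupy it.
occupy.num <- 2
cond_occupancy <- rowSums(occ) >= occupy.num
print(cond_occupancy)
```

Raising occupy.num makes the condition-level call stricter, which in turn shrinks the set of intervals available when only occupied intervals are used for curve fitting.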
For signal intensities of each genomic interval, weight specifies their relative precisions corresponding to different ChIP-seq samples in norm.signal. Internally, the weights are used to construct the structure matrices of the created bioCond. Alternatively, one can specify strMatrix directly when calling the function. Note that MAnorm2 uses a structure matrix to model the relative variances of signal intensities of a genomic interval as well as the correlations among them, by considering them to be associated with a covariance matrix proportional to the structure matrix. See setWeight for a detailed description of structure matrix.
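The relationship between weights and structure matrices can be sketched in base R. The sketch assumes that a weight vector maps to a diagonal structure matrix whose entries are the reciprocals of the weights, which is consistent with the "Examples" section, where weight = c(1, 0.5) is described as equivalent to strMatrix = list(diag(c(1, 2))). The helper weights_to_str_matrix is hypothetical and not part of MAnorm2.

```r
# Hypothetical helper (not part of MAnorm2): build a diagonal structure
# matrix from relative precisions, assuming the variance of each
# measurement is inversely proportional to its weight.
weights_to_str_matrix <- function(w) {
  stopifnot(is.numeric(w), all(w > 0))
  diag(1 / w, nrow = length(w))
}

# Halving the weight of the 2nd replicate doubles its relative variance.
S <- weights_to_str_matrix(c(1, 0.5))
print(S)
```

A diagonal matrix encodes only relative variances. Modelling correlations among samples would require non-zero off-diagonal entries, which is the case where supplying strMatrix directly becomes relevant.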
Value
bioCond returns an object of class "bioCond", representing the biological condition to which the supplied ChIP-seq samples belong.
In detail, an object of class "bioCond" is a list containing at least the following fields:
name
Name of the biological condition.
norm.signal
A matrix of normalized signal intensities of ChIP-seq samples belonging to the condition.
occupancy
A logical vector marking the occupancy status of each genomic interval.
meta.info
The meta.info argument (only present when it is supplied).
strMatrix
Structure matrices associated with the genomic intervals.
sample.mean
A vector of observed mean signal intensities of genomic intervals.
sample.var
A vector recording the observed variance of signal intensities of each genomic interval.
Note that the sample.mean and sample.var fields are calculated by applying the GLS (generalized least squares) estimation to the signal intensities of each genomic interval, considering them as having a common mean and a covariance matrix proportional to the corresponding structure matrix. Specifically, the sample.var field times the corresponding structure matrices gives an unbiased estimate of the covariance matrix associated with each interval (see setWeight for details).
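The GLS estimation mentioned above can be sketched for a single interval in base R. This is an illustrative sketch under stated assumptions, not the package's internal code: the signal intensities x are taken to share a common mean and to have covariance sigma^2 * S for a known structure matrix S, and the helper gls_mean_var is hypothetical.

```r
# Hypothetical helper (not part of MAnorm2): GLS estimates for one interval.
# x: signal intensities across samples; S: structure matrix, with Cov(x)
# assumed to equal sigma^2 * S for an unknown scalar sigma^2.
gls_mean_var <- function(x, S) {
  n <- length(x)
  stopifnot(n >= 2, all(dim(S) == n))
  S_inv <- solve(S)
  ones <- rep(1, n)
  # GLS estimate of the common mean.
  m <- drop(ones %*% S_inv %*% x) / drop(ones %*% S_inv %*% ones)
  # Unbiased estimate of sigma^2: weighted residual sum of squares
  # divided by n - 1.
  r <- x - m
  v <- drop(r %*% S_inv %*% r) / (n - 1)
  list(sample.mean = m, sample.var = v)
}

# With an identity structure matrix, the estimates reduce to the ordinary
# sample mean and sample variance.
x <- c(2.0, 3.0, 7.0)
est <- gls_mean_var(x, diag(3))
print(est)
```

Under these assumptions, multiplying the estimated scalar by S gives the estimated covariance matrix of the interval, which mirrors the role described for the sample.var field.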
Besides, a fit.info field will be added to bioCond objects once you have fitted a mean-variance curve for them (see fitMeanVarCurve for details).
There are also other fields used internally for fitting the mean-variance trend and calling differential intervals between conditions. These fields should never be modified directly.
Warning
Among all the fields contained in a bioCond object, only name and meta.info may be modified freely. The strMatrix field must be modified through setWeight.
References
Tu, S., et al., MAnorm2 for quantitatively comparing groups of ChIP-seq samples. Genome Res, 2021. 31(1): p. 131-145.
See Also
normalize for performing an MA normalization on ChIP-seq samples; normalizeBySizeFactors for normalizing ChIP-seq samples based on their size factors; setWeight for modifying the structure matrices of a bioCond object.
normBioCond for performing an MA normalization on bioCond objects; normBioCondBySizeFactors for normalizing bioCond objects based on their size factors; cmbBioCond for combining a set of bioCond objects into a single one; MAplot.bioCond for creating an MA plot on two bioCond objects; summary.bioCond for summarizing a bioCond.
fitMeanVarCurve for modeling the mean-variance dependence across intervals in bioCond objects; diffTest for comparing two bioCond objects; aovBioCond for comparing multiple bioCond objects; varTestBioCond for calling hypervariable and invariant intervals across ChIP-seq samples contained in a bioCond.
Examples
data(H3K27Ac, package = "MAnorm2")
attr(H3K27Ac, "metaInfo")

## Construct a bioCond object for the GM12891 cell line.

# Apply MA normalization to the ChIP-seq samples of GM12891.
norm <- normalize(H3K27Ac, 5:6, 10:11)

# Call the constructor and optionally attach some meta information to the
# resulting bioCond, such as the coordinates of genomic intervals.
GM12891 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                   meta.info = norm[1:3])

# Alternatively, you may assign different weights to the replicate samples
# for estimating the mean signal intensities of genomic intervals in this
# cell line. Here the weight of the 2nd replicate is reduced to half the
# weight of the 1st one.
GM12891_2 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                     weight = c(1, 0.5))

# Equivalently, you can achieve the same effect by setting the strMatrix
# parameter.
GM12891_3 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                     strMatrix = list(diag(c(1, 2))))