| bioCond {MAnorm2} | R Documentation |
Create a bioCond Object to Group ChIP-seq Samples
Description
bioCond creates an object which represents a biological condition,
given a set of ChIP-seq samples belonging to the condition. Such objects,
once created, can be supplied to fitMeanVarCurve to fit the
mean-variance trend, and subsequently to
diffTest for calling differential
ChIP-seq signals between two conditions.
Usage
bioCond(
norm.signal,
occupancy = NULL,
occupy.num = 1,
name = "NA",
weight = NULL,
strMatrix = NULL,
meta.info = NULL
)
Arguments
norm.signal |
A matrix or data frame of normalized signal intensities, where each row should represent a genomic interval and each column a sample. |
occupancy |
A matrix or data frame of logical values with the same
dimension as of |
occupy.num |
For each interval, the minimum number of samples occupying it required for the interval to be considered as occupied by the biological condition (see also "Details"). |
name |
A character scalar specifying the name of the biological condition. Used only for demonstration. |
weight |
A matrix or data frame specifying the relative precisions of
signal intensities in |
strMatrix |
An optional list of symmetric matrices specifying directly
the structure matrix of each genomic interval. Elements of it are
recycled if necessary.
This argument, if set, overrides the |
meta.info |
Optional extra information (e.g., genomic coordinates
of intervals). If set, the supplied argument is stored in the
|
Details
To call this function, one typically needs to first perform an MA
normalization on raw read counts of ChIP-seq samples by using
normalize.
The function assigns an indicator to each genomic interval (stored in
the occupancy field of the returned object; see also "Value"),
marking whether the interval is occupied by this biological condition.
The argument occupy.num controls the minimum number of samples that
occupy an interval required for the interval to be determined as occupied by
the condition. Notably, the occupancy states of genomic intervals may matter
when fitting a mean-variance curve, as one may choose to use only the
occupied intervals to fit the curve (see also
fitMeanVarCurve).
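The occupancy rule described above can be illustrated in base R. This is a minimal sketch of the rule as stated in this section, not MAnorm2's internal code; the toy matrix occ and the helper interval_occupied are hypothetical names introduced only for illustration.

```r
# Toy per-sample occupancy indicators: 4 intervals (rows) x 3 samples (columns).
occ <- matrix(
  c(TRUE,  TRUE,  TRUE,
    TRUE,  FALSE, FALSE,
    FALSE, FALSE, FALSE,
    TRUE,  TRUE,  FALSE),
  ncol = 3, byrow = TRUE
)

# Hypothetical helper (not part of MAnorm2): an interval counts as occupied
# by the condition when at least `occupy.num` samples occupy it.
interval_occupied <- function(occupancy, occupy.num = 1) {
  rowSums(as.matrix(occupancy)) >= occupy.num
}

occupied_1 <- interval_occupied(occ, occupy.num = 1)
occupied_2 <- interval_occupied(occ, occupy.num = 2)
```

A larger occupy.num gives a stricter definition of occupancy, which shrinks the set of intervals available when only occupied intervals are used for fitting the mean-variance curve.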
For the signal intensities of each genomic interval, weight specifies
their relative precisions across the different ChIP-seq samples in
norm.signal. Internally, the weights are used to construct the
structure matrices of the created bioCond. Alternatively, one
can specify strMatrix directly when calling the function. Note that
MAnorm2 uses a structure matrix to model the relative variances of the
signal intensities of a genomic interval as well as the correlations among
them, by treating them as having a covariance matrix
proportional to the structure matrix. See setWeight for
a detailed description of structure matrices.
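The relationship between weight and strMatrix can be sketched in base R for a single interval. The sketch assumes that the structure matrix implied by a weight vector is diagonal, with entries equal to the reciprocals of the weights. This is consistent with the example at the end of this page, where weight = c(1, 0.5) and strMatrix = list(diag(c(1, 2))) are described as equivalent. The helper weights_to_str_matrix is hypothetical and is not a MAnorm2 function.

```r
# Hypothetical helper (not part of MAnorm2): build a diagonal structure
# matrix from relative precisions, assuming variance is proportional to
# 1 / weight and the samples are uncorrelated.
weights_to_str_matrix <- function(w) {
  stopifnot(is.numeric(w), all(w > 0))
  diag(1 / w, nrow = length(w))
}

# Halving the weight of the 2nd replicate doubles its relative variance.
S <- weights_to_str_matrix(c(1, 0.5))

# A structure matrix may also encode correlation between samples, which a
# weight vector alone cannot express (the value 0.3 is purely illustrative).
S_corr <- matrix(c(1, 0.3,
                   0.3, 1), nrow = 2, byrow = TRUE)
```

Only the relative sizes of the entries matter here, since the covariance matrix is modeled as proportional to the structure matrix.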
Value
bioCond returns an object of class
"bioCond", representing the biological condition to which the
supplied ChIP-seq samples belong.
In detail, an object of class "bioCond" is a list containing at
least the following fields:
name |
Name of the biological condition. |
norm.signal |
A matrix of normalized signal intensities of ChIP-seq samples belonging to the condition. |
occupancy |
A logical vector marking the occupancy status of each genomic interval. |
meta.info |
The meta.info argument (only present when it is supplied). |
strMatrix |
Structure matrices associated with the genomic intervals. |
sample.mean |
A vector of observed mean signal intensities of genomic intervals. |
sample.var |
A vector recording the observed variance of signal intensities of each genomic interval. |
Note that the sample.mean and sample.var fields
are calculated by applying generalized least squares (GLS) estimation
to the signal intensities of
each genomic interval, treating them as having
a common mean and a covariance matrix proportional to the corresponding
structure matrix. Specifically, for each interval, its sample.var value
times its structure matrix gives an unbiased estimate of the
covariance matrix associated with that interval (see
setWeight for details).
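The GLS calculation described above can be sketched in base R for one interval. This is a textbook form of the GLS estimates under the stated model (a common mean and a covariance matrix proportional to the structure matrix S); MAnorm2's actual implementation may differ in detail, and gls_mean_var is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper (not part of MAnorm2): GLS estimates for one interval.
# x: signal intensities of the interval across samples.
# S: structure matrix; the covariance of x is assumed to be sigma^2 * S.
gls_mean_var <- function(x, S) {
  n <- length(x)
  stopifnot(n >= 2, nrow(S) == n, ncol(S) == n)
  ones <- rep(1, n)
  S_inv <- solve(S)
  # GLS estimate of the common mean: (1' S^-1 x) / (1' S^-1 1).
  m <- drop(t(ones) %*% S_inv %*% x) / drop(t(ones) %*% S_inv %*% ones)
  # Unbiased estimate of sigma^2: weighted residual sum of squares / (n - 1).
  r <- x - m
  v <- drop(t(r) %*% S_inv %*% r) / (n - 1)
  list(sample.mean = m, sample.var = v)
}

# Two replicates, the 2nd with twice the relative variance of the 1st.
est <- gls_mean_var(c(3, 6), diag(c(1, 2)))

# With an identity structure matrix, the estimates reduce to the ordinary
# sample mean and sample variance.
x_id <- c(1.2, 0.7, 2.1)
est_id <- gls_mean_var(x_id, diag(3))
```

Under this sketch, est$sample.var * diag(c(1, 2)) plays the role of the estimated covariance matrix for the interval.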
In addition, a fit.info field is added to bioCond objects
once a mean-variance curve has been fitted for them (see
fitMeanVarCurve for details).
There are also other fields used internally for fitting the mean-variance trend and calling differential intervals between conditions. These fields should never be modified directly.
Warning
Among all the fields contained in a bioCond object,
only name and meta.info may be modified freely.
The strMatrix field must be modified through
setWeight.
References
Tu, S., et al., MAnorm2 for quantitatively comparing groups of ChIP-seq samples. Genome Res, 2021. 31(1): p. 131-145.
See Also
normalize for performing an MA normalization on
ChIP-seq samples; normalizeBySizeFactors for normalizing
ChIP-seq samples based on their size factors; setWeight
for modifying the structure matrices of a bioCond object.
normBioCond for performing an MA normalization on
bioCond objects; normBioCondBySizeFactors for
normalizing bioCond objects based on their size factors;
cmbBioCond for combining a set of bioCond
objects into a single one; MAplot.bioCond for creating
an MA plot on two bioCond objects; summary.bioCond
for summarizing a bioCond.
fitMeanVarCurve for modeling
the mean-variance dependence across intervals in bioCond objects;
diffTest for comparing two
bioCond objects; aovBioCond for comparing multiple
bioCond objects; varTestBioCond for calling
hypervariable and invariant intervals across ChIP-seq samples contained
in a bioCond.
Examples
data(H3K27Ac, package = "MAnorm2")
attr(H3K27Ac, "metaInfo")
## Construct a bioCond object for the GM12891 cell line.
# Apply MA normalization to the ChIP-seq samples of GM12891.
norm <- normalize(H3K27Ac, 5:6, 10:11)
# Call the constructor and optionally attach some meta information to the
# resulting bioCond, such as the coordinates of genomic intervals.
GM12891 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                   meta.info = norm[1:3])
# Alternatively, you may assign different weights to the replicate samples
# for estimating the mean signal intensities of genomic intervals in this
# cell line. Here the weight of the 2nd replicate is reduced to half the
# weight of the 1st one.
GM12891_2 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                     weight = c(1, 0.5))
# Equivalently, you can achieve the same effect by setting the strMatrix
# parameter.
GM12891_3 <- bioCond(norm[5:6], norm[10:11], name = "GM12891",
                     strMatrix = list(diag(c(1, 2))))