robloxbioc {RobLoxBioC} | R Documentation |
Generic Function for Preprocessing Biological Data
Description
Generic function for preprocessing biological data using optimally robust (rmx) estimators; confer Rieder (1994), Kohl (2005), Rieder et al (2008).
Usage
robloxbioc(x, ...)
## S4 method for signature 'matrix'
robloxbioc(x, eps = NULL, eps.lower = 0, eps.upper = 0.05, steps = 3L,
fsCor = TRUE, mad0 = 1e-4)
## S4 method for signature 'AffyBatch'
robloxbioc(x, bg.correct = TRUE, pmcorrect = TRUE, normalize = FALSE,
add.constant = 32, verbose = TRUE, eps = NULL,
eps.lower = 0, eps.upper = 0.05, steps = 3L, fsCor = TRUE,
mad0 = 1e-4, contrast.tau = 0.03, scale.tau = 10,
delta = 2^(-20), sc = 500)
## S4 method for signature 'beadLevelData'
robloxbioc(x, channelList = list(greenChannel), probeIDs = NULL,
useSampleFac = FALSE, sampleFac = NULL, weightNames = "wts",
removeUnMappedProbes = TRUE, eps = NULL, eps.lower = 0,
eps.upper = 0.05, steps = 3L, fsCor = TRUE, mad0 = 1e-4)
Arguments
x |
biological data. |
... |
additional parameters. |
eps |
positive real (0 < |
eps.lower |
positive real (0 <= |
eps.upper |
positive real ( |
steps |
positive integer. k-step is used to compute the optimally robust estimator. |
fsCor |
logical: perform finite-sample correction. See function |
mad0 |
scale estimate used if computed MAD is equal to zero |
bg.correct |
if |
pmcorrect |
method used for PM correction; |
normalize |
logical: if |
add.constant |
constant added to the MAS 5.0 expression values before the normalization step. Improves the variance of the measure one no longer devides by numbers close to 0 when computing fold-changes. |
verbose |
logical: if |
contrast.tau |
a number denoting the contrast tau parameter; confer the MAS 5.0 PM correction algorithm. |
scale.tau |
a number denoting the scale tau parameter; confer the MAS 5.0 PM correction algorithm. |
delta |
a number denoting the delta parameter; confer the MAS 5.0 PM correction algorithm. |
sc |
value at which all arrays will be scaled to. |
channelList |
List of objects of class illuminaChannel that defines the
summarisation to be performed where in our setup only the slots |
probeIDs |
Vector of ArrayAddressIDs to be included in the summarized object. The default is to summarize all probes. |
useSampleFac |
if |
sampleFac |
optional character vector giving which a sample identifer for each section |
weightNames |
name of column in the |
removeUnMappedProbes |
if TRUE and annotation information is stored in the
|
Details
The optimally-robust resp. the radius-minimax (rmx) estimator for normal location and scale is used to preprocess biological data. The computation uses a k-step construction with median and MAD as starting estimators; cf. Rieder (1994) and Kohl (2005).
If the amount of gross errors (contamination) is known, it can be
specified by eps
. The radius of the corresponding infinitesimal
contamination neighborhood (infinitesimal version of Tukey's gross error model)
is obtained by multiplying eps
by the square root of the sample size.
If the amount of gross errors (contamination) is unknown, which is typically
the case, try to find a rough estimate for the amount of gross errors, such that
it lies between eps.lower
and eps.upper
.
If eps
is NULL
, the radius-minimax (rmx) estimator in sense of
Rieder et al. (2001, 2008), respectively Section 2.2 of Kohl (2005) is used.
The algorithm used for Affymetrix data is similar to MAS 5.0 (cf. Affymetrix (2002)). The main difference is the substitution of the Tukey one-step estimator by our rmx k-step (k >= 1) estimator in the PM/MM correction step. The optional scale normalization is performed as given in Affymetrix (2002).
In case of Illumina data, the rmx estimator is used to summarize the bead types.
The implementation for the most part copies summarize
from beadarray.
For sample size <= 2, median and MAD are used for estimation.
If eps = 0
, mean and sd are computed.
Value
Return value depends on the class of x
.
In case of "matrix"
a matrix with columns "mean" and "sd" is returned.
In case of "AffyBatch"
an object of class "ExpressionSet"
is returned.
In case of "BeadLevelData"
an object of class "ExpressionSetIllumina"
is returned.
Author(s)
Matthias Kohl Matthias.Kohl@stamats.de,
update for beadarray versions >= 2.0.0 with support by Mark Dunnings and Andy Lynch
References
Affymetrix, Inc. (2002). Statistical Algorithms Description Document. Affymetrix, Santa Clara.
Kohl, M. (2005) Numerical Contributions to the Asymptotic Theory of Robustness. Bayreuth: Dissertation.
Kohl M. and Deigner H.P. (2010). Preprocessing of gene expression data by optimally robust estimators. BMC Bioinformatics, 11:583.
M. Kohl, P. Ruckdeschel, and H. Rieder (2010). Infinitesimally Robust Estimation in General Smoothly Parametrized Models. Statistical Methods and Application, 19(3):333-354.
Rieder, H. (1994) Robust Asymptotic Statistics. New York: Springer.
Rieder, H., Kohl, M. and Ruckdeschel, P. (2008) The Costs of not Knowing the Radius. Statistical Methods and Applications 17(1) 13-40. Extended version: http://r-kurs.de/RRlong.pdf
See Also
roblox
, rowRoblox
,
AffyBatch-class
,
generateExprVal.method.mas
,
ExpressionSet-class
,
summarize
Examples
set.seed(123) # to have reproducible results for package checking
## similar to rowRoblox of package RobLox
ind <- rbinom(200, size=1, prob=0.05)
X <- matrix(rnorm(200, mean=ind*3, sd=(1-ind) + ind*9), nrow = 2)
robloxbioc(X)
robloxbioc(X, steps = 5)
robloxbioc(X, eps = 0.05)
robloxbioc(X, eps = 0.05, steps = 5)
## \donttest to reduce check time
## the function is designed for large scale problems
X <- matrix(rnorm(50000*20, mean = 1), nrow = 50000)
system.time(robloxbioc(X))
## using Affymetrix data
## confer example to generateExprVal.method.mas
## A more worked out example can be found in the scripts folder
## of the package.
data(SpikeIn)
probes <- pm(SpikeIn)
mas <- generateExprVal.method.mas(probes)
rl <- 2^robloxbioc(log2(t(probes)))
concentrations <- as.numeric(colnames(SpikeIn))
plot(concentrations, mas$exprs, log="xy", ylim=c(50,10000), type="b",
ylab = "expression measures")
points(concentrations, rl[,1], pch = 20, col="orange", type="b")
legend("topleft", c("MAS", "roblox"), pch = c(1, 20))
## Affymetrix dilution data
library(affydata)
data(Dilution)
eset <- robloxbioc(Dilution)
## Affymetrix scale normalization
eset1 <- robloxbioc(Dilution, normalize = TRUE)
## Illumina bead level data
library(beadarrayExampleData)
data(exampleBLData)
res <- robloxbioc(exampleBLData, eps.upper = 0.5)
res