testRobustToNAimputation {wrProteo} | R Documentation |
Pair-wise testing robust to NA-imputation
Description
This function replaces NA
values based on group neighbours (based on grouping of columns in argument gr
), following overall assumption of close to Gaussian distribution.
Furthermore, it is assumed that NA
-values originate from experimental settings where measurements at or below detection limit are recoreded as NA
.
In such cases (eg in proteomics) it is current practice to replace NA
-values by very low (random) values in order to be able to perform t-tests.
However, random normal values used for replacing may in rare cases deviate from the average (the 'assumed' value) and in particular, if multiple NA
replacements are above the average,
may look like induced biological data and be misinterpreted as so.
The statistical testing uses eBayes
from Bioconductor package limma for robust testing in the context of small numbers of replicates.
By repeating multiple times the process of replacing NA
-values and subsequent testing the results can be sumarized afterwards by median over all repeated runs to remmove the stochastic effect of individual NA-imputation.
Thus, one may gain stability towards random-character of NA
imputations by repeating imputation & test 'nLoop' times and summarize p-values by median (results stabilized at 50-100 rounds).
It is necessary to define all groups of replicates in gr
to obtain all possible pair-wise testing (multiple columns in $BH, $lfdr etc).
The modified testing-procedure of Bioconductor package ROTS may optionaly be included, if desired.
This function returns a limma-like S3 list-object further enriched by additional fields/elements.
Usage
testRobustToNAimputation(
dat,
gr = NULL,
annot = NULL,
retnNA = TRUE,
avSd = c(0.15, 0.5),
avSdH = NULL,
plotHist = FALSE,
xLab = NULL,
tit = NULL,
imputMethod = "mode2",
seedNo = NULL,
multCorMeth = NULL,
nLoop = 100,
lfdrInclude = NULL,
ROTSn = NULL,
silent = FALSE,
debug = FALSE,
callFrom = NULL
)
Arguments
dat |
(matrix or data.frame) main data (may contain |
gr |
(character or factor) replicate association; if |
annot |
(matrix or data.frame) annotation (lines must match lines of data !), if |
retnNA |
(logical) retain and report number of |
avSd |
(numerical,length=2) population characteristics (mean and sd) for >1 |
avSdH |
depreciated, please use |
plotHist |
(logical) additional histogram of original, imputed and resultant distribution (made using |
xLab |
(character) custom x-axis label |
tit |
(character) custom title |
imputMethod |
(character) choose the imputation method (may be 'mode2'(default), 'mode1', 'datQuant', 'modeAdopt', 'informed' or 'none', for details see |
seedNo |
(integer) seed-value for normal random values |
multCorMeth |
(character) define which method(s) for correction of multipl testing should be run (for choice : 'BH','lfdr','BY','tValTab', choosing several is possible) |
nLoop |
(integer) number of runs of independent |
lfdrInclude |
(logical) depreciated, please used |
ROTSn |
(integer) depreciated, please used |
silent |
(logical) suppress messages |
debug |
(logical) additional messages for debugging |
callFrom |
(character) This function allows easier tracking of messages produced |
Details
The argument multCorMeth
allows to choose which multiple correction algorimths will be used and included to the final results.
Possible options are 'lfdr','BH','BY','tValTab', ROTSn='100' (name to element necessary) or 'noLimma' (to add initial p.values and BH to limma-results). By default 'lfdr' (local false discovery rate from package 'fdrtools') and 'BH' (Benjamini-Hochberg FDR) are chosen.
The option 'BY' referrs to Benjamini-Yakuteli FDR, 'tValTab' allows exporting all individual t-values from the repeated NA-substitution and subsequent testing.
This function is compatible with automatic extraction of experimental setup based on sdrf or other quantitation-specific sample annotation.
In this case, the results of automated importing and mining of sample annotation should be stored as $sampleSetup$groups
or $sampleSetup$lev
For details 'on choice of NA-impuation procedures with arguments 'imputMethod' and 'avSd' please see matrixNAneighbourImpute
.
Value
This function returns a limma-type S3 object of class 'MArrayLM' (which can be accessed lika a list); multiple results of testing or multiple testing correction types may get included ('p.value','FDR','BY','lfdr' or 'ROTS.BH')
See Also
NA-imputation via matrixNAneighbourImpute
, modereated t-test without NA-imputation moderTest2grp
, calculating lfdr pVal2lfdr
, eBayes
in Bioconductor package limma, t.test
,ROTS
of Bioconductor package ROTS
Examples
set.seed(2015); rand1 <- round(runif(600) +rnorm(600,1,2),3)
dat1 <- matrix(rand1,ncol=6) + matrix(rep((1:100)/20,6),ncol=6)
dat1[13:16,1:3] <- dat1[13:16,1:3] +2 # augment lines 13:16
dat1[19:20,1:3] <- dat1[19:20,1:3] +3 # augment lines 19:20
dat1[15:18,4:6] <- dat1[15:18,4:6] +1.4 # augment lines 15:18
dat1[dat1 <1] <- NA # mimick some NAs for low abundance
## normalize data
boxplot(dat1, main="data before normalization")
dat1 <- wrMisc::normalizeThis(as.matrix(dat1), meth="median")
## designate replicate relationships in samples ...
grp1 <- gl(2, 3, labels=LETTERS[1:2])
## moderated t-test with repeated inputations (may take >10 sec, >60 sec if ROTSn >0 !)
PLtestR1 <- testRobustToNAimputation(dat=dat1, gr=grp1, retnNA=TRUE, nLoop=70)
names(PLtestR1)