baaddon {bapred} | R Documentation
Addon batch effect adjustment
Description
Performs addon batch effect adjustment for a method of choice. Takes the output of ba, or of one of the functions performing a specific batch effect adjustment method (e.g. fabatch or svaba), together with new batch data, and applies the respective addon batch effect adjustment to the new batch data.
Usage
baaddon(params, x, batch)
Arguments
params |
object of class |
x |
matrix. The covariate matrix of the new data. Observations in rows, variables in columns. |
batch |
factor. Batch variable of the new data. Currently has to have levels: '1', '2', '3' and so on. |
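Because batch must carry the levels '1', '2', '3' and so on, batch labels that arrive in another form need to be recoded before calling baaddon. The following is a minimal base-R sketch of one way to do this; the vector rawbatch and its site labels are hypothetical and not part of the package.

```r
# Hypothetical batch labels as they might arrive with new data
rawbatch <- c("siteB", "siteA", "siteB", "siteC", "siteA")

# Map each distinct label to a consecutive integer (in order of first
# appearance), then build a factor whose levels are "1", "2", "3", ...
codes <- match(rawbatch, unique(rawbatch))
batch <- factor(codes, levels = seq_len(max(codes)))

# Inspect the resulting levels and the label-to-level mapping
levels(batch)
table(rawbatch, batch)
```

Numbering by first appearance is an arbitrary choice made here for illustration; any one-to-one mapping onto consecutive integers starting at 1 yields levels of the required form.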
Value
The adjusted covariate matrix of the test data.
Note
The following methods are NOT recommended in cross-study prediction settings: FAbatch (fabatch), frozen SVA (svaba), standardization (standardize) as well as no addon batch effect adjustment (noba).
Given a not too small test set, the following methods are recommended (Hornung et al., 2017): ComBat (combatba), mean-centering (meancenter), Ratio-A (ratioa), Ratio-G (ratiog).
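To illustrate why the size of the test set can matter, consider mean-centering: a batch-wise adjustment of this kind has to estimate batch-specific means from the new batch itself, and such estimates are noisy when the batch is small. The following base-R sketch centers each variable within each test batch. It is a conceptual illustration under that assumption, not the implementation used by bapred; the objects xtest, batchtest and xadj are hypothetical.

```r
set.seed(1)

# Hypothetical test data: 6 observations, 3 variables, two new batches
xtest <- matrix(rnorm(18), nrow = 6, ncol = 3)
batchtest <- factor(c(1, 1, 1, 2, 2, 2), levels = c(1, 2))

# Shift the second batch to mimic a batch effect
xtest[batchtest == 2, ] <- xtest[batchtest == 2, ] + 5

# Center every variable within each test batch
xadj <- xtest
for (b in levels(batchtest)) {
  rows <- batchtest == b
  xadj[rows, ] <- scale(xtest[rows, , drop = FALSE],
                        center = TRUE, scale = FALSE)
}

# Column means within each batch are now zero up to rounding error
colMeans(xadj[batchtest == 1, ])
colMeans(xadj[batchtest == 2, ])
```

With only three observations per batch, as in this toy example, the estimated batch means are unreliable, which is the situation the recommendation above cautions against.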
Author(s)
Roman Hornung
References
Hornung, R., Boulesteix, A.-L., Causeur, D. (2016). Combining location-and-scale batch effect adjustment with data cleaning by latent factor adjustment. BMC Bioinformatics 17:27, <doi: 10.1186/s12859-015-0870-z>.
Hornung, R., Causeur, D., Bernau, C., Boulesteix, A.-L. (2017). Improving cross-study prediction through addon batch effect adjustment and addon normalization. Bioinformatics 33(3):397–404, <doi: 10.1093/bioinformatics/btw650>.
Johnson, W. E., Rabinovic, A., Li, C. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8:118-127, <doi: 10.1093/biostatistics/kxj037>.
Leek, J. T., Storey, J. D. (2007). Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis. PLoS Genetics 3:1724-1735, <doi: 10.1371/journal.pgen.0030161>.
Luo, J., Schumacher, M., Scherer, A., Sanoudou, D., Megherbi, D., Davison, T., Shi, T., Tong, W., Shi, L., Hong, H., Zhao, C., Elloumi, F., Shi, W., Thomas, R., Lin, S., Tillinghast, G., Liu, G., Zhou, Y., Herman, D., Li, Y., Deng, Y., Fang, H., Bushel, P., Woods, M., Zhang, J. (2010). A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data. The Pharmacogenomics Journal 10:278-291, <doi: 10.1038/tpj.2010.57>.
Parker, H. S., Bravo, H. C., Leek, J. T. (2014). Removing batch effects for prediction problems with frozen surrogate variable analysis. PeerJ 2:e561, <doi: 10.7717/peerj.561>.
Examples
library(bapred)
data(autism)  # provides the objects X, batch and y used below

# Random subset of 150 variables:
set.seed(1234)
Xsub <- X[, sample(1:ncol(X), size=150)]

# In cases of batches with more than 20 observations
# select 20 observations at random:
subinds <- unlist(sapply(seq_along(levels(batch)), function(x) {
  indbatch <- which(batch == x)
  if (length(indbatch) > 20)
    indbatch <- sort(sample(indbatch, size=20))
  indbatch
}))

Xsub <- Xsub[subinds, ]
batchsub <- batch[subinds]
ysub <- y[subinds]

# Batches 1 and 2 serve as training data:
trainind <- which(batchsub %in% c(1, 2))
Xsubtrain <- Xsub[trainind, ]
ysubtrain <- ysub[trainind]
batchsubtrain <- factor(as.numeric(batchsub[trainind]), levels=c(1, 2))

# Batches 3 and 4 serve as test data. They are relabelled as 1 and 2,
# because the batch variable has to have levels '1', '2', and so on:
testind <- which(batchsub %in% c(3, 4))
Xsubtest <- Xsub[testind, ]
ysubtest <- ysub[testind]
batchsubtest <- as.numeric(batchsub[testind])
batchsubtest[batchsubtest == 3] <- 1
batchsubtest[batchsubtest == 4] <- 2
batchsubtest <- factor(batchsubtest, levels=c(1, 2))

somemethods <- c("fabatch", "combat", "meancenter", "none")

# For each method: adjust the training data, then use the resulting
# parameters for addon adjustment of the test data.
adjustedtestdata <- list()

for (i in seq_along(somemethods)) {
  cat(paste("Adjusting training data using method = \"", somemethods[i],
            "\"", sep=""), "\n")
  paramstemp <- ba(x=Xsubtrain, y=ysubtrain, batch=batchsubtrain,
                   method=somemethods[i])

  cat(paste("Addon adjusting test data using method = \"",
            somemethods[i], "\"", sep=""), "\n")
  adjustedtestdata[[i]] <- baaddon(params=paramstemp, x=Xsubtest,
                                   batch=batchsubtest)
}