svaba {bapred}    R Documentation

Batch effect adjustment using SVA

Description

Performs batch effect adjustment using Surrogate Variable Analysis (SVA) and additionally returns information necessary for addon batch effect adjustment with frozen SVA.

Usage

svaba(x, y, batch, nbf = NULL, algorithm = "fast")

Arguments

x

matrix. The covariate matrix. Observations in rows, variables in columns.

y

factor. Binary target variable. Must have exactly two factor levels, each corresponding to one of the two classes of the target variable.

batch

factor. Batch variable. Each factor level (or 'category') corresponds to one of the batches. For example, if there are four batches, this variable would have four factor levels and observations with the same factor level would belong to the same batch.

nbf

integer. Number of latent factors to estimate.

algorithm

character. If algorithm = "fast", the "approximate fSVA algorithm" will be used in frozen SVA. If algorithm = "exact", the "exact fSVA algorithm" will be used. See Parker et al. (2014).
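As a rough illustration of the input shapes described above, the following sketch builds simulated inputs in base R and checks them against the argument descriptions. The sizes, the level names "control" and "case", and the object name batch_by_class are assumptions made for this illustration only; they are not taken from the package.

```r
# Simulated inputs only; all sizes and names here are illustrative assumptions.
set.seed(1)
n <- 40   # number of observations
p <- 25   # number of variables

# x: numeric matrix with observations in rows and variables in columns
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# y: factor with exactly two levels (binary target variable)
y <- factor(rep(c("control", "case"), length.out = n))

# batch: factor with one level per batch (here four batches of ten observations)
batch <- factor(rep(1:4, each = n / 4))

# Checks mirroring the argument descriptions
stopifnot(
  is.matrix(x),
  is.factor(y), nlevels(y) == 2,
  is.factor(batch),
  nrow(x) == length(y),
  nrow(x) == length(batch)
)

# Number of observations per batch and class
batch_by_class <- table(batch, y)
print(batch_by_class)
```

Inputs prepared this way could then be passed on as svaba(x = x, y = y, batch = batch), provided bapred and its dependencies are installed.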

Details

This is essentially a wrapper around the function sva() from the Bioconductor package sva.

Value

svaba returns an object of class "svatrain", which is a list containing the following components:

xadj

matrix of adjusted (training) data

xtrain

the unadjusted covariate matrix. Used in frozen SVA.

ytrain

binary target variable. Used in frozen SVA.

svobj

output of the function sva(). Used in frozen SVA.

algorithm

algorithm to use in frozen SVA

nbatches

number of batches

batch

batch variable
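The component list above can be turned into a simple structural check. The sketch below uses base R only; looks_like_svatrain() is a hypothetical helper that is not part of bapred, and the object mock is assembled by hand purely for illustration. A real object would come from a call to svaba().

```r
# Names of the components listed in the Value section
svatrain_components <- c("xadj", "xtrain", "ytrain", "svobj",
                         "algorithm", "nbatches", "batch")

# Hypothetical helper (not part of bapred): TRUE if obj has the class and
# the component names documented for "svatrain" objects.
looks_like_svatrain <- function(obj) {
  inherits(obj, "svatrain") &&
    is.list(obj) &&
    all(svatrain_components %in% names(obj))
}

# Hand-built mock for illustration only; it contains no real SVA results.
x_mock <- matrix(rnorm(12), nrow = 6)
batch_mock <- factor(rep(1:2, each = 3))
mock <- structure(
  list(
    xadj      = x_mock,                        # would hold the adjusted data
    xtrain    = x_mock,                        # unadjusted covariate matrix
    ytrain    = factor(rep(c("a", "b"), 3)),   # binary target variable
    svobj     = list(),                        # placeholder for the sva() output
    algorithm = "fast",
    nbatches  = nlevels(batch_mock),
    batch     = batch_mock
  ),
  class = "svatrain"
)

looks_like_svatrain(mock)              # the mock has every documented component
looks_like_svatrain(list(xadj = 1))    # a plain list lacks the class and components
```

A check of this kind may be useful before handing a stored object to an addon adjustment step, since it fails early when a component is missing.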

Author(s)

Roman Hornung

References

Leek, J. T., Storey, J. D. (2007). Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis. PLoS Genetics 3:1724-1735, <doi:10.1371/journal.pgen.0030161>.

Parker, H. S., Bravo, H. C., Leek, J. T. (2014). Removing batch effects for prediction problems with frozen surrogate variable analysis. PeerJ 2:e561, <doi:10.7717/peerj.561>.

Examples

# Load the package and the example data (provides X, batch and y):
library(bapred)
data(autism)

# Random subset of 150 variables:
set.seed(1234)
Xsub <- X[, sample(seq_len(ncol(X)), size = 150)]

# For batches with more than 20 observations,
# select 20 observations at random:
subinds <- unlist(lapply(levels(batch), function(b) {
  indbatch <- which(batch == b)
  if (length(indbatch) > 20)
    indbatch <- sort(sample(indbatch, size = 20))
  indbatch
}))
Xsub <- Xsub[subinds, ]
batchsub <- batch[subinds]
ysub <- y[subinds]

# Adjust for batch effects; the result also stores what frozen SVA needs:
params <- svaba(x = Xsub, y = ysub, batch = batchsub)

[Package bapred version 1.1 Index]