fabatch {bapred} | R Documentation |
Batch effect adjustment using FAbatch
Description
Performs batch effect adjustment using the FAbatch method described in Hornung et al. (2016) and additionally returns the information necessary for addon batch effect adjustment with FAbatch.
Usage
fabatch(x, y, batch, nbf = NULL, minerr = 1e-06,
probcrossbatch = TRUE, maxiter = 100, maxnbf = 12)
Arguments
x |
matrix. The covariate matrix. Observations in rows, variables in columns. |
y |
factor. Binary target variable. Must have exactly two factor levels, each corresponding to one of the two classes of the target variable. |
batch |
factor. Batch variable. Each factor level (or 'category') corresponds to one of the batches. For example, if there are four batches, this variable would have four factor levels and observations with the same factor level would belong to the same batch. |
nbf |
integer. Number of factors to estimate in all batches. If not given, the number of factors is estimated automatically for each batch. It is recommended to leave this unspecified. |
minerr |
numeric. Convergence threshold: the iteration stops once the mean quadratic deviation between the estimated residual variances from two consecutive iterations falls below this value. |
probcrossbatch |
logical. Default is TRUE. |
maxiter |
integer. Maximal number of iterations in the estimation of the latent factors by Maximum Likelihood. |
maxnbf |
integer. Maximal number of factors if nbf is NULL, that is, if the number of factors is estimated automatically. |
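The stopping rule governed by minerr and maxiter can be illustrated with a small, self-contained sketch. It is not the package's internal code: update_step is a hypothetical stand-in for one estimation step, and the starting values are made up. Only the convergence check itself follows the argument descriptions above.

```r
# Hypothetical stand-in for one estimation step (an assumption, not bapred code).
# This toy update pulls every value halfway towards 1.
update_step <- function(v) 0.5 * v + 0.5

minerr  <- 1e-06  # same default as in the Usage section
maxiter <- 100    # same default as in the Usage section

resvar_old <- c(4, 2, 3)  # made-up starting "residual variances"
converged  <- FALSE

for (iter in seq_len(maxiter)) {
  resvar_new <- update_step(resvar_old)

  # Mean quadratic deviation between two consecutive iterations
  err <- mean((resvar_new - resvar_old)^2)
  resvar_old <- resvar_new

  # Stop as soon as the deviation falls below minerr
  if (err < minerr) {
    converged <- TRUE
    break
  }
}

converged  # TRUE if the threshold was reached within maxiter iterations
```

A smaller minerr demands closer agreement between consecutive iterations and therefore tends to need more iterations, while maxiter caps the number of iterations regardless of whether the threshold has been reached.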
Value
fabatch returns an object of class fabatch.
An object of class "fabatch" is a list containing the following components:
xadj |
matrix of adjusted (training) data |
m1 |
means of the standardized variables in class '1' |
m2 |
means of the standardized variables in class '2' |
b0 |
intercept from the L2-penalized logistic regression performed for estimation of the class probabilities |
b |
variable coefficients from the L2-penalized logistic regression performed for estimation of the class probabilities |
pooledsds |
vector containing the pooled standard deviations of the variables |
meanoverall |
vector containing the variable means |
minerr |
maximal mean quadratic deviations between the estimated residual variances from two consecutive iterations |
nbfinput |
user-specified number of latent factors |
badvariables |
indices of those variables which are constant in at least one batch |
nbatches |
number of batches |
batch |
batch variable |
nbfvec |
vector containing the numbers of factors in the individual batches |
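To make the meaning of the badvariables component concrete, the following sketch finds the variables that are constant in at least one batch. It uses made-up toy data and base R only. It illustrates the definition given above and is not necessarily how the package computes the component.

```r
# Toy data (made up): 6 observations, 3 variables, 2 batches.
x <- cbind(
  v1 = c(1, 2, 3, 4, 5, 6),
  v2 = c(7, 7, 7, 1, 2, 3),  # constant within batch "a"
  v3 = c(2, 4, 6, 8, 1, 3)
)
batch <- factor(c("a", "a", "a", "b", "b", "b"))

# For each batch, flag the variables that take a single value within it.
# The result is a logical matrix: variables in rows, batches in columns.
constant_in_batch <- sapply(levels(batch), function(lev) {
  apply(x[batch == lev, , drop = FALSE], 2,
        function(col) length(unique(col)) == 1)
})

# A variable is "bad" if it is constant in at least one batch.
badvariables <- which(rowSums(constant_in_batch) > 0)
badvariables  # index 2, i.e. variable v2
```

Such variables have zero variance within a batch, so they cannot be scaled by a batch-specific standard deviation. This is a plausible reason for reporting them separately, although the source does not state it.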
Author(s)
Roman Hornung
References
Hornung, R., Boulesteix, A.-L., Causeur, D. (2016). Combining location-and-scale batch effect adjustment with data cleaning by latent factor adjustment. BMC Bioinformatics 17:27, <doi: 10.1186/s12859-015-0870-z>.
Examples
library(bapred)

# Load the autism data (provides X, batch and y):
data(autism)

# Random subset of 150 variables:
set.seed(1234)
Xsub <- X[, sample(1:ncol(X), size = 150)]

# For batches with more than 20 observations,
# select 20 observations at random:
subinds <- unlist(lapply(levels(batch), function(lev) {
  indbatch <- which(batch == lev)
  if (length(indbatch) > 20)
    indbatch <- sort(sample(indbatch, size = 20))
  indbatch
}))
Xsub <- Xsub[subinds, ]
batchsub <- batch[subinds]
ysub <- y[subinds]

# Adjust for batch effects and keep the result for later use:
fabatchobj <- fabatch(x = Xsub, y = ysub, batch = batchsub)