LPS.coeff {LPS} | R Documentation |
Linear Predictor Score coefficient computation
Description
As Linear Predictor Score coefficients are genuine t statistics, this function provides a faster implementation for large datasets than using t.test.
Usage
LPS.coeff(data, response, formula = ~1, type = c("t", "limma"),
p.value = TRUE, log = FALSE, weighted = FALSE, ...)
Arguments
data |
Continuous data used to retrieve classes, as a |
response |
Already known classes for the samples provided in |
formula |
A |
type |
Single character value, "t" to compute genuine t statistics (unequal variances and unpaired samples) or "limma" to use the lmFit() and eBayes() t statistics from limma, a microarray-oriented Bioconductor package. |
p.value |
Single logical value, whether to compute (two-sided) p-values or not. |
log |
Single logical value, whether to log-transform t or not (the sign is preserved). The original description of the LPS does not include log-transformation, but it may be useful to avoid over-weighting discriminant genes in large series. Values between -1 and 1 are transformed to 0 to avoid sign shifting, as they generally come with non-significant p-values. |
weighted |
Single logical value, whether to divide t (or log-transformed t) by gene mean or not. We recommend to normalize data only by samples and use |
... |
Further arguments are passed to |
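The log and weighted options described above can be illustrated with a minimal base R sketch. It is not the package's code: the natural logarithm, the made-up t values and the made-up gene means are assumptions for illustration only.

```r
# Made-up t statistics and gene means (hypothetical values)
tstat <- c(geneA = -8.2, geneB = -0.4, geneC = 0.9, geneD = 3.5)
gene_mean <- c(geneA = 7.1, geneB = 5.0, geneC = 9.3, geneD = 6.4)

# Sign-preserving log transform; natural log is an assumption.
# Clamping |t| to at least 1 maps every value in [-1, 1] to 0,
# so the logarithm can never flip the sign of a coefficient.
log_t <- sign(tstat) * log(pmax(abs(tstat), 1))

# Weighting as described for 'weighted': divide by the gene mean
weighted_t <- log_t / gene_mean

print(cbind(t = tstat, log_t = log_t, weighted = weighted_t))
```

Without the clamp, log(0.9) is negative, so a weakly positive t would turn into a negative coefficient; this is the sign shift the documentation refers to.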
Value
Always returns a row-named numeric matrix, with a "t" column holding the computed statistics. If p.value is TRUE, a second "p.value" column is added.
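To make the returned shape concrete, here is a minimal base R sketch that computes unequal-variance (Welch) t statistics and two-sided p-values for all features in one pass, then checks them against one t.test() call per feature. The helper name welch_sketch, the simulated data and the direction of the difference (first factor level minus second) are assumptions, not the package's actual implementation.

```r
# Simulated data: 20 samples (rows) x 5 features (columns), two classes
set.seed(1)
expr <- matrix(rnorm(100), nrow = 20,
               dimnames = list(NULL, paste0("gene", 1:5)))
group <- factor(rep(c("A", "B"), each = 10))

# Column-wise sample variance without looping over features
col_var <- function(m) colSums(sweep(m, 2, colMeans(m))^2) / (nrow(m) - 1)

# Hypothetical helper: Welch t and two-sided p-value for every column
welch_sketch <- function(x, g) {
  a <- x[g == levels(g)[1], , drop = FALSE]
  b <- x[g == levels(g)[2], , drop = FALSE]
  va <- col_var(a) / nrow(a)  # squared standard error, first class
  vb <- col_var(b) / nrow(b)  # squared standard error, second class
  tstat <- (colMeans(a) - colMeans(b)) / sqrt(va + vb)
  # Welch-Satterthwaite degrees of freedom
  df <- (va + vb)^2 / (va^2 / (nrow(a) - 1) + vb^2 / (nrow(b) - 1))
  cbind(t = tstat, p.value = 2 * pt(-abs(tstat), df))
}

k <- welch_sketch(expr, group)

# Reference: one t.test() call per feature (Welch test by default)
ref <- t(apply(expr, 2, function(v) {
  tt <- t.test(v[group == "A"], v[group == "B"])
  c(t = unname(tt$statistic), p.value = tt$p.value)
}))

print(all.equal(k, ref))
```

The result has one row per feature, named after the column of the input, and the two columns "t" and "p.value" described above.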
Note
Using a numeric matrix as data and a factor as response is the fastest way to compute coefficients, if time consumption matters (as in cross-validation schemes). formula was added only for consistency with other R modeling functions, and optionally to select the subset of features to compute coefficients for.
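The formula interface relies on the usual R modeling conventions. As a hedged illustration, the base R sketch below shows how stats::model.frame() treats subset and na.action on a small made-up data frame; that LPS.coeff forwards these arguments in exactly this way is an assumption based on the examples, not something verified here.

```r
# Made-up data frame: a class factor and two features with missing values
df <- data.frame(
  group = factor(c("A", "A", "B", "B", "A", "B")),
  g1 = c(1.2, NA, 0.7, 2.1, 1.9, 0.3),
  g2 = c(5.0, 4.8, 5.5, NA, 5.1, 4.9)
)

# Logical sample selection: drop the last sample
keep <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)

# subset is applied first, then na.action handles remaining NA values
mf_omit <- model.frame(group ~ ., data = df, subset = keep, na.action = na.omit)
mf_pass <- model.frame(group ~ ., data = df, subset = keep, na.action = na.pass)

print(nrow(mf_omit))  # rows kept after removing samples with any NA
print(nrow(mf_pass))  # rows kept when NA values are passed through
```

Building a model frame for thousands of features carries overhead of its own, which is consistent with the recommendation above to pass a numeric matrix and a factor when speed matters.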
Author(s)
Sylvain Mareschal
References
http://www.bioconductor.org/packages/release/bioc/html/limma.html
See Also
Examples
# Data with features in columns
data(rosenwald)
group <- rosenwald.cli$group
expr <- t(rosenwald.expr)
# All features, all samples
k <- LPS.coeff(data=expr, response=group)
k <- LPS.coeff(formula=group~1, data=as.data.frame(expr))
### LPS.coeff(formula=group~., data=as.data.frame(expr), na.action=na.pass)
### The last is correct but (really) slow on large datasets
# Feature subset, all samples
k <- LPS.coeff(data=expr[, c("27481","17013") ], response=group)
k <- LPS.coeff(formula=group~`27481`+`17013`, data=as.data.frame(expr))
### Notice backticks in formula for syntactically invalid names
# All features, sample subset
training <- rosenwald.cli$set == "Training"
### training <- sample.int(nrow(expr), 10)
### training <- which(rosenwald.cli$set == "Training")
### training <- rownames(subset(rosenwald.cli, set == "Training"))
k <- LPS.coeff(data=expr, response=group, subset=training)
k <- LPS.coeff(formula=group~1, data=as.data.frame(expr), subset=training)
# NA handling by model.frame()
k <- LPS.coeff(formula=group~1, data=as.data.frame(expr), na.action=na.omit)