TFA.estimate {plsgenomics} | R Documentation |
Prediction of Transcription Factor Activities using PLS
Description
The function TFA.estimate estimates the transcription factor activities from gene
expression data and ChIP data using the PLS multivariate regression approach described
in Boulesteix and Strimmer (2005).
Usage
TFA.estimate(CONNECdata, GEdata, ncomp=NULL, nruncv=0, alpha=2/3, unit.weights=TRUE)
Arguments
CONNECdata |
a (n x p) matrix containing the ChIP data for the n genes and the
p predictors. The n genes must be the same as the n genes of GEdata. |
GEdata |
a (n x m) matrix containing the gene expression levels of the n
considered genes for m samples. Each row of |
ncomp |
if |
nruncv |
the number of cross-validation iterations to be performed for the choice of
the number of latent components. If |
alpha |
the proportion of genes to be included in the training set for the cross-validation procedure. |
unit.weights |
If |
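To illustrate the alpha argument listed above, here is a minimal sketch of how a proportion alpha of the genes could be drawn as a training set in one cross-validation iteration. It assumes a simple random split without replacement; the resampling actually performed inside the package may differ, and n, train_idx and test_idx are hypothetical names used only for illustration.

```r
set.seed(1)                       # make the illustrative split reproducible

n     <- 12                       # hypothetical number of genes (rows)
alpha <- 2/3                      # default value shown in the Usage line

# Number of genes assigned to the training set (assumed rounding rule)
n_train <- round(alpha * n)

# Randomly pick the training genes; the remaining genes form the test set
train_idx <- sample(seq_len(n), n_train)
test_idx  <- setdiff(seq_len(n), train_idx)

length(train_idx)
length(test_idx)
```

With nruncv iterations, a split of this kind would be repeated nruncv times, once per iteration.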
Details
The gene expression data as well as the ChIP data are assumed to have been
properly normalized. However, they do not have to be centered or scaled, since
centering and scaling are performed by the function TFA.estimate as a
preliminary step.
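As an illustration of what centering and scaling mean, the following sketch standardizes the columns of a small made-up matrix with the base R function scale(). The matrix GE_toy is hypothetical, and the choice to standardize columns is an assumption made for the example; the text above does not say along which dimension TFA.estimate standardizes, nor that it uses scale() internally.

```r
# Hypothetical toy data: 4 genes (rows) x 2 samples (columns)
GE_toy <- matrix(c(1, 2, 3, 4,
                   10, 20, 30, 40), nrow = 4)

# Center each column to mean 0 and scale it to standard deviation 1
GE_std <- scale(GE_toy, center = TRUE, scale = TRUE)

colMeans(GE_std)        # column means after centering
apply(GE_std, 2, sd)    # column standard deviations after scaling
```

Since the function does this step itself, applying it beforehand should not be necessary.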
The matrix CONNECdata containing the ChIP data for the n genes and p transcription
factors may be replaced by any 'connectivity' matrix whose entries give the strength
of the interactions between the genes and the transcription factors. For instance, a connectivity
matrix obtained by aggregating qualitative information from various genomic databases
may be used as an argument in place of ChIP data.
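A connectivity matrix of this kind could be assembled as in the sketch below, which turns a list of regulator-target pairs into a binary (n x p) matrix. The gene and transcription factor names, the data frame pairs and the 0/1 coding are all hypothetical; any other encoding of interaction strength would be built the same way.

```r
# Hypothetical regulator-target pairs, e.g. pooled from several databases
pairs <- data.frame(
  gene = c("geneA", "geneA", "geneB", "geneC"),
  tf   = c("TF1",   "TF2",   "TF1",   "TF2"),
  stringsAsFactors = FALSE
)

genes <- sort(unique(pairs$gene))   # n genes, one per row
tfs   <- sort(unique(pairs$tf))     # p transcription factors, one per column

# Start from an all-zero (n x p) matrix with named rows and columns
CONNEC_toy <- matrix(0, nrow = length(genes), ncol = length(tfs),
                     dimnames = list(genes, tfs))

# Mark each documented interaction with 1 (index by row/column names)
CONNEC_toy[cbind(pairs$gene, pairs$tf)] <- 1

CONNEC_toy
```

The rows of such a matrix must then be put in the same gene order as the rows of GEdata before it is passed as CONNECdata.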
Value
A list with the following components:
TFA |
a (p x m) matrix containing the estimated transcription factor activities for the p transcription factors and the m samples. |
metafactor |
a (m x |
ncomp |
the number of latent components used in the PLS regression. |
Author(s)
Anne-Laure Boulesteix (https://www.ibe.med.uni-muenchen.de/mitarbeiter/professoren/boulesteix/index.html) and Korbinian Strimmer (https://strimmerlab.github.io/korbinian.html).
References
A. L. Boulesteix and K. Strimmer (2005). Predicting Transcription Factor Activities from Combined Analysis of Microarray and ChIP Data: A Partial Least Squares Approach. Theor. Biol. Med. Model. 2:23.
A. L. Boulesteix and K. Strimmer (2007). Partial least squares: a versatile tool for the analysis of high-dimensional genomic data. Briefings in Bioinformatics 8:32-44.
S. de Jong (1993). SIMPLS: an alternative approach to partial least squares regression, Chemometrics Intell. Lab. Syst. 18, 251–263.
See Also
pls.regression, pls.regression.cv.
Examples
# load plsgenomics library
library(plsgenomics)
# load Ecoli data
data(Ecoli)
# estimate TFAs based on 3 latent components
TFA.estimate(Ecoli$CONNECdata,Ecoli$GEdata,ncomp=3,nruncv=0)
# estimate TFAs and determine the best number of latent components simultaneously
TFA.estimate(Ecoli$CONNECdata,Ecoli$GEdata,ncomp=1:5,nruncv=20)