extractPSSMAcc {protr} | R Documentation |
Profile-based protein representation derived by PSSM (Position-Specific Scoring Matrix) and auto cross covariance
Description
This function calculates a feature vector from a PSSM (as produced by running PSI-BLAST) using the auto cross covariance transformation.
Usage
extractPSSMAcc(pssmmat, lag)
Arguments
pssmmat |
The PSSM computed by extractPSSM. |
lag |
The lag parameter. Must be less than the number of amino acids in the sequence (i.e., the number of columns in the PSSM). |
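The constraint on lag can be checked before calling the function. The following is a minimal sketch using a placeholder matrix in place of a real PSSM; the helper name safe_lag and the toy dimensions are assumptions made for illustration and are not part of protr.

```r
# Hypothetical helper (not part of protr): cap a requested lag so that it
# stays strictly below the number of columns (sequence positions).
safe_lag <- function(pssmmat, requested) {
  max_lag <- ncol(pssmmat) - 1
  if (max_lag < 1) stop("the matrix needs at least two columns")
  min(requested, max_lag)
}

# Placeholder standing in for a PSSM: 20 rows (amino acid scores)
# by 8 columns (sequence positions).
pssm_toy <- matrix(0, nrow = 20, ncol = 8)

lag_ok  <- safe_lag(pssm_toy, requested = 3)   # 3 is already valid
lag_cap <- safe_lag(pssm_toy, requested = 50)  # capped to ncol - 1
```

Capping silently is only one option; stopping with an error when the requested lag is too large may be preferable in a pipeline.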
Value
A named numeric vector of length lag * 20^2.
The element names are derived from the amino acid abbreviations
(or pairs of abbreviations for crossed amino acids) and the lag index.
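To illustrate where the length lag * 20^2 comes from, here is a rough base R sketch of an auto cross covariance transformation applied to a random 20-row matrix. It assumes the uncentered form that averages products over the n - lag overlapping positions; the normalization and element naming actually used by extractPSSMAcc may differ. The function acc_sketch, the name pattern, and the toy data are hypothetical.

```r
# Hypothetical sketch (not the protr implementation): auto cross covariance
# on a p x n matrix (rows = scoring columns, columns = sequence positions).
acc_sketch <- function(mat, lag) {
  p <- nrow(mat)
  n <- ncol(mat)
  stopifnot(lag >= 1, lag < n)
  rn <- rownames(mat)

  out <- numeric(0)
  for (l in seq_len(lag)) {
    left  <- mat[, 1:(n - l), drop = FALSE]  # positions i
    right <- mat[, (1 + l):n, drop = FALSE]  # positions i + l
    # cc[j, k] = mean over i of mat[j, i] * mat[k, i + l]
    cc <- left %*% t(right) / (n - l)
    vals <- as.vector(cc)                    # column-major: j varies fastest
    names(vals) <- paste0(rep(rn, times = p), ".", rep(rn, each = p), ".lag", l)
    out <- c(out, vals)
  }
  out
}

# Toy data standing in for a PSSM: 20 amino acids x 30 positions.
set.seed(1)
aa <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")
toy <- matrix(rnorm(20 * 30), nrow = 20, dimnames = list(aa, NULL))

res <- acc_sketch(toy, lag = 3)
length(res)  # 3 * 20^2 = 1200
```

For each lag there are 20 auto terms (same amino acid) and 20 * 19 cross terms (ordered pairs of different amino acids), which together give 20^2 values per lag.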
Author(s)
Nan Xiao <https://nanx.me>
References
Wold, S., Jonsson, J., Sjöström, M., Sandberg, M., & Rännar, S. (1993). DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures. Analytica Chimica Acta, 277(2), 239–253.
See Also
extractPSSM, extractPSSMFeature
Examples
if (Sys.which("makeblastdb") == "" || Sys.which("psiblast") == "") {
  cat("Cannot find makeblastdb or psiblast. Please install NCBI Blast+\n")
} else {
  # Query sequence shipped with the package
  x <- readFASTA(system.file(
    "protseq/P00750.fasta",
    package = "protr"
  ))[[1]]
  # Copy the bundled FASTA database to a temporary location
  dbpath <- tempfile("tempdb", fileext = ".fasta")
  invisible(file.copy(from = system.file(
    "protseq/Plasminogen.fasta",
    package = "protr"
  ), to = dbpath))
  # Compute the PSSM, then its auto cross covariance features
  pssmmat <- extractPSSM(seq = x, database.path = dbpath)
  pssmacc <- extractPSSMAcc(pssmmat, lag = 3)
  tail(pssmacc)
}