PSEAAC {ftrCOOL} | R Documentation |
Pseudo-Amino Acid Composition (Parallel) (PSEAAC)
Description
This function calculates the pseudo amino acid composition (parallel) for each sequence.
Usage
PSEAAC(
seqs,
aaIDX = c("ARGP820101", "HOPT810101", "Mass"),
lambda = 30,
w = 0.05,
l = 1,
threshold = 1,
label = c()
)
Arguments
seqs |
is the path to a FASTA file of amino acid sequences, in which each record begins with a header line starting with '>'. Alternatively, seqs can be a character vector in which each element is a peptide/protein sequence. |
aaIDX |
is a vector of IDs or indexes of the user-selected physicochemical properties in the aaIndex2 database. By default, the vector holds the IDs of a hydrophobicity index, a hydrophilicity index, and residue mass from the amino acid index file. |
lambda |
is a tuning parameter. Its value sets the maximum number of spaces (the gap) between the two amino acids of a pair. Gaps from 1 to lambda are considered. |
w |
(weight) is a tuning parameter. It ranges from 0 to 1. The default value is 0.05. |
l |
This parameter holds the value of l in the l-mer composition. The l-mers form the first 20^l elements of the PSEAAC descriptor. |
threshold |
is a number in the interval (0, 1]. It removes aaIndex properties whose correlation is greater than the threshold. The default value is 1. |
label |
is an optional parameter. It is a vector whose length is equal to the number of sequences. It gives the class of each entry (i.e., sequence). |
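The roles of lambda and w can be illustrated with a small base-R sketch of the parallel-correlation pseudo amino acid composition for a single sequence with l = 1. This is not the ftrCOOL source: the helper name pseaac_sketch and the property values are hypothetical (random numbers, not aaIndex entries), and it is an assumption that the package standardizes properties and normalizes the vector in the same way.

```r
# Toy property table: 20 amino acids x 2 made-up properties.
# These values are random placeholders, NOT taken from the aaIndex database.
aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
set.seed(1)
prop <- matrix(rnorm(20 * 2), nrow = 20, dimnames = list(aa, c("p1", "p2")))
prop <- scale(prop)  # zero mean, unit SD per property across the 20 residues

# Hypothetical helper: composition part (20 values) plus lambda
# sequence-order correlation factors, weighted by w.
pseaac_sketch <- function(seq, prop, lambda = 3, w = 0.05) {
  res <- strsplit(seq, "")[[1]]
  n <- length(res)
  stopifnot(lambda < n)
  # Amino acid frequencies (the l = 1 composition part)
  freq <- as.numeric(table(factor(res, levels = rownames(prop)))) / n
  # theta_k: mean squared property difference between residues k positions apart
  theta <- sapply(seq_len(lambda), function(k) {
    d <- prop[res[1:(n - k)], , drop = FALSE] -
      prop[res[(1 + k):n], , drop = FALSE]
    mean(rowMeans(d^2))
  })
  denom <- sum(freq) + w * sum(theta)
  c(freq / denom, w * theta / denom)
}

v <- pseaac_sketch("ACDEFGHIKLACDKLMWY", prop, lambda = 3, w = 0.05)
length(v)  # 20 composition terms + lambda correlation terms
```

In this sketch a larger w shifts weight from the composition terms to the correlation terms, and lambda must be smaller than the sequence length for every gap to have at least one residue pair.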
Value
A feature matrix in which the number of columns is 20^l + lambda and the number of rows is equal to the number of sequences.
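As a quick check of the column count stated above, the following sketch evaluates 20^l + lambda for a few settings. The helper name expected_cols is hypothetical and is not part of ftrCOOL.

```r
# Hypothetical helper: number of columns implied by the Value section
expected_cols <- function(l = 1, lambda = 30) 20^l + lambda

expected_cols()                  # defaults: 20 + 30 = 50
expected_cols(l = 2)             # 400 + 30 = 430
expected_cols(l = 1, lambda = 5) # 20 + 5 = 25
```

Because the composition part grows as 20^l, raising l from 1 to 2 already increases the width from 50 to 430 columns at the default lambda.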
Examples
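# Path to the example FASTA file shipped with ftrCOOL
filePrs <- system.file("extdata/proteins.fasta", package = "ftrCOOL")
# Pseudo amino acid composition with 2-mer composition terms
mat <- PSEAAC(seqs = filePrs, l = 2)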