APAAC {ftrCOOL} | R Documentation |
Amphiphilic Pseudo-Amino Acid Composition (Series) (APAAC)
Description
This function calculates the amphiphilic pseudo amino acid composition (Series) for each sequence.
Usage
APAAC(
  seqs,
  aaIDX = c("ARGP820101", "HOPT810101"),
  lambda = 30,
  w = 0.05,
  l = 1,
  threshold = 1,
  label = c()
)
Arguments
seqs |
is a FASTA file containing amino acid sequences, in which each sequence record starts with a '>' character. Alternatively, seqs can be a character vector in which each element is a peptide/protein sequence. |
aaIDX |
is a vector of IDs or indices of the user-selected physicochemical properties in the aaIndex2 database. The default values are the IDs of the hydrophobicity and hydrophilicity indices in the amino acid index file. |
lambda |
is a tuning parameter. Its value is the maximum gap (number of positions) between the two amino acids of a pair. The gap ranges from 1 to lambda. |
w |
(weight) is a tuning parameter that ranges from 0 to 1. The default value is 0.05. |
l |
is the value of l in the l-mer composition. The l-mers form the first 20^l elements of the APAAC descriptor. |
threshold |
is a number in the interval (0, 1]. Indices in aaIDX that have a correlation higher than the threshold will be deleted. The default value is 1. |
label |
is an optional parameter. It is a vector whose length is equal to the number of sequences. It gives the class of each entry (i.e., sequence). |
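The effect of threshold can be illustrated with a minimal base R sketch. It assumes a greedy filter on the absolute Pearson correlation between properties; filter_correlated() is a hypothetical helper, not a function of ftrCOOL, and the package's actual filtering rule may differ. The property values are toy numbers for illustration only, not real amino acid index entries.

```r
# Hypothetical helper (not part of ftrCOOL): greedily drop properties that are
# too strongly correlated with a property that has already been kept.
filter_correlated <- function(prop_mat, threshold = 1) {
  # prop_mat: numeric matrix, one row per property, one column per amino acid
  cors <- abs(cor(t(prop_mat)))  # property-by-property absolute Pearson correlation
  keep <- logical(nrow(prop_mat))
  for (i in seq_len(nrow(prop_mat))) {
    kept_idx <- which(keep)
    # keep property i only if no already-kept property exceeds the threshold
    keep[i] <- !any(cors[i, kept_idx] > threshold)
  }
  prop_mat[keep, , drop = FALSE]
}

# Toy values for illustration only (NOT real amino acid index entries)
aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
toy <- rbind(
  P1 = seq_len(20),
  P2 = 2 * seq_len(20) + 1,  # a linear function of P1, so perfectly correlated
  P3 = rep(c(1, -1), 10)     # only weakly correlated with P1
)
colnames(toy) <- aa

rownames(filter_correlated(toy, threshold = 0.9))  # P2 is dropped
rownames(filter_correlated(toy, threshold = 1))    # nothing is dropped
```

Under this assumption, the default threshold = 1 never removes a property, because a correlation cannot exceed 1.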
Details
This function computes the pseudo amino acid composition for each selected physicochemical property. Users can choose any properties through aaIDX and are not confined to hydrophobicity and hydrophilicity.
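To make the computation concrete, the following base R sketch follows the commonly cited formulation of amphiphilic pseudo amino acid composition for l = 1: one correlation factor per property and per gap from 1 to lambda, combined with the amino acid frequencies using the weight w. It is an illustration under stated assumptions, not the ftrCOOL implementation; the package may order, normalize, or standardize values differently. The names standardize(), correlation_factors(), and apaac_sketch() are hypothetical, the property values are toy numbers rather than real amino acid index entries, and the peptide is an arbitrary string.

```r
aa <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]

# Scale a property to zero mean and unit (population) variance over the 20 amino acids
standardize <- function(x) (x - mean(x)) / sqrt(mean((x - mean(x))^2))

# Hypothetical helper: correlation factors of one property for gaps 1..lambda.
# h is a numeric vector of standardized values named by amino acid letter.
correlation_factors <- function(sequence, h, lambda) {
  residues <- strsplit(sequence, "")[[1]]
  v <- h[residues]
  n <- length(v)
  stopifnot(lambda < n)  # every gap needs at least one residue pair
  vapply(seq_len(lambda), function(k) {
    mean(v[1:(n - k)] * v[(1 + k):n])  # average product over pairs k positions apart
  }, numeric(1))
}

# Hypothetical sketch of the descriptor for l = 1
apaac_sketch <- function(sequence, props, lambda = 3, w = 0.05) {
  residues <- strsplit(sequence, "")[[1]]
  freq <- as.numeric(table(factor(residues, levels = aa))) / length(residues)
  tau <- unlist(
    lapply(props, function(h) correlation_factors(sequence, h, lambda)),
    use.names = FALSE
  )
  denom <- sum(freq) + w * sum(tau)
  out <- c(freq, w * tau) / denom
  names(out) <- c(aa, paste0("tau_", seq_along(tau)))
  out
}

# Toy property values for illustration only (NOT real amino acid index entries)
props <- list(
  P1 = setNames(standardize(seq_len(20)), aa),
  P2 = setNames(standardize((seq_len(20) - 10.5)^2), aa)
)

desc <- apaac_sketch("MKTAYIAKQRQISFVKSHFSRQ", props, lambda = 3, w = 0.05)
length(desc)  # 20 composition terms plus 2 properties * 3 gaps
```

In this formulation the descriptor sums to 1 by construction, and w controls how much the sequence-order terms contribute relative to the plain composition.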
Value
A feature matrix in which the number of columns is 20^l + (number of selected aaIndex properties * lambda) and the number of rows equals the number of sequences.
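The column count can be checked with simple arithmetic. The helper apaac_ncol() below is hypothetical (not part of ftrCOOL) and assumes that no property is removed by the threshold filter.

```r
# Hypothetical helper: expected number of columns of the feature matrix,
# assuming all n_props selected properties survive the threshold filter.
apaac_ncol <- function(l, n_props, lambda) {
  20^l + n_props * lambda
}

# Two properties (the default aaIDX), l = 2, lambda = 3
apaac_ncol(l = 2, n_props = 2, lambda = 3)  # 400 + 6 = 406
```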
Examples
filePrs <- system.file("extdata/proteins.fasta", package = "ftrCOOL")
mat <- APAAC(seqs = filePrs, l = 2, lambda = 3, threshold = 1)