ExpectedValueKmerAA {ftrCOOL} | R Documentation |
Expected Value for K-mer Amino Acid (ExpectedValueKmerAA)
Description
This function computes the expected value of each k-mer by dividing the frequency of the k-mer by the product of the frequencies of its constituent amino acids in the sequence.
Usage
ExpectedValueKmerAA(seqs, k = 2, normalized = TRUE, label = c())
Arguments
seqs |
is a FASTA file with amino acid sequences. Each sequence starts with a '>' character. Alternatively, seqs can be a string vector in which each element is a peptide/protein sequence. |
k |
is an integer value giving the size of the k-mer in the k-mer composition. The default value is 2. |
normalized |
is a logical parameter. When it is FALSE, the return value is left unnormalized. Otherwise, the return value is normalized using the length of the sequence. |
label |
is an optional parameter. It is a vector whose length equals the number of sequences. It gives the class of each entry (i.e., sequence). |
Details
ExpectedValue(k-mer) = freq(k-mer) / ( freq(aminoacid1) * freq(aminoacid2) * ... * freq(aminoacidk) )
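The formula above can be illustrated for a single sequence and a single k-mer with a minimal base-R sketch. This is not the ftrCOOL implementation: the helper name expected_value_kmer is hypothetical, freq() is assumed to be a raw count (overlapping occurrences for the k-mer, residue occurrences for each amino acid), and returning 0 when the denominator is zero is an assumption made only for this example. The sketch does not apply the length normalization controlled by the normalized argument.

```r
# Illustrative sketch only; not the package's code.
# Assumption: freq() is a raw count within the sequence.
expected_value_kmer <- function(sequence, kmer) {
  k <- nchar(kmer)
  n <- nchar(sequence)
  if (n < k) return(NA_real_)  # sequence too short to contain the k-mer

  # Count overlapping occurrences of the k-mer
  starts <- seq_len(n - k + 1)
  windows <- substring(sequence, starts, starts + k - 1)
  kmer_count <- sum(windows == kmer)

  # Count each amino acid of the k-mer in the sequence
  residues <- strsplit(sequence, "")[[1]]
  kmer_residues <- strsplit(kmer, "")[[1]]
  residue_counts <- vapply(
    kmer_residues,
    function(a) sum(residues == a),
    integer(1)
  )

  denom <- prod(residue_counts)
  if (denom == 0) return(0)  # assumed convention: absent residue gives 0
  kmer_count / denom
}

# "AC" occurs twice in "ACACGA"; A occurs 3 times and C occurs 2 times,
# so the value is 2 / (3 * 2).
expected_value_kmer("ACACGA", "AC")
```

Repeated amino acids in a k-mer contribute one factor each, so for "AA" the denominator is the count of A squared.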
Value
This function returns a feature matrix. The number of rows equals the number of sequences and the number of columns is 20^k.
Examples
filePrs<-system.file("extdata/proteins.fasta",package="ftrCOOL")
mat<-ExpectedValueKmerAA(filePrs,k=2,normalized=FALSE)