ExpectedValueKmerAA {ftrCOOL}R Documentation

Expected Value for K-mer Amino Acid (ExpectedValueKmerAA)

Description

This function computes the expected value of each k-mer by dividing the frequency of the k-mer by the product of the frequencies of its constituent amino acids in the sequence.

Usage

ExpectedValueKmerAA(seqs, k = 2, normalized = TRUE, label = c())

Arguments

seqs

is a FASTA file containing amino acid sequences, in which each sequence entry starts with a '>' character. Alternatively, seqs can be a character vector in which each element is a peptide/protein sequence.

k

is an integer giving the size of the k-mers in the k-mer composition. The default value is 2.

normalized

is a logical parameter. When it is FALSE, the function returns the unnormalized values. Otherwise, the return value is normalized by the length of the sequence.

label

is an optional parameter. It is a vector whose length equals the number of sequences. It gives the class of each entry (i.e., sequence).

Details

ExpectedValue(k-mer) = freq(k-mer) / ( freq(aminoacid1) * freq(aminoacid2) * ... * freq(aminoacidk) )
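
The formula above can be illustrated with a minimal base R sketch for a single sequence. The helper name expected_value_sketch is hypothetical and is not part of ftrCOOL. The sketch assumes that freq() means a raw count within the sequence, and it returns values only for the k-mers that actually occur, so it may differ from the package's implementation in both scaling and output layout.

```r
# Hypothetical helper, not part of ftrCOOL: applies the formula to one sequence.
expected_value_sketch <- function(sequence, k = 2) {
  residues <- strsplit(sequence, "")[[1]]
  n <- length(residues)
  stopifnot(n >= k)

  # Assumption: freq() is a raw count of each amino acid in the sequence
  aa_counts <- table(residues)

  # Collect all overlapping k-mers and count them
  starts <- seq_len(n - k + 1)
  kmers <- vapply(
    starts,
    function(i) paste(residues[i:(i + k - 1)], collapse = ""),
    character(1)
  )
  kmer_counts <- table(kmers)

  # Divide each k-mer count by the product of its residues' counts
  vapply(
    names(kmer_counts),
    function(km) {
      aa <- strsplit(km, "")[[1]]
      as.numeric(kmer_counts[km]) / prod(as.numeric(aa_counts[aa]))
    },
    numeric(1)
  )
}

res <- expected_value_sketch("ACAC", k = 2)
print(res)
```

For the toy sequence "ACAC", A and C each occur twice, AC occurs twice and CA once, so under these assumptions the sketch yields 2 / (2 * 2) = 0.5 for AC and 1 / (2 * 2) = 0.25 for CA.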

Value

This function returns a feature matrix. The number of rows equals the number of sequences and the number of columns is 20^k.
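
The 20^k column count follows from enumerating every possible k-mer over the 20 standard amino acids. The sketch below shows this in base R. The alphabet string and the resulting ordering are assumptions made for illustration, and the column order used by the package may differ.

```r
# Assumption: the 20 standard amino acids, in alphabetical one-letter order
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
k <- 2

# Build every combination of k residues and paste each row into one k-mer
grid <- expand.grid(rep(list(amino_acids), k), stringsAsFactors = FALSE)
all_kmers <- do.call(paste0, grid)

n_kmers <- length(all_kmers)
print(n_kmers)  # 400, i.e. 20^2
```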

Examples


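# Locate the example FASTA file shipped with the package
filePrs <- system.file("extdata/proteins.fasta", package = "ftrCOOL")
# Compute the 2-mer expected-value feature matrix without normalization
mat <- ExpectedValueKmerAA(filePrs, k = 2, normalized = FALSE)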

[Package ftrCOOL version 2.0.0 Index]