| KmerCount {microclass} | R Documentation |
K-mer counting
Description
Counting overlapping words of length K in DNA/RNA sequences.
Usage
KmerCount(sequences, K = 1, col.names = FALSE)
Arguments
| sequences | Vector of sequences (text). |
| K | Word length (integer). |
| col.names | Logical indicating whether the words should be added as column names. |
Details
For each input sequence, the frequency of every word of length K is counted.
Counting is done with overlap: the word window is moved one position at a
time, so a sequence of length L contains L - K + 1 words. The counting itself
is done by a C++ function.
With col.names = TRUE the K-mers are added as column names, but this makes the
computations slower.
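To make the overlapping counting concrete, the following is a minimal pure-R sketch of the idea. It is not the package's C++ implementation. The function name count_kmers_sketch, the alphabet argument, and the choice to skip words containing symbols outside A, C, G and T are assumptions made for illustration only; how KmerCount itself treats such symbols and orders its columns is not stated here.

```r
# Illustrative sketch only: count_kmers_sketch is a hypothetical helper,
# not part of microclass, and is far slower than a compiled routine.
count_kmers_sketch <- function(sequences, K = 1,
                               alphabet = c("A", "C", "G", "T")) {
  # Enumerate every possible word of length K over the assumed alphabet
  grid <- expand.grid(rep(list(alphabet), K), stringsAsFactors = FALSE)
  words <- sort(apply(grid, 1, paste, collapse = ""))

  # One row per sequence, one column per possible word
  counts <- matrix(0L, nrow = length(sequences), ncol = length(words),
                   dimnames = list(NULL, words))

  for (i in seq_along(sequences)) {
    s <- toupper(sequences[i])
    n <- nchar(s) - K + 1            # number of overlapping windows
    if (n < 1) next                  # sequence shorter than K: row stays zero
    starts <- seq_len(n)
    kmers <- substring(s, starts, starts + K - 1)
    # Words not in 'words' (e.g. containing N) become NA and are dropped
    counts[i, ] <- as.integer(table(factor(kmers, levels = words)))
  }
  counts
}

# An 18-base sequence holds 18 - 2 + 1 = 17 overlapping 2-mers
m <- count_kmers_sketch("ATGCCTGAACTGACCTGC", K = 2)
sum(m)
```

The sketch builds the full set of words up front so that every sequence yields a row of the same width, which matches the matrix layout described under Value.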
Value
A matrix with one row for each sequence in sequences and one column for
each possible word of length K.
Author(s)
Kristian Hovde Liland and Lars Snipen.
See Also
multinomTrain, multinomClassify.
Examples
KmerCount("ATGCCTGAACTGACCTGC", K = 2)
[Package microclass version 1.2 Index]