KmerCount {microclass} | R Documentation |
K-mer counting
Description
Counting overlapping words of length K in DNA/RNA sequences.
Usage
KmerCount(sequences, K = 1, col.names = FALSE)
Arguments
sequences |
Vector of sequences (text). |
K |
Word length (integer). |
col.names |
Logical indicating whether the words should be added as column names. |
Details
For each input sequence, the frequency of every word of length K is counted.
Counting is done with overlap, i.e. the counting window moves one position at a time.
The counting itself is done by a C++ function.
With col.names = TRUE the K-mers are added as column names, but this makes the
computations slower.
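To illustrate what counting with overlap means, here is a minimal sketch in plain R. It is not the package's C++ implementation. The helper name `count_kmers_sketch`, the restriction to the DNA alphabet A, C, G, T, the column ordering, and the choice to skip words containing other symbols are all assumptions made for this illustration only.

```r
# Hypothetical helper, not part of microclass: overlapping K-mer counts in plain R.
count_kmers_sketch <- function(sequences, K = 1) {
  alphabet <- c("A", "C", "G", "T")  # assumption: DNA alphabet only
  # Enumerate all 4^K words; this ordering is specific to the sketch
  grid <- expand.grid(rep(list(alphabet), K), stringsAsFactors = FALSE)
  words <- do.call(paste0, unname(as.list(grid)))
  counts <- matrix(0L, nrow = length(sequences), ncol = length(words),
                   dimnames = list(NULL, words))
  for (i in seq_along(sequences)) {
    s <- toupper(sequences[i])
    n <- nchar(s) - K + 1            # number of overlapping windows
    if (n < 1) next                  # sequence shorter than K: row stays zero
    starts <- seq_len(n)
    kmers <- substring(s, starts, starts + K - 1)
    # Words with symbols outside the alphabet become NA and are not counted
    counts[i, ] <- as.integer(table(factor(kmers, levels = words)))
  }
  counts
}

m <- count_kmers_sketch("ATGCCTGAACTGACCTGC", K = 2)
m[1, c("TG", "CT")]
```

The example sequence has 18 symbols, so with K = 2 there are 17 overlapping windows and the row sums to 17.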
Value
A matrix with one row for each sequence in sequences and one column for
each possible word of length K.
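Since there is one column per possible word, the width of the result grows quickly with K. The short sketch below tabulates the number of columns for a few word lengths, assuming a four-letter alphabet (A, C, G, T); that assumption is made for the illustration and is not taken from the package source.

```r
# Assumption: four-letter alphabet, so there are 4^K possible words of length K.
K <- 1:8
n_words <- 4^K
data.frame(K = K, columns = n_words)
```

Under this assumption, K = 8 already gives 65536 columns per sequence, which is worth keeping in mind when choosing K for many sequences.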
Author(s)
Kristian Hovde Liland and Lars Snipen.
See Also
multinomTrain, multinomClassify.
Examples
KmerCount("ATGCCTGAACTGACCTGC", K = 2)
[Package microclass version 1.2 Index]