KmerCount {microclass} R Documentation

K-mer counting

Description

Counting overlapping words of length K in DNA/RNA sequences.

Usage

KmerCount(sequences, K = 1, col.names = FALSE)

Arguments

sequences

Vector of sequences (text).

K

Word length (integer).

col.names

Logical indicating whether the words should be added as column names.

Details

For each input sequence, the frequency of every word of length K is counted. Counting is done with overlap, i.e. the counting window is shifted one position at a time. The counting itself is done by a C++ function.
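
To illustrate what counting with overlap means, here is a minimal base-R sketch for a single sequence. The helper count_overlapping_words is hypothetical and is not part of microclass. It only mimics the idea and is far slower than the package's C++ routine.

```r
# Hypothetical helper (not part of microclass): counts overlapping words
# of length K in one sequence, using base R only.
count_overlapping_words <- function(sequence, K = 1) {
  n <- nchar(sequence)
  if (n < K) return(table(character(0)))  # sequence too short: no words
  # One word starts at every position 1, 2, ..., n - K + 1
  starts <- seq_len(n - K + 1)
  words <- substring(sequence, starts, starts + K - 1)
  table(words)                            # frequency of each observed word
}

counts <- count_overlapping_words("ATGCCTGAACTGACCTGC", K = 2)
print(counts)
```

The 18-letter sequence yields 17 overlapping words of length 2. Unlike KmerCount, this sketch reports only the words that actually occur in the sequence.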

With col.names=TRUE the K-mers are added as column names, but this makes the computations slower.

Value

A matrix with one row for each sequence in sequences and one column for each possible word of length K.
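
The layout of such a matrix can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function kmer_matrix_sketch is hypothetical, the alphabet is assumed to be A, C, G, T (giving 4^K columns), and the alphabetical column order used here may differ from the order KmerCount returns.

```r
# Hypothetical sketch (not part of microclass) of a sequences-by-words matrix.
kmer_matrix_sketch <- function(sequences, K = 1,
                               alphabet = c("A", "C", "G", "T")) {
  # Enumerate every possible word of length K over the assumed alphabet
  grid <- expand.grid(rep(list(alphabet), K), stringsAsFactors = FALSE)
  all_words <- sort(do.call(paste0, grid))
  counts <- matrix(0L, nrow = length(sequences), ncol = length(all_words),
                   dimnames = list(NULL, all_words))
  for (i in seq_along(sequences)) {
    n <- nchar(sequences[i])
    if (n < K) next                       # too short: row stays all zero
    starts <- seq_len(n - K + 1)
    words <- substring(sequences[i], starts, starts + K - 1)
    # Words containing symbols outside the alphabet are dropped in this sketch
    counts[i, ] <- as.integer(table(factor(words, levels = all_words)))
  }
  counts
}

m <- kmer_matrix_sketch(c("ATGCCTGAACTGACCTGC", "ACGT"), K = 2)
dim(m)
```

With two input sequences and K = 2, the sketch returns a matrix with 2 rows and 16 columns. Words that never occur in a sequence still get a column, holding a zero count.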

Author(s)

Kristian Hovde Liland and Lars Snipen.

See Also

multinomTrain, multinomClassify.

Examples

KmerCount("ATGCCTGAACTGACCTGC", K = 2)


[Package microclass version 1.2 Index]