count_multigrams {biogram} | R Documentation |
Detect and count multiple n-grams in sequences
Description
A convenient wrapper around count_ngrams for counting n-grams for multiple values of n and d.
Usage
count_multigrams(
ns,
ds = rep(0, length(ns)),
seq,
u,
pos = FALSE,
scale = FALSE,
threshold = 0
)
Arguments
ns |
|
ds |
|
seq |
a vector or matrix describing sequence(s). |
u |
|
pos |
|
scale |
|
threshold |
|
Details
ns and ds must have equal length. Elements of ds are used as the d parameter for the respective values of ns. For example, if ns is c(4, 4, 4), then ds must be a list of length 3. Each element of the ds list must have length 3 or 1 (one distance per gap between the four elements, or a single value), as appropriate for the d parameter of the count_ngrams function.
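The pairing rule above can be sketched in base R. This is only an illustration: check_ns_ds is a hypothetical helper, not part of biogram, and it assumes that a valid d for an n-gram has length n - 1 or 1, which matches the "length 3 or 1" case given for n = 4.

```r
# Hypothetical helper (not part of biogram): checks that ns and ds pair up
# as described in Details. Assumes a valid d has length n - 1 or 1.
check_ns_ds <- function(ns, ds = rep(0, length(ns))) {
  if (length(ns) != length(ds))
    stop("ns and ds must have equal length")
  for (i in seq_along(ns)) {
    d_len <- length(ds[[i]])
    # a single distance is always accepted; otherwise n - 1 distances are needed
    if (d_len != 1 && d_len != ns[i] - 1)
      stop("element ", i, " of ds must have length 1 or ", ns[i] - 1)
  }
  invisible(TRUE)
}

# three 4-grams, each with its own vector of three distances
check_ns_ds(c(4, 4, 4), list(c(2, 0, 1), c(2, 1, 0), c(0, 1, 2)))
# a 3-gram with two distances and a unigram with a single 0
check_ns_ds(c(3, 1), list(c(1, 0), 0))
# ds omitted: defaults to distance 0 for every n
check_ns_ds(c(3, 1))
```

Each call returns TRUE invisibly when ns and ds are compatible and stops with an error otherwise.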
Value
An integer matrix with named columns. The naming conventions are the same as in count_ngrams.
Examples
library(biogram)
seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
count_multigrams(c(3, 1), list(c(1, 0), 0), seqs, 1L:4, pos = TRUE)
# if ds parameter is not present, n-grams are calculated for distance 0
count_multigrams(c(3, 1), seq = seqs, u = 1L:4)
# count n-grams of the same length three times, each time with different
# distances between elements
count_multigrams(c(4, 4, 4), list(c(2, 0, 1), c(2, 1, 0), c(0, 1, 2)),
                 seqs, 1L:4, pos = TRUE)