R: Compute _k_-mer Features

compute_kmer {LncFinder}

R Documentation

Compute k-mer Features

Description

This function calculates the k-mer counts or frequencies of the input sequences.

Usage

compute_kmer(
  Sequences,
  label = NULL,
  k = 1:5,
  step = 1,
  freq = TRUE,
  improved.mode = FALSE,
  alphabet = c("a", "c", "g", "t"),
  on.ORF = FALSE,
  auto.full = FALSE,
  parallel.cores = 2
)

Arguments

`Sequences`	Sequences loaded from a FASTA file with the function `read.fasta` of the `seqinr` package.
`label`	Optional. String. Indicates the label of the sequences, such as "NonCoding" or "Coding".
`k`	An integer or integer vector that indicates the sliding window size(s), i.e. the k-mer length(s). (Default: `1:5`)
`step`	Integer. The step size of the sliding window. (Default: `1`)
`freq`	Logical. If TRUE, the frequencies of different patterns are returned instead of counts. (Default: `TRUE`)
`improved.mode`	Logical. If TRUE, the frequencies will be normalized using the method proposed by PLEK (Li et al. 2014). Ignored if `freq = FALSE`. (Default: `FALSE`)
`alphabet`	A vector of single characters that specifies the distinct characters of the sequences. (Default: `alphabet = c("a", "c", "g", "t")`)
`on.ORF`	Logical. If `TRUE`, the k-mer frequencies will be calculated on the longest ORF region. NOTE: If `TRUE`, the sequences have to be DNA. (Default: `FALSE`)
`auto.full`	Logical. When `on.ORF = TRUE` but no ORF can be found, if `auto.full = TRUE`, the k-mer frequencies will be calculated on the full sequence automatically; if `auto.full` is `FALSE`, the sequences that have no ORF will be discarded. Ignored when `on.ORF = FALSE`. (Default: `FALSE`)
`parallel.cores`	Integer. The number of cores used for parallel computation. Set to `-1` to run this function with all available cores. (Default: `2`)

Details

This function extracts k-mer features using a sliding window whose size (`k`) and step (`step`) can be customized. It returns either the counts (`freq = FALSE`) or the frequencies (`freq = TRUE`) of the different patterns. When `freq = TRUE`, `improved.mode` is also available; it normalizes the frequencies using the method proposed by PLEK (Li et al. 2014).
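
To illustrate how `k`, `step`, and `freq` interact, the following is a minimal base-R sketch of sliding-window k-mer counting on a single sequence. The helper `count_kmers` is hypothetical and is not part of LncFinder. Its normalization (dividing each count by the total number of counted windows) is an assumption made for illustration and may differ from what `compute_kmer` does internally.

```r
# Hypothetical helper, not part of LncFinder: count k-mers in one sequence
# with a sliding window of width k that advances by `step` positions.
count_kmers <- function(seq_chars, k = 2, step = 1, freq = TRUE,
                        alphabet = c("a", "c", "g", "t")) {
  # Enumerate every possible k-mer over the alphabet so that patterns
  # absent from the sequence are reported as 0 rather than dropped.
  grid <- expand.grid(rep(list(alphabet), k), stringsAsFactors = FALSE)
  patterns <- sort(do.call(paste0, grid))

  n <- length(seq_chars)
  if (n < k) {
    # Sequence shorter than the window: no k-mer can be counted.
    return(setNames(numeric(length(patterns)), patterns))
  }

  # Start positions of the windows: 1, 1 + step, 1 + 2 * step, ...
  starts <- seq(1, n - k + 1, by = step)
  windows <- vapply(starts,
                    function(i) paste(seq_chars[i:(i + k - 1)], collapse = ""),
                    character(1))

  # Windows containing characters outside the alphabet become NA in the
  # factor and are ignored by table().
  counts <- table(factor(windows, levels = patterns))
  counts <- setNames(as.numeric(counts), patterns)

  # Assumption: a frequency is the count divided by the number of counted windows.
  if (freq && sum(counts) > 0) counts <- counts / sum(counts)
  counts
}

seq_chars <- strsplit("acgtacgt", "")[[1]]
kmer_counts <- count_kmers(seq_chars, k = 2, step = 1, freq = FALSE)
kmer_freqs  <- count_kmers(seq_chars, k = 2, step = 3, freq = TRUE)
```

With `step = 1` every position starts a window, whereas `step = 3` samples one window every three positions, which keeps the windows in a single reading frame when the sequence is a coding region.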

Value

A data frame.

Author(s)

HAN Siyu

Examples

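## Not run: 
data(demo_DNA.seq)
Seqs <- demo_DNA.seq

# k-mer frequencies (k = 1 to 5) computed on the full sequences
kmer_res1 <- compute_kmer(Seqs, k = 1:5, step = 1, freq = TRUE, improved.mode = FALSE)

# PLEK-normalized frequencies on the longest ORF with a step of 3;
# sequences without an ORF fall back to the full sequence (auto.full = TRUE)
kmer_res2 <- compute_kmer(Seqs, k = 1:5, step = 3, freq = TRUE,
                          improved.mode = TRUE, on.ORF = TRUE, auto.full = TRUE)

## End(Not run)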

[Package LncFinder version 1.1.5 Index]