vqssub {longreadvqs}

R Documentation

Computing viral quasispecies diversity metrics of an error-minimized, down-sampled read alignment

Description

Minimizes potential long-read sequencing errors by replacing low-frequency nucleotide bases, as defined by the specified percentage cut-off, and down-samples the reads so that the result can be compared with other samples. The output is a summary of viral quasispecies diversity metrics calculated by functions from the QSutils package. This function is a subset of the "vqsassess" function.

Arguments

`fasta`	Input read alignment in FASTA format.
`method`	Sequencing error minimization method. Each low-frequency nucleotide base (frequency below the "pct" cut-off) is replaced with the consensus base at that position ("conbase", the default) or with the base of the dominant haplotype ("domhapbase").
`samplingfirst`	Whether to down-sample before (TRUE) or after (FALSE, the default) error minimization.
`pct`	Percentage cut-off below which a nucleotide base is treated as low frequency and replaced (must be specified).
`gappct`	Percentage cut-off applied specifically to gaps (-). If it is not specified or is less than "pct", "gappct" is set equal to "pct" (default).
`ignoregappositions`	If TRUE, replace all nucleotides at alignment positions that contain one or more gaps with gaps, so that those positions are no longer counted as single nucleotide variants (SNVs). The default is FALSE.
`samsize`	Sample size (number of reads) after down-sampling. If it is not specified or is greater than the number of reads in the original alignment, no down-sampling is performed (default).
`label`	Quoted string giving the name of the read alignment (optional).
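
To make the interplay of "pct", "gappct" and "samsize" concrete, the following base R sketch applies the "conbase" rule described above to a toy alignment. It is only an illustration under stated assumptions, not the package's implementation: the matrix `aln` and the helper `minimize_conbase` are hypothetical names, the "domhapbase" method is not covered, and sampling without replacement is an assumption.

```r
# Toy alignment: 10 reads (rows) x 4 positions (columns).
# `aln` and `minimize_conbase` are hypothetical, not part of longreadvqs.
aln <- rbind(
  c("A", "C", "G", "T"),
  c("A", "C", "G", "T"),
  c("A", "C", "G", "T"),
  c("A", "C", "G", "T"),
  c("A", "C", "G", "T"),
  c("A", "C", "G", "T"),
  c("A", "T", "G", "T"),
  c("A", "T", "G", "T"),
  c("A", "T", "G", "-"),
  c("G", "T", "G", "T")
)

minimize_conbase <- function(aln, pct, gappct = pct) {
  # Documented rule: a gap cut-off below pct falls back to pct.
  if (gappct < pct) gappct <- pct
  out <- aln
  for (j in seq_len(ncol(aln))) {
    counts <- table(aln[, j])
    # Percentage frequency of each symbol at position j.
    freq <- setNames(as.numeric(counts) * 100 / nrow(aln), names(counts))
    consensus <- names(freq)[which.max(freq)]
    for (base in names(freq)) {
      cutoff <- if (base == "-") gappct else pct
      # Replace symbols whose frequency is strictly below their cut-off.
      if (base != consensus && freq[[base]] < cutoff) {
        out[aln[, j] == base, j] <- consensus
      }
    }
  }
  out
}

# Position 1: the single G (10%) is below the 20% cut-off and becomes A.
# Position 2: C (60%) and T (40%) both stay, so the SNV is preserved.
# Position 4: the single gap (10%) is below the gap cut-off and becomes T.
cleaned <- minimize_conbase(aln, pct = 20)

# Down-sample after error minimization (the samplingfirst = FALSE order).
# Assumption: reads are drawn without replacement.
set.seed(1)
samsize <- 5
sampled <- if (samsize < nrow(cleaned)) {
  cleaned[sample(nrow(cleaned), samsize), , drop = FALSE]
} else {
  cleaned  # samsize exceeds the read count, so nothing is dropped
}
```

Swapping the two steps (sampling first, then minimizing) corresponds to the order selected by `samplingfirst = TRUE`; because frequencies are then computed on fewer reads, the same "pct" can flag different bases.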

Value

Data frame containing all viral quasispecies diversity metrics calculated by the QSutils package, together with error minimization and down-sampling information.

Examples

## Locate input FASTA file-------------------------------------------------------------------------
fastafilepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")

## Summarize viral quasispecies diversity metrics--------------------------------------------------
# From error-minimized unsampled reads (10% cut-off):
vqssub(fastafilepath, pct = 10, label = "sample1")
# From error-minimized sampled reads (n = 20):
vqssub(fastafilepath, pct = 10, samsize = 20, label = "sample1")
# From error-minimized sampled reads with 50% cut-off for gap:
vqssub(fastafilepath, pct = 10, gappct = 50, samsize = 20, label = "sample1")
# From error-minimized sampled reads but ignore positions with gap:
vqssub(fastafilepath, pct = 10, ignoregappositions = TRUE, samsize = 20, label = "sample1")
# From reads that were down-sampled before error minimization:
vqssub(fastafilepath, pct = 10, samplingfirst = TRUE, samsize = 20, label = "sample1")
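
The examples above all use the default "conbase" method. As a further hedged sketch, the two documented values of `method` can be run side by side and the resulting data frames stacked for comparison. This assumes both calls return data frames with identical columns, which the Value section suggests but does not state outright; the object names are illustrative, and the `requireNamespace()` guard simply skips the calls when the package is not installed.

```r
# Stays NULL when longreadvqs is unavailable.
compared <- NULL

if (requireNamespace("longreadvqs", quietly = TRUE)) {
  fastafilepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")

  # Replace low-frequency bases with the consensus base (default method).
  by_conbase <- longreadvqs::vqssub(fastafilepath, pct = 10, method = "conbase",
                                    samsize = 20, label = "conbase")

  # Replace low-frequency bases with the base of the dominant haplotype.
  by_domhap <- longreadvqs::vqssub(fastafilepath, pct = 10, method = "domhapbase",
                                   samsize = 20, label = "domhapbase")

  # Assumption: both results share the same columns, so rbind() can stack them.
  compared <- rbind(by_conbase, by_domhap)
}
```

Because each call down-samples independently, differences between the two rows may reflect the random draw as well as the minimization method.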

[Package longreadvqs version 0.1.2 Index]