vqsresub {longreadvqs}    R Documentation

Computing viral quasispecies diversity metrics of error-minimized repeatedly down-sampled read alignments

Description

Minimizes potential long-read sequencing errors based on a specified percent cut-off for low-frequency nucleotide bases, then repeatedly down-samples the reads for sensitivity analysis of how the diversity metrics vary with sample size. The output is a summary of viral quasispecies diversity metrics for each down-sampling iteration, calculated with functions from the QSutils package. This function is an extension of the "vqssub" function.

Arguments

fasta

Input read alignment in FASTA format.

iter

Number of down-sampling iterations performed after error minimization.

method

Sequencing error minimization method. Each low-frequency nucleotide base (frequency below the "pct" cut-off) is replaced either with the consensus base at that position ("conbase", the default) or with the base of the dominant haplotype ("domhapbase").

pct

Percent cut-off defining the low-frequency nucleotide bases that will be replaced (must be specified).

gappct

Percent cut-off applied specifically to gaps (-). If it is not specified or is less than "pct", "gappct" is set equal to "pct" (default).

ignoregappositions

Replaces all nucleotides at alignment positions that contain one or more gaps with gaps, so that such positions are no longer counted as single nucleotide variants (SNVs). The default is "FALSE".

samsize

Sample size (number of reads) after down-sampling. If it is not specified or exceeds the number of reads in the original alignment, down-sampling is not performed (default).

label

String within quotation marks giving the name of the read alignment (optional).
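
To make the "conbase" replacement described under "method" and "pct" more concrete, the following base R sketch applies the idea to a toy alignment. It is only an illustration under stated assumptions, not the package's implementation: the helper replace_low_freq() and the toy reads are hypothetical, and gap handling ("gappct", "ignoregappositions") is left out.

```r
# Toy alignment (hypothetical data): 10 reads, 3 positions.
# Position 2 carries a "T" in 1 of 10 reads (10 percent).
reads <- c(rep("ACG", 8), "ATG", "ACG")
aln <- do.call(rbind, strsplit(reads, ""))

# Hypothetical helper, NOT part of longreadvqs: for every alignment column,
# replace bases whose frequency is below `pct` percent with the most
# frequent (consensus) base of that column.
replace_low_freq <- function(aln, pct) {
  apply(aln, 2, function(col) {
    tab <- table(col)
    bases <- names(tab)
    freq <- as.vector(tab) / length(col) * 100
    consensus <- bases[which.max(freq)]
    low <- bases[freq < pct]
    col[col %in% low] <- consensus
    col
  })
}

cleaned <- replace_low_freq(aln, pct = 20)

# Collapse each row back into a read string.
apply(cleaned, 1, paste, collapse = "")
```

With a cut-off of 20 percent, the single "T" at position 2 falls below the threshold and is replaced by the consensus "C", so every read in this toy alignment becomes identical.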

Value

Data frame containing all viral quasispecies diversity metrics calculated by the QSutils package, together with error minimization and down-sampling information, for each down-sampling iteration.

Examples

## Locate input FASTA file-------------------------------------------------------------------------
fastafilepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")

## Summarize viral quasispecies diversity metrics from five downsampling iterations.---------------
vqsresub(fastafilepath, iter = 5, pct = 10, samsize = 20, label = "sample1")
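
The idea of repeated down-sampling can also be sketched in base R without the package. The example below is a rough illustration only: resample_metrics() and the toy reads are hypothetical, sampling without replacement is an assumption, and the two metrics (haplotype count and Shannon entropy) merely stand in for the QSutils metrics that vqsresub reports.

```r
set.seed(1)  # make the sketch reproducible

# Hypothetical toy reads: three haplotypes with unequal abundance.
reads <- c(rep("ACGT", 60), rep("ACGA", 30), rep("TCGA", 10))

# Hypothetical helper, NOT part of longreadvqs: draw `samsize` reads
# (assumed: without replacement) `iter` times and return one row of
# simple diversity metrics per iteration.
resample_metrics <- function(reads, iter, samsize) {
  rows <- lapply(seq_len(iter), function(i) {
    # Skip down-sampling when samsize is not smaller than the read count.
    sub <- if (samsize < length(reads)) sample(reads, samsize) else reads
    p <- as.vector(table(sub)) / length(sub)  # haplotype frequencies
    data.frame(iter = i,
               depth = length(sub),
               haplotypes = length(p),
               shannon = -sum(p * log(p)))
  })
  do.call(rbind, rows)
}

res <- resample_metrics(reads, iter = 5, samsize = 20)
res
```

Comparing the rows across iterations gives a feel for how much a metric fluctuates at a given sample size, which is the kind of sensitivity analysis the function is meant to support.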


[Package longreadvqs version 0.1.2 Index]