vqsresub {longreadvqs} | R Documentation |
Computing viral quasispecies diversity metrics of error-minimized repeatedly down-sampled read alignments
Description
Minimizes potential long-read sequencing errors by replacing nucleotide bases whose frequency falls below a specified cut-off percentage, then repeatedly down-samples the reads so that the sensitivity of the diversity metrics to sample size can be assessed. The output is a summary of viral quasispecies diversity metrics for each down-sampling iteration, calculated with functions from the QSutils package. This function is an extension of the "vqssub" function.
Arguments
fasta |
Input read alignment in FASTA format. |
iter |
Number of down-sampling iterations performed after error minimization. |
method |
Sequencing error minimization method. Each low-frequency nucleotide base (frequency below the "pct" cut-off) is replaced either with the consensus base at that position ("conbase", the default) or with the base of the dominant haplotype ("domhapbase"). |
pct |
Percent cut-off below which a nucleotide base is treated as low frequency and replaced (must be specified). |
gappct |
Percent cut-off applied specifically to gaps (-). If it is not specified or is less than "pct", "gappct" is set equal to "pct" (default). |
ignoregappositions |
Replace all nucleotides at alignment positions that contain one or more gaps with gaps, so that such positions are no longer counted as single nucleotide variants (SNVs). The default is "FALSE". |
samsize |
Sample size (number of reads) after down-sampling. If it is not specified or exceeds the number of reads in the original alignment, down-sampling is not performed (default). |
label |
String within quotation marks giving the name of the read alignment (optional). |
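The "conbase" replacement rule described for "method" and "pct" can be illustrated with a small base R sketch. This is not the package's implementation: the toy alignment, the helper name minimize_conbase, and the cut-off value are hypothetical, and ties and the separate "gappct" handling are ignored.

```r
# Hypothetical toy alignment: 10 reads x 4 positions (not package data).
reads <- c(rep("ACGT", 8), "ACGA", "TCGT")
aln <- do.call(rbind, strsplit(reads, ""))   # character matrix, one row per read

pct <- 15  # assumed cut-off: bases below 15% at a position are replaced

# Hypothetical helper: replace low-frequency bases with the per-position consensus.
minimize_conbase <- function(aln, pct) {
  for (j in seq_len(ncol(aln))) {
    freq <- table(aln[, j]) / nrow(aln) * 100   # percent frequency of each base
    consensus <- names(freq)[which.max(freq)]   # most frequent base at position j
    low <- names(freq)[freq < pct]              # bases under the cut-off
    aln[aln[, j] %in% low, j] <- consensus
  }
  aln
}

cleaned <- minimize_conbase(aln, pct)
apply(cleaned, 1, paste, collapse = "")       # reads after replacement
```

In this toy case the two singleton variants (each at 10%) fall below the 15% cut-off and are replaced, so every read collapses to the same sequence. With a cut-off of 5 they would be kept.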
Value
Data frame containing, for each down-sampling iteration, all viral quasispecies diversity metrics calculated by the QSutils package together with the error minimization and down-sampling information.
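To give a feel for a per-iteration result, the following base R sketch repeatedly down-samples a toy set of reads and collects one row per iteration. It is only an illustration under stated assumptions: the reads and column names are hypothetical, sampling without replacement is assumed, and the haplotype count and Shannon entropy are simple stand-ins rather than the QSutils metrics the function actually reports.

```r
set.seed(1)  # make the random draws reproducible

# Hypothetical toy alignment of 30 reads forming three haplotypes.
reads <- c(rep("ACGT", 20), rep("ACGA", 7), rep("TCGT", 3))

iter <- 5      # number of down-sampling iterations
samsize <- 10  # reads kept per iteration

per_iter <- do.call(rbind, lapply(seq_len(iter), function(i) {
  sub <- sample(reads, samsize)        # assumed: draw without replacement
  counts <- table(sub)                 # reads per haplotype in this draw
  p <- as.numeric(counts) / samsize    # haplotype relative frequencies
  data.frame(
    iteration = i,
    depth = samsize,
    haplotypes = length(counts),       # stand-in metric: number of haplotypes
    shannon = -sum(p * log(p))         # stand-in metric: Shannon entropy
  )
}))

per_iter
```

Comparing rows across iterations shows how much a metric varies at a given sample size, which is the kind of sensitivity analysis the function is meant to support.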
Examples
## Locate input FASTA file-------------------------------------------------------------------------
fastafilepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")
## Summarize viral quasispecies diversity metrics from five downsampling iterations.---------------
vqsresub(fastafilepath, iter = 5, pct = 10, samsize = 20, label = "sample1")
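The other documented arguments can be combined in the same call. The sketch below uses only arguments listed above. The specific values (gappct = 50, label "sample1_domhap") are arbitrary choices for illustration, and the call is guarded so that it is skipped when the package is not installed.

```r
## Variant call: dominant-haplotype replacement with a separate gap cut-off.
has_pkg <- requireNamespace("longreadvqs", quietly = TRUE)

if (has_pkg) {
  fastafilepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")
  res <- longreadvqs::vqsresub(
    fastafilepath,
    iter = 5,
    method = "domhapbase",        # replace with the base of the dominant haplotype
    pct = 10,
    gappct = 50,                  # illustrative value; must not be less than pct to take effect
    ignoregappositions = TRUE,
    samsize = 20,
    label = "sample1_domhap"      # hypothetical label
  )
  print(res)
}
```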