
vqscompare {longreadvqs}

R Documentation

Comparing viral quasispecies profiles and operational taxonomic units (OTUs) classified by k-means clustering between samples

Description

Pools error-minimized, down-sampled read samples and compares their diversity by 1) viral quasispecies profile (haplotypes and metrics from the QSutils package), 2) operational taxonomic units (OTUs) classified by k-means clustering of single nucleotide variant (SNV) distance, and 3) visualization of the different comparative methods, i.e., haplotype, OTU, phylogenetic tree, and MDS plot.

Arguments

`samplelist`	List of samples, i.e., the names of objects returned by the "vqsassess" or "vqscustompct" functions, for example list(BC1, BC2, BC3).
`lab_name`	Name of the variable or sample type, for instance "barcode", "sample", "dpi", or "isolate" (optional).
`kmeans.n`	Number of clusters, i.e., operational taxonomic units (OTUs), requested from k-means clustering on the multidimensional scaling (MDS) of all samples' pairwise SNV distances.
`showhap.n`	Number of largest haplotypes (default = 30) labeled in the MDS plot of the five largest OTUs (optional).
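
The role of `kmeans.n` can be illustrated with a minimal base R sketch of the general idea: pairwise SNV distance, then MDS, then k-means. This is not the package's internal code. The toy reads, the use of a plain mismatch count as the SNV distance, and the names `reads`, `snv_dist`, `mds`, and `otu` are all assumptions made for illustration.

```r
# Toy aligned reads of equal length (hypothetical data, not from the package).
reads <- c(
  r1 = "ACGTACGTACGT",
  r2 = "ACGTACGTACGT",
  r3 = "ACGTACGTACGA",
  r4 = "TCGACCGTGCGT",
  r5 = "TCGACCGTGCGT",
  r6 = "TCGACCGTGCGA"
)

# One row per read, one column per alignment position.
bases <- do.call(rbind, strsplit(reads, ""))

# Pairwise SNV distance, assumed here to be the number of mismatching positions.
n <- nrow(bases)
snv_dist <- matrix(0, n, n, dimnames = list(names(reads), names(reads)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    snv_dist[i, j] <- sum(bases[i, ] != bases[j, ])
  }
}

# Classical multidimensional scaling of the distance matrix into two dimensions.
mds <- cmdscale(as.dist(snv_dist), k = 2)

# k-means on the MDS coordinates; `centers` plays the role of kmeans.n.
set.seed(123)
km <- kmeans(mds, centers = 2, nstart = 25)

# Label each read with its cluster, treated here as its OTU.
otu <- paste0("OTU", km$cluster)
names(otu) <- names(reads)
otu
```

In this toy data, reads r1 to r3 differ from reads r4 to r6 at four positions, so two clusters separate them cleanly. Cluster numbering from k-means is arbitrary, so only the grouping is meaningful, not which group is called "OTU1".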

Value

A list of:
1) "hapdiv": comparative table of viral quasispecies diversity metrics between the listed samples, calculated by the QSutils package
2) "otudiv": comparative table of OTU diversity metrics between the listed samples, calculated from the consensus sequence of each OTU (similar to the output of the "otucompare" function)
3) "sumsnv_hap": frequency and SNV profile (by position in the alignment) of haplotypes that are not singletons (number of reads > 1)
4) "sumsnv_otu": frequency and SNV profile of all haplotypes, grouped by operational taxonomic unit (OTU)
5) "fullseq": complete read sequences of haplotypes that are not singletons
6) "fulldata": complete read sequences of all haplotypes in every sample, with frequency and OTU classification
7) "summaryplot": visualization of the viral quasispecies comparison between samples, including
7.1) "happlot": proportion of haplotypes (top left)
7.2) "otuplot": proportion of OTUs (bottom left)
7.3) multidimensional scaling (MDS) plots (right) of the k-means OTUs ("top5otumds": the 5 largest groups with major haplotypes labeled, and "allotumds": all groups)
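
To make the terms "haplotype frequency" and "singleton" concrete, here is a minimal base R sketch on hypothetical reads. It does not reproduce how the package or QSutils builds these tables. The data and the names `hap_count`, `hap_freq`, `non_singleton`, and `shannon` are assumptions, and Shannon entropy is used only as a familiar example of a diversity metric.

```r
# Hypothetical aligned reads; identical strings count as the same haplotype.
reads <- c("ACGT", "ACGT", "ACGT", "ACGA", "ACGA", "TCGT", "ACCT", "ACGT")

# Collapse reads into haplotypes and count the reads supporting each one.
hap_count <- sort(table(reads), decreasing = TRUE)

# Relative frequency of each haplotype in the sample.
hap_freq <- hap_count / sum(hap_count)

# Keep only haplotypes that are not singletons (number of reads > 1).
non_singleton <- hap_count[hap_count > 1]

# Shannon entropy of the haplotype distribution (natural log).
shannon <- -sum(hap_freq * log(hap_freq))

non_singleton
shannon
```

Here two of the four haplotypes are supported by more than one read, so only those two remain after the singleton filter.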

Examples

## Locate input FASTA files-----------------------------------------------------------------------
sample1filepath <- system.file("extdata", "s1.fasta", package = "longreadvqs")
sample2filepath <- system.file("extdata", "s2.fasta", package = "longreadvqs")

## Prepare data for viral quasispecies comparison between two samples-----------------------------
set.seed(123)
sample1 <- vqsassess(sample1filepath, pct = 5, samsize = 50, label = "sample1")
sample2 <- vqsassess(sample2filepath, pct = 5, samsize = 50, label = "sample2")

## Compare viral quasispecies and OTU (4 clusters) diversity between two samples------------------
out <- vqscompare(samplelist = list(sample1, sample2),
           lab_name = "Sample", kmeans.n = 4, showhap.n = 5)
out$summaryplot

[Package longreadvqs version 0.1.2 Index]