loadSQMlite {SQMtools}    R Documentation

Load tables generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py into R.

Description

This function takes the path to the output directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py and returns a SQMlite object. The SQMlite object will contain taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. In exchange, it has a much smaller memory footprint. A SQMlite object can be used for plotting and exporting, but it cannot be subsetted.

Usage

loadSQMlite(tables_path, tax_mode = "allfilter")

Arguments

tables_path

character, tables directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py.

tax_mode

character, which taxonomic classification should be loaded? SqueezeMeta applies the identity thresholds described in Luo et al., 2014. Use allfilter for applying the minimum identity threshold to all taxa (default), prokfilter for applying the threshold to Bacteria and Archaea, but not to Eukaryotes, and nofilter for applying no thresholds at all.
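As a rough sketch of how tax_mode might be used, the snippet below loads the same tables directory once per mode and compares the number of genera retained. The path Hadza/results/tables is borrowed from the Examples section and is assumed to exist only where sqm2tables.py has already been run; the guard makes the snippet a no-op otherwise. The variable names tax_modes, tables_path and lite_by_mode are illustrative and not part of SQMtools.

```r
# The three values of tax_mode described above.
tax_modes <- c("allfilter", "prokfilter", "nofilter")

# Assumed location of the tables directory (same path as in the Examples section).
tables_path <- "Hadza/results/tables"

# Only attempt to load when SQMtools is installed and the directory exists.
if (requireNamespace("SQMtools", quietly = TRUE) && dir.exists(tables_path)) {
  lite_by_mode <- lapply(tax_modes, function(mode) {
    SQMtools::loadSQMlite(tables_path, tax_mode = mode)
  })
  names(lite_by_mode) <- tax_modes
  # Compare how many genera are present under each filtering mode.
  print(sapply(lite_by_mode, function(x) nrow(x$taxa$genus$abund)))
}
```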

Value

SQMlite object containing the parsed tables.

The SQMlite object structure

The SQMlite object is a nested list which contains the following information:

lvl1          lvl2                lvl3           type              rows/names     columns  data
$taxa         $superkingdom       $abund         numeric matrix    superkingdoms  samples  abundances
                                  $percent       numeric matrix    superkingdoms  samples  percentages
              $phylum             $abund         numeric matrix    phyla          samples  abundances
                                  $percent       numeric matrix    phyla          samples  percentages
              $class              $abund         numeric matrix    classes        samples  abundances
                                  $percent       numeric matrix    classes        samples  percentages
              $order              $abund         numeric matrix    orders         samples  abundances
                                  $percent       numeric matrix    orders         samples  percentages
              $family             $abund         numeric matrix    families       samples  abundances
                                  $percent       numeric matrix    families       samples  percentages
              $genus              $abund         numeric matrix    genera         samples  abundances
                                  $percent       numeric matrix    genera         samples  percentages
              $species            $abund         numeric matrix    species        samples  abundances
                                  $percent       numeric matrix    species        samples  percentages
$functions    $KEGG               $abund         numeric matrix    KEGG ids       samples  abundances (reads)
                                  $bases         numeric matrix    KEGG ids       samples  abundances (bases)
                                  $tpm           numeric matrix    KEGG ids       samples  tpm
                                  $copy_number   numeric matrix    KEGG ids       samples  avg. copies
              $COG                $abund         numeric matrix    COG ids        samples  abundances (reads)
                                  $bases         numeric matrix    COG ids        samples  abundances (bases)
                                  $tpm           numeric matrix    COG ids        samples  tpm
                                  $copy_number   numeric matrix    COG ids        samples  avg. copies
              $PFAM               $abund         numeric matrix    PFAM ids       samples  abundances (reads)
                                  $bases         numeric matrix    PFAM ids       samples  abundances (bases)
                                  $tpm           numeric matrix    PFAM ids       samples  tpm
                                  $copy_number   numeric matrix    PFAM ids       samples  avg. copies
$total_reads                                     numeric vector    samples        (n/a)    total reads
$misc         $project_name                      character vector  (empty)        (n/a)    project name
              $samples                           character vector  (empty)        (n/a)    samples
              $tax_names_long     $superkingdom  character vector  short names    (n/a)    full names
                                  $phylum        character vector  short names    (n/a)    full names
                                  $class         character vector  short names    (n/a)    full names
                                  $order         character vector  short names    (n/a)    full names
                                  $family        character vector  short names    (n/a)    full names
                                  $genus         character vector  short names    (n/a)    full names
                                  $species       character vector  short names    (n/a)    full names
              $tax_names_short                   character vector  full names     (n/a)    short names
              $KEGG_names                        character vector  KEGG ids       (n/a)    KEGG names
              $KEGG_paths                        character vector  KEGG ids       (n/a)    KEGG hierarchy
              $COG_names                         character vector  COG ids        (n/a)    COG names
              $COG_paths                         character vector  COG ids        (n/a)    COG hierarchy
              $ext_annot_sources                 character vector  (empty)        (n/a)    external databases
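To make the nesting concrete, here is a minimal sketch that builds a mock list with the same layout and navigates it with ordinary list and matrix indexing. The object mock, its sample names and all its values are invented for illustration; a real object would come from loadSQMlite. The percent table is derived here as simple column-wise percentages, which is an assumption made for the mock and not a statement about how SQMtools computes it.

```r
# Mock object with the same nesting as a SQMlite object (all values are invented).
samples <- c("S1", "S2")
phylum_abund <- matrix(
  c(60, 30, 10,   # column S1
    20, 50, 30),  # column S2
  nrow = 3,
  dimnames = list(c("Bacteroidetes", "Firmicutes", "Proteobacteria"), samples)
)

mock <- list(
  taxa = list(
    phylum = list(
      abund = phylum_abund,
      # Column-wise percentages, computed here only for illustration.
      percent = 100 * sweep(phylum_abund, 2, colSums(phylum_abund), "/")
    )
  ),
  total_reads = c(S1 = 100, S2 = 100),
  misc = list(project_name = "mock", samples = samples)
)

# Rows are taxa and columns are samples, so [taxon, sample] indexing applies.
print(mock$taxa$phylum$abund["Firmicutes", "S2"])

# Most abundant phylum per sample, taken from the percent table.
top_phylum <- apply(mock$taxa$phylum$percent, 2, function(p) names(which.max(p)))
print(top_phylum)
```

The same access pattern should carry over to the other levels of the table, for example lvl1 $functions, lvl2 $KEGG, lvl3 $tpm.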

If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads via the -extdb argument, the corresponding abundance, tpm and copy number profiles will be present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy). Additionally, the extended names of the features present in the external database will be present in SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads will contain only read abundances, but not bases, tpm or copy number estimations.
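The lookup described above can be sketched with a mock object. Everything in it is hypothetical: the object mock, the feature ids GH13 and GT2, their abundances and their extended names are placeholders, and the mock imitates a reads-only result that carries abund but no bases, tpm or copy_number tables.

```r
# Mock object (invented values) imitating results that include a CAZy external database.
samples <- c("S1", "S2")
cazy_abund <- matrix(c(12, 3, 7, 9), nrow = 2,
                     dimnames = list(c("GH13", "GT2"), samples))
mock <- list(
  functions = list(
    CAZy = list(abund = cazy_abund)  # reads-only: no bases, tpm or copy_number
  ),
  misc = list(
    ext_annot_sources = "CAZy",
    CAZy_names = c(GH13 = "hypothetical name for GH13",
                   GT2  = "hypothetical name for GT2")
  )
)

for (db in mock$misc$ext_annot_sources) {
  tables <- mock$functions[[db]]
  # Report which profile tables are actually available for this database.
  available <- intersect(c("abund", "bases", "tpm", "copy_number"), names(tables))
  cat(db, "tables:", paste(available, collapse = ", "), "\n")
  # Extended feature names live in misc under "<database>_names".
  feature_names <- mock$misc[[paste0(db, "_names")]]
  print(feature_names[rownames(tables$abund)])
}
```

Checking names() before touching bases, tpm or copy_number avoids errors when the tables come from SqueezeMeta_reads.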

See Also

plotTaxonomy and plotFunctions will plot the most abundant taxa and functions in a SQMlite object. exportKrona will generate Krona charts reporting the taxonomy in a SQMlite object.

Examples

## Not run: 
## (outside R)
## Run SqueezeMeta on the test data.
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
## Generate the tabular outputs!
/path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables
## Now go into R.
library(SQMtools)
Hadza = loadSQMlite("Hadza/results/tables")
# "Hadza/results/tables" is the path to the tables directory generated by sqm2tables.py.
# Note that this is not the whole SQM project, just the directory containing the tables.
# It would also work with tables generated by sqmreads2tables.py or combine-sqm-tables.py.
plotTaxonomy(Hadza)
plotFunctions(Hadza)
exportKrona(Hadza, 'myKronaTest.html')

## End(Not run)

[Package SQMtools version 1.6.3 Index]