loadSQMlite {SQMtools} | R Documentation |
Load tables generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py into R.
Description
This function takes the path to the output directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py and returns a SQMlite object.
The SQMlite object will contain taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. However, it will also have a much smaller memory footprint.
A SQMlite object can be used for plotting and exporting, but it cannot be subsetted.
Usage
loadSQMlite(tables_path, tax_mode = "allfilter")
Arguments
tables_path |
character, tables directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py. |
tax_mode |
character, which taxonomic classification should be loaded? SqueezeMeta applies the identity thresholds described in Luo et al., 2014. Use |
Value
SQMlite object containing the parsed tables.
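A defensive call could be sketched as follows. The wrapper name `load_tables_checked` is hypothetical and not part of SQMtools; it only checks that the directory exists and that the package is installed, then calls loadSQMlite with the arguments shown under Usage.

```r
# Hypothetical helper (not part of SQMtools): validate inputs before loading.
load_tables_checked <- function(tables_path, tax_mode = "allfilter") {
  if (!is.character(tables_path) || length(tables_path) != 1) {
    stop("tables_path must be a single character string")
  }
  if (!dir.exists(tables_path)) {
    stop("tables directory not found: ", tables_path)
  }
  if (!requireNamespace("SQMtools", quietly = TRUE)) {
    stop("the SQMtools package is not installed")
  }
  # Same arguments as in the Usage section.
  SQMtools::loadSQMlite(tables_path, tax_mode = tax_mode)
}

# Example call, assuming the tables were written to Hadza/results/tables:
# Hadza <- load_tables_checked("Hadza/results/tables")
```

Failing early on a missing directory tends to give a clearer message than an error raised while individual table files are being read.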
The SQMlite object structure
The SQMlite object is a nested list which contains the following information:
lvl1 | lvl2 | lvl3 | type | rows/names | columns | data |
$taxa | $superkingdom | $abund | numeric matrix | superkingdoms | samples | abundances |
 |  | $percent | numeric matrix | superkingdoms | samples | percentages |
 | $phylum | $abund | numeric matrix | phyla | samples | abundances |
 |  | $percent | numeric matrix | phyla | samples | percentages |
 | $class | $abund | numeric matrix | classes | samples | abundances |
 |  | $percent | numeric matrix | classes | samples | percentages |
 | $order | $abund | numeric matrix | orders | samples | abundances |
 |  | $percent | numeric matrix | orders | samples | percentages |
 | $family | $abund | numeric matrix | families | samples | abundances |
 |  | $percent | numeric matrix | families | samples | percentages |
 | $genus | $abund | numeric matrix | genera | samples | abundances |
 |  | $percent | numeric matrix | genera | samples | percentages |
 | $species | $abund | numeric matrix | species | samples | abundances |
 |  | $percent | numeric matrix | species | samples | percentages |
$functions | $KEGG | $abund | numeric matrix | KEGG ids | samples | abundances (reads) |
 |  | $bases | numeric matrix | KEGG ids | samples | abundances (bases) |
 |  | $tpm | numeric matrix | KEGG ids | samples | tpm |
 |  | $copy_number | numeric matrix | KEGG ids | samples | avg. copies |
 | $COG | $abund | numeric matrix | COG ids | samples | abundances (reads) |
 |  | $bases | numeric matrix | COG ids | samples | abundances (bases) |
 |  | $tpm | numeric matrix | COG ids | samples | tpm |
 |  | $copy_number | numeric matrix | COG ids | samples | avg. copies |
 | $PFAM | $abund | numeric matrix | PFAM ids | samples | abundances (reads) |
 |  | $bases | numeric matrix | PFAM ids | samples | abundances (bases) |
 |  | $tpm | numeric matrix | PFAM ids | samples | tpm |
 |  | $copy_number | numeric matrix | PFAM ids | samples | avg. copies |
$total_reads |  |  | numeric vector | samples | (n/a) | total reads |
$misc | $project_name |  | character vector | (empty) | (n/a) | project name |
 | $samples |  | character vector | (empty) | (n/a) | samples |
 | $tax_names_long | $superkingdom | character vector | short names | (n/a) | full names |
 |  | $phylum | character vector | short names | (n/a) | full names |
 |  | $class | character vector | short names | (n/a) | full names |
 |  | $order | character vector | short names | (n/a) | full names |
 |  | $family | character vector | short names | (n/a) | full names |
 |  | $genus | character vector | short names | (n/a) | full names |
 |  | $species | character vector | short names | (n/a) | full names |
 | $tax_names_short |  | character vector | full names | (n/a) | short names |
 | $KEGG_names |  | character vector | KEGG ids | (n/a) | KEGG names |
 | $KEGG_paths |  | character vector | KEGG ids | (n/a) | KEGG hierarchy |
 | $COG_names |  | character vector | COG ids | (n/a) | COG names |
 | $COG_paths |  | character vector | COG ids | (n/a) | COG hierarchy |
 | $ext_annot_sources |  | character vector | (empty) | (n/a) | external databases |
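Because the object is a plain nested list, its elements can be reached with the usual `$` and matrix indexing. The sketch below builds a small mock list that mimics part of the structure in the table; the taxa names, counts and project name are made up for illustration, and the percentages are computed here simply as column-wise percentages of the mock counts. A real object would come from loadSQMlite.

```r
# Mock object mimicking a small part of the SQMlite structure (made-up values).
abund <- matrix(c(10, 30, 60, 20, 20, 60), nrow = 3,
                dimnames = list(c("Firmicutes", "Bacteroidetes", "Proteobacteria"),
                                c("S1", "S2")))
SQM <- list(
  taxa = list(phylum = list(
    abund   = abund,
    percent = 100 * sweep(abund, 2, colSums(abund), "/"))),
  total_reads = c(S1 = 100, S2 = 100),
  misc = list(project_name = "mock", samples = c("S1", "S2"))
)

# Rows are taxa, columns are samples.
SQM$taxa$phylum$abund["Bacteroidetes", "S1"]

# Most abundant phylum in each sample, taken from the percentage matrix.
pct <- SQM$taxa$phylum$percent
top <- rownames(pct)[apply(pct, 2, which.max)]
names(top) <- SQM$misc$samples
top
```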
If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads via the -extdb argument, the corresponding abundance, tpm and copy number profiles will be present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy). Additionally, the extended names of the features present in the external database will be present in SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads will contain only read abundances, but not bases, tpm or copy number estimations.
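Since the set of available profiles depends on how the tables were generated, it can be useful to check which ones are present before using them. The following is a minimal sketch on a mock `functions` list that imitates results from SqueezeMeta_reads (read abundances only) with CAZy as an external database; the feature ids and counts are invented, and `available_metrics` is a hypothetical helper, not an SQMtools function.

```r
# Mock $functions element imitating SqueezeMeta_reads output (made-up values).
mock_matrix <- function(ids) {
  matrix(1, nrow = length(ids), ncol = 2, dimnames = list(ids, c("S1", "S2")))
}
functions <- list(
  KEGG = list(abund = mock_matrix(c("K00001", "K00002"))),
  CAZy = list(abund = mock_matrix(c("GH13", "GT2")))
)

# Hypothetical helper: report which profiles exist for each annotation source.
available_metrics <- function(functions) {
  metrics <- c("abund", "bases", "tpm", "copy_number")
  res <- sapply(functions, function(x) metrics %in% names(x))
  rownames(res) <- metrics
  res
}

available <- available_metrics(functions)
available
```

In this mock only the `abund` row is TRUE for both sources, which matches the note above about SqueezeMeta_reads results.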
See Also
plotBars and plotFunctions will plot the most abundant taxa and functions in a SQMlite object. exportKrona will generate Krona charts reporting the taxonomy in a SQMlite object.
Examples
## Not run:
## (outside R)
## Run SqueezeMeta on the test data.
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
## Generate the tabular outputs!
/path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables
## Now go into R.
library(SQMtools)
Hadza = loadSQMlite("Hadza/results/tables")
# Here Hadza is the SqueezeMeta output directory, and Hadza/results/tables is the
# directory containing the tables. Note that loadSQMlite needs only the tables, not the whole SQM project.
# It would also work with tables generated by sqmreads2tables.py or combine-sqm-tables.py.
plotTaxonomy(Hadza)
plotFunctions(Hadza)
exportKrona(Hadza, 'myKronaTest.html')
## End(Not run)