taxon_assign {DBTC}    R Documentation

Assign Taxa using BLAST Results

Description

This function takes a BLAST result file and its associated fasta file (either on their own or with an accompanying ASV file generated by the dada_implement function) and collapses the multiple BLAST results into a single result for each query sequence. When an ASV table is present, the taxonomic results are combined with the ASV table.

Usage

taxon_assign(
  fileLoc = NULL,
  taxaDBLoc = NULL,
  numCores = 1,
  coverage = 95,
  ident = 95,
  propThres = 0.95,
  coverReportThresh = 0,
  identReportThresh = 0,
  includeAllDada = TRUE,
  verbose = TRUE
)

Arguments

fileLoc

The location of a file in a directory where all of the paired fasta and BLAST (and potentially ASV) files are located (Default NULL).

taxaDBLoc

The location of the NCBI taxonomic database (Default NULL; for accessionTaxa.sql see the main DBTC page for details).

numCores

The number of cores used to run the function (Default 1; Windows systems can only use a single core).

coverage

The percent coverage used for taxonomic assignment of the above-threshold results (Default 95).

ident

The percent identity used for taxonomic assignment of the above-threshold results (Default 95).

propThres

The proportional threshold used to flag the final result based on the preponderance of the data. For example, with a threshold of 0.95, a result is flagged if the taxon directly below the assigned taxon accounts for less than 0.95 (that is, 95 percent) of the records that caused the upward taxonomic placement (Default 0.95).

coverReportThresh

The percent coverage threshold used for reporting; results below this threshold are flagged (Default 0).

identReportThresh

The percent identity threshold used for reporting; results below this threshold are flagged (Default 0).

includeAllDada

When paired Dada ASV tables are present, setting this to FALSE excludes records without a taxonomic assignment (Default TRUE).

verbose

If set to TRUE, progress is reported to the R console; if FALSE, this reporting is suppressed (Default TRUE).
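
The propThres description above can be read as a simple proportion check. The following is a minimal sketch of one such reading, not DBTC's internal code: the function name flag_by_proportion, its return structure, and the example taxon names are hypothetical and used only for illustration.

```r
# Hypothetical illustration of a proportional threshold check.
# This is an assumed reading of the propThres description, not DBTC code.
flag_by_proportion <- function(lower_taxa, propThres = 0.95) {
  # Count the records assigned to each taxon at the level below the placement
  counts <- table(lower_taxa)
  # Proportion of records held by the most frequent taxon
  top_prop <- max(counts) / sum(counts)
  list(
    top_taxon = names(counts)[which.max(counts)],
    top_prop  = as.numeric(top_prop),
    flagged   = as.numeric(top_prop) < propThres
  )
}

# Made-up records: 18 hits to one species and 2 to another
hits <- c(rep("Salmo salar", 18), rep("Salmo trutta", 2))
res <- flag_by_proportion(hits, propThres = 0.95)
print(res$flagged)
```

In this made-up case the most frequent taxon holds 18 of 20 records, which is below the 0.95 threshold, so the result would be flagged under this reading.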

Details

This function requires a BLAST output file and an associated fasta file. If an ASV file is also present, it is combined with the taxonomic results. The BLAST results are reduced to a single result for each read. At each taxonomic level there may be one or more taxonomic assignments. Each assignment has quality metrics in parentheses after the name. These values ("Num_Rec", "Coverage", "Identity", "Max_eVal") represent the number of records with this taxonomic placement, the minimum coverage, the minimum identity, and the maximum eValue for the reported taxa.

The examples are provided to show the syntax of the function. They are not run because the function requires input files; in some cases multiple files are necessary, and some of these are quite large. For specific worked examples, please see https://github.com/rgyoung6/DBTCShinyTutorial/blob/main/README.md

Value

This function produces a taxa_reduced file for each submitted BLAST-fasta submission.

Note

WARNING - NO WHITESPACE!

When running DBTC functions, the paths for the selected files cannot contain whitespace! File folder locations should be as short as possible (close to the root), as some functions do not process long naming conventions.

Also, special characters should be avoided (including question mark, number sign, exclamation mark). It is recommended that dashes be used for separations in naming conventions while retaining underscores for use as information delimiters (this is how DBTC functions use underscore).

There are several key character strings used in the DBTC pipeline. The presence of these strings in file or folder names will cause errors when running DBTC functions.

The following strings are used in DBTC and should not be used in file or folder naming:

- _BLAST
- _combinedDada
- _taxaAssign
- _taxaAssignCombined
- _taxaReduced
- _CombineTaxaReduced
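
The naming rules in this note can be checked before running a DBTC function. The following is a minimal sketch; check_dbtc_path is a hypothetical helper that is not part of DBTC, and it is assumed here to be applied to user-chosen file and folder names only.

```r
# Hypothetical helper, not part of DBTC: reports which of the naming rules
# in this note a user-chosen path would break.
check_dbtc_path <- function(path) {
  reserved <- c("_BLAST", "_combinedDada", "_taxaAssign", "_taxaAssignCombined",
                "_taxaReduced", "_CombineTaxaReduced")
  problems <- character(0)

  # Rule 1: no whitespace anywhere in the path
  if (grepl("[[:space:]]", path)) {
    problems <- c(problems, "contains whitespace")
  }
  # Rule 2: avoid the special characters named in the note
  if (grepl("[?#!]", path)) {
    problems <- c(problems, "contains a special character (?, #, !)")
  }
  # Rule 3: avoid the reserved DBTC strings (literal, case-sensitive match)
  hits <- reserved[vapply(reserved,
                          function(s) grepl(s, path, fixed = TRUE),
                          logical(1))]
  if (length(hits) > 0) {
    problems <- c(problems,
                  paste("contains reserved string:", paste(hits, collapse = ", ")))
  }
  problems
}

# A made-up path that breaks two rules
print(check_dbtc_path("C:/my data/run_BLAST/x.fas"))
```

An empty result means none of the three checks found a problem. The helper does not check path length, which the note also recommends keeping short.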

Author(s)

Robert G. Young

References

<https://github.com/rgyoung6/DBTC>

Young, R. G., Hanner, R. H. (Submitted October 2023). Dada-BLAST-Taxon Assign-Condense Shiny Application (DBTCShiny). Biodiversity Data Journal.

See Also

dada_implement(), combine_dada_output(), make_BLAST_DB(), seq_BLAST(), combine_assign_output(), reduce_taxa(), combine_reduced_output()

Examples

## Not run: 
taxon_assign()
taxon_assign(fileLoc = NULL, taxaDBLoc = NULL, numCores = 1, coverage = 95,
  ident = 95, propThres = 0.95, coverReportThresh = 0, identReportThresh = 0,
  includeAllDada = TRUE)

## End(Not run)
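
A fuller call might look like the following sketch. Both paths are hypothetical placeholders to be replaced with real locations, and the choice of core count is an assumption based on the numCores description above. The call only runs if DBTC is installed and both files exist.

```r
# Hypothetical paths: replace with real locations (no whitespace in the path).
file_loc <- "/data/dbtc-run/example.fas"
taxa_db  <- "/data/dbtc-db/accessionTaxa.sql"

# Windows systems can only use a single core; elsewhere leave one core free.
# detectCores() can return NA, so na.rm = TRUE falls back to 1.
num_cores <- if (.Platform$OS.type == "windows") {
  1
} else {
  max(1, parallel::detectCores() - 1, na.rm = TRUE)
}

# Argument names and defaults are those listed under Usage
args <- list(
  fileLoc = file_loc,
  taxaDBLoc = taxa_db,
  numCores = num_cores,
  coverage = 95,
  ident = 95,
  propThres = 0.95,
  coverReportThresh = 0,
  identReportThresh = 0,
  includeAllDada = TRUE,
  verbose = TRUE
)

# Guarded call: skipped unless the package and both input files are available
if (requireNamespace("DBTC", quietly = TRUE) &&
    file.exists(file_loc) && file.exists(taxa_db)) {
  do.call(DBTC::taxon_assign, args)
}
```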


[Package DBTC version 0.1.0 Index]