taxon_assign {DBTC} | R Documentation |
Assign Taxa using BLAST Results
Description
This function takes a BLAST result file and associated fasta files (either on their own or with accompanying ASV files generated from the dada_implement function) and collapses the multiple BLAST results into a single result for each query sequence. When an ASV table is present, the taxonomic results are combined with the ASV table.
Usage
taxon_assign(
fileLoc = NULL,
taxaDBLoc = NULL,
numCores = 1,
coverage = 95,
ident = 95,
propThres = 0.95,
coverReportThresh = 0,
identReportThresh = 0,
includeAllDada = TRUE,
verbose = TRUE
)
Arguments
fileLoc |
The location of a file in a directory where all of the paired fasta and BLAST (and potentially ASV) files are located (Default NULL). |
taxaDBLoc |
The location of the NCBI taxonomic database (Default NULL; for accessionTaxa.sql see the main DBTC page for details). |
numCores |
The number of cores used to run the function (Default 1; Windows systems can only use a single core). |
coverage |
The percent coverage threshold used for taxonomic assignment of above-threshold results (Default 95). |
ident |
The percent identity threshold used for taxonomic assignment of above-threshold results (Default 95). |
propThres |
The proportional threshold flags the final result based on the preponderance of the data. For example, with a threshold of 0.95, a result is flagged if the taxon directly below the assigned taxon accounts for less than 0.95 (95 percent) of the records causing the upward taxonomic placement (Default 0.95). |
coverReportThresh |
The percent coverage threshold used for reporting; results below this threshold are flagged (Default 0). |
identReportThresh |
The percent identity threshold used for reporting; results below this threshold are flagged (Default 0). |
includeAllDada |
When paired Dada ASV tables are present, setting this to FALSE excludes records without a taxonomic assignment (Default TRUE). |
verbose |
If set to TRUE, progress is reported to the R console; if FALSE, this reporting is suppressed (Default TRUE). |
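To make the interplay of coverage, ident, and propThres more concrete, here is a minimal sketch in base R. It follows one reading of the argument descriptions above and is not the package's internal code. The data frame `hits`, its column names, and the choice of `>=` for the threshold comparison are assumptions made for illustration.

```r
# Hypothetical BLAST hits for one query sequence.
# The column names are assumptions, not DBTC's actual file layout.
hits <- data.frame(
  species  = c("Salmo salar", "Salmo salar", "Salmo trutta", "Salmo salar"),
  coverage = c(99, 98, 97, 90),
  identity = c(99.5, 97.0, 96.2, 99.0),
  stringsAsFactors = FALSE
)

coverage  <- 95    # same value as the taxon_assign default
ident     <- 95    # same value as the taxon_assign default
propThres <- 0.95  # same value as the taxon_assign default

# Keep only hits that meet both thresholds (assumed inclusive comparison).
above <- hits[hits$coverage >= coverage & hits$identity >= ident, ]

# Share of the above-threshold records held by the most common species.
shares    <- table(above$species) / nrow(above)
top_share <- max(shares)

# Flag the result when no single species reaches the proportional threshold.
flagged <- top_share < propThres
```

In this toy input three hits pass the thresholds and the most common species holds two of them, so the result would be flagged under the assumed rule.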
Details
This function requires a BLAST output file and an associated fasta file. If an ASV file is also present, it is combined with the taxonomic results. The BLAST results are reduced to a single result for each read. At each taxonomic level there may be one or more taxonomic assignments. Each assignment has quality metrics in parentheses after the name. These values ("Num_Rec", "Coverage", "Identity", "Max_eVal") represent the number of records with this taxonomic placement, the minimum coverage, the minimum identity, and the maximum eValue for the reported taxa.
The examples are included to show the syntax of the function. They are not run because the function requires input files; in some cases multiple files are necessary, and some of these are quite large. For specific examples, please see https://github.com/rgyoung6/DBTCShinyTutorial/blob/main/README.md
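Separately from the package examples, the quality metrics described in the Details can be illustrated with a small base R sketch. The data frame `hits`, its column names, the helper `summarise_taxon`, and the label format are all hypothetical. The sketch only shows how a record count, minimum coverage, minimum identity, and maximum eValue could be derived per taxon; it does not reproduce the actual output of taxon_assign.

```r
# Hypothetical above-threshold hits for a single query sequence.
# Column names are assumptions made for this illustration.
hits <- data.frame(
  genus    = c("Salmo", "Salmo", "Salmo", "Oncorhynchus"),
  coverage = c(99, 98, 97, 96),
  identity = c(99.5, 97.0, 96.2, 95.1),
  evalue   = c(1e-50, 1e-45, 1e-40, 1e-38),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise the records of one taxon.
summarise_taxon <- function(d) {
  data.frame(
    taxon    = d$genus[1],
    Num_Rec  = nrow(d),          # number of records with this placement
    Coverage = min(d$coverage),  # minimum coverage
    Identity = min(d$identity),  # minimum identity
    Max_eVal = max(d$evalue),    # maximum eValue
    stringsAsFactors = FALSE
  )
}

# One summary row per genus (split orders the groups alphabetically).
summary_tab <- do.call(rbind, lapply(split(hits, hits$genus), summarise_taxon))

# Assumed label layout: name followed by the metrics in parentheses.
labels <- sprintf("%s(%d,%g,%g,%g)",
                  summary_tab$taxon, summary_tab$Num_Rec,
                  summary_tab$Coverage, summary_tab$Identity,
                  summary_tab$Max_eVal)
```

Taking the minimum coverage and identity and the maximum eValue reports the weakest supporting record for each taxon, which matches the conservative reading of the metrics given in the Details.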
Value
This function produces a taxa_reduced file for each BLAST-fasta file pair submitted.
Note
WARNING - NO WHITESPACE!
When running DBTC functions, the paths for the selected files cannot contain white space! File folder locations should be as short as possible (close to the root), as some functions do not process long naming conventions.
Also, special characters should be avoided (including question mark, number sign, exclamation mark). It is recommended that dashes be used for separations in naming conventions while retaining underscores for use as information delimiters (this is how DBTC functions use underscore).
Several key character strings are used in the DBTC pipeline; the presence of these strings in file or folder names will cause errors when running DBTC functions.
The following strings are those used in DBTC and should not be used in file or folder naming:
- _BLAST
- _combinedDada
- _taxaAssign
- _taxaAssignCombined
- _taxaReduced
- _CombineTaxaReduced
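The naming rules in this Note can be checked before running the pipeline. The sketch below is a hypothetical helper in base R, not a DBTC function. It is meant for folder and file names you choose yourself, and it checks only the three special characters named above.

```r
# Strings reserved by the DBTC pipeline, as listed in this Note.
reserved <- c("_BLAST", "_combinedDada", "_taxaAssign",
              "_taxaAssignCombined", "_taxaReduced", "_CombineTaxaReduced")

# Hypothetical helper (not part of DBTC): returns a character vector
# describing each naming problem found in a path, or character(0) if none.
check_dbtc_path <- function(path) {
  problems <- character(0)
  if (grepl("[[:space:]]", path)) {
    problems <- c(problems, "contains white space")
  }
  if (grepl("[?#!]", path)) {
    problems <- c(problems, "contains a special character (?, # or !)")
  }
  # Literal (fixed) matching of each reserved string against the path.
  hit <- reserved[vapply(reserved, grepl, logical(1), x = path, fixed = TRUE)]
  if (length(hit) > 0) {
    problems <- c(problems,
                  paste("contains reserved string:", paste(hit, collapse = ", ")))
  }
  problems
}
```

For instance, `check_dbtc_path("C:/data/run-01")` returns an empty character vector, while a path such as "C:/my data/run_BLAST!" yields three problems, one for each rule it breaks.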
Author(s)
Robert G. Young
References
<https://github.com/rgyoung6/DBTC> Young, R. G., Hanner, R. H. (Submitted October 2023). Dada-BLAST-Taxon Assign-Condense Shiny Application (DBTCShiny). Biodiversity Data Journal.
See Also
dada_implement() combine_dada_output() make_BLAST_DB() seq_BLAST() combine_assign_output() reduce_taxa() combine_reduced_output()
Examples
## Not run:
taxon_assign()
taxon_assign(fileLoc = NULL, taxaDBLoc = NULL, numCores = 1, coverage = 95,
ident = 95, propThres = 0.95, coverReportThresh = 0, identReportThresh = 0,
includeAllDada = TRUE)
## End(Not run)