fill_blast_result {disprose} | R Documentation |
Complement BLAST result
Description
Adds subject GenInfo Identifiers (GI) to a BLAST alignment result that does not contain them.
Usage
fill_blast_results(
blast.result,
AcNum.column.name = "Racc",
GI.column.name = "Rgi",
delete.version = FALSE,
version.sep = ".",
add.gi = "DB",
add.gi.df,
temp.db = NULL,
delete.temp = FALSE,
add.gi.db = NULL,
add.gi.table = NULL,
add.gi.ac.column.name = "AC",
add.gi.gi.column.name = "GI",
mc.cores = 1,
verbose = TRUE
)
delete_AcNum_version(ac.num.var, version.sep = ".", mc.cores = 1)
Arguments
blast.result |
data frame; BLAST alignment result |
AcNum.column.name , GI.column.name |
character; names of the columns in the BLAST result data frame that hold subject accession numbers and GenInfo Identifier numbers, respectively |
delete.version |
logical; remove version suffix from subject accession number |
version.sep |
character; accession number and version suffix separator (a dot for NCBI accession numbers) |
add.gi |
character; source of the table linking accession and GI numbers: SQLite database ("DB") or data frame ("DF") |
add.gi.df |
data frame with linked accession and GI numbers (used if add.gi = "DF") |
temp.db |
character; temporary SQLite database name and path |
delete.temp |
logical; delete the temporary SQLite database after use |
add.gi.db , add.gi.table , add.gi.ac.column.name , add.gi.gi.column.name |
SQLite database name and path, table name, and names of the columns with accession and GI numbers (used if add.gi = "DB") |
mc.cores |
integer; number of processors for parallel computation (not supported on Windows) |
verbose |
logical; show messages |
ac.num.var |
vector of accession numbers |
Details
A BLAST alignment performed against a local database may lack subject GI information, and subject accession numbers may carry a version suffix. Both can complicate further analysis of the results. This function adds subject GIs and, if requested, removes the version suffix from subject accession numbers.
To add GenInfo Identifiers, a table linking them to accession numbers must be provided, either as a data frame or as an SQLite database table.
add.gi.df
must be a data frame with accession numbers in the first column and GenInfo Identifier numbers in the second.
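The two-column layout described above can be pictured as a simple lookup. The following is a minimal base R sketch, not the package's implementation: the accession and GI values are made up for illustration, and the column names Racc and Rgi are just the defaults from the Usage section. In this sketch an accession with no match ends up as NA; how fill_blast_results itself treats unmatched accessions is not stated here.

```r
# Illustrative BLAST result with made-up subject accession numbers
blast <- data.frame(Racc = c("AB000001", "AB000002", "AB000009"),
                    stringsAsFactors = FALSE)

# Lookup table: column one holds accession numbers, column two holds GI numbers
# (made-up values, for illustration only)
acc_gi <- data.frame(AC = c("AB000002", "AB000001"),
                     GI = c(222, 111),
                     stringsAsFactors = FALSE)

# Find each subject accession in column one and take the GI from column two
blast$Rgi <- acc_gi[[2]][match(blast$Racc, acc_gi[[1]])]

print(blast)
```

Because the lookup goes by column position rather than column name, the order of the two columns in the data frame matters.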
If add.gi = "DF",
a temporary SQLite database is created.
The SQLite database table with accession and GI numbers should not contain duplicated rows. Indexing the accession number column in the database is also highly recommended.
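One way to prepare such a table is sketched below with the DBI and RSQLite packages. This is an assumption about tooling, not something the package requires: the file name acc_gi.db, the table name acc_gi, the index name idx_acc_gi_ac and the data values are all hypothetical. The column names AC and GI follow the defaults of add.gi.ac.column.name and add.gi.gi.column.name.

```r
library(DBI)

# Made-up accession/GI pairs containing one duplicated row
acc_gi <- data.frame(AC = c("AB000001", "AB000002", "AB000001"),
                     GI = c(111, 222, 111),
                     stringsAsFactors = FALSE)

# Drop duplicated rows before writing the table
acc_gi <- unique(acc_gi)

# Hypothetical database file in the session's temporary directory
db_path <- file.path(tempdir(), "acc_gi.db")
con <- dbConnect(RSQLite::SQLite(), db_path)

# Write the table and index the accession number column
dbWriteTable(con, "acc_gi", acc_gi, overwrite = TRUE)
dbExecute(con, "CREATE INDEX IF NOT EXISTS idx_acc_gi_ac ON acc_gi (AC)")

dbDisconnect(con)
```

A database prepared this way would presumably be passed with add.gi = "DB", add.gi.db = db_path and add.gi.table = "acc_gi".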
delete.version
is applied first, so if you use this option, the accession numbers in the table supplied for add.gi must not contain a version suffix.
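To make the suffix removal concrete, here is a minimal base R sketch. The helper strip_version is hypothetical and is not the package's delete_AcNum_version. It assumes the separator is a dot and that the version is the trailing run of digits, as in NCBI accession numbers.

```r
# Hypothetical helper: drop a trailing ".<digits>" version suffix.
# Accession numbers without such a suffix are returned unchanged.
strip_version <- function(ac) {
  sub("\\.[0-9]+$", "", ac)
}

strip_version(c("NC_045512.2", "MN908947.3", "AB000001"))
# returns "NC_045512" "MN908947" "AB000001"
```

The same transformation can be applied to the accession column of a lookup table so that it matches a BLAST result processed with delete.version = TRUE.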
AcNum.column.name, GI.column.name, add.gi.ac.column.name and add.gi.gi.column.name
must be given exactly as the column names appear in the data frame.
Value
blast.result
data frame with GI added and, if requested, the accession version suffix removed.
Functions
- fill_blast_results: Provides subjects' GenInfo Identifiers if the BLAST alignment result does not contain them
- delete_AcNum_version: Removes the accession version suffix
Author(s)
Elena N. Filatova
Examples
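library(disprose)
# session temporary directory (already exists, so no dir.create() is needed)
path <- tempdir()
# load raw BLAST results
data(blast.raw)
# load meta.target with the targets' sequence GI and accession numbers
data(meta.target)
# add GI from a data frame (accession numbers first, GI second)
# and remove the version suffix from subject accession numbers
blast.fill <- fill_blast_results(blast.result = blast.raw, delete.version = TRUE,
                                 add.gi = "DF", add.gi.df = meta.target[, c("GB_AcNum", "gi")],
                                 temp.db = file.path(path, "temp.db"), delete.temp = TRUE)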