blast_local {disprose}    R Documentation

Local BLAST

Description

Performs a nucleotide BLAST search against a local database.

Usage

blast_local(
  probe.var,
  probe.id.var = NULL,
  fasta.way = NULL,
  blastn.way = NULL,
  db.way = NULL,
  out.way = NULL,
  mc.cores = 1,
  add.query.info = FALSE,
  temp.db = NULL,
  delete.files = FALSE,
  eval = 1000,
  ws = 7,
  reward = 1,
  penalty = -3,
  gapopen = 5,
  gapextend = 2,
  maxtargetseqs = 500,
  verbose = TRUE
)

Arguments

probe.var

character; vector of query nucleotide sequences

probe.id.var

vector of identification numbers for query sequences

fasta.way

character; name and path to FASTA file

blastn.way

character; name and path to blastn executable file

db.way

character; name and path to local BLAST database

out.way

character; name and path to blastn output file

mc.cores

integer; number of processors for parallel computation (not supported on Windows)

add.query.info

logical; add the query nucleotide sequence and its length to the result

temp.db

character; name and path to temporary SQLite database

delete.files

logical; delete created FASTA and output files

eval

integer; expect value (E-value) threshold for saving hits

ws

integer; word size (length of the initial exact match)

reward

integer; reward for a nucleotide match

penalty

integer; penalty for a nucleotide mismatch

gapopen

integer; cost to open a gap

gapextend

integer; cost to extend a gap

maxtargetseqs

integer; number of aligned sequences to keep

verbose

logical; show messages

Details

This function requires the BLAST+ executables (blastn) to be installed and a local nucleotide database to be created beforehand.
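The sketch below shows one way the search parameters could be passed to blastn on the command line. The helper build_blastn_args() is hypothetical and the mapping of blast_local() arguments to flags is an assumption, not the package's internal code; the flags themselves are standard blastn options.

```r
# Hypothetical helper: assembles blastn command-line arguments from values
# named like the blast_local() parameters. The mapping is an assumption.
build_blastn_args <- function(fasta.way, db.way, out.way,
                              eval = 1000, ws = 7, reward = 1, penalty = -3,
                              gapopen = 5, gapextend = 2, maxtargetseqs = 500) {
  # c() coerces the numeric values to character strings
  c("-query", shQuote(fasta.way),
    "-db", shQuote(db.way),
    "-out", shQuote(out.way),
    "-evalue", eval,
    "-word_size", ws,
    "-reward", reward,
    "-penalty", penalty,
    "-gapopen", gapopen,
    "-gapextend", gapextend,
    "-max_target_seqs", maxtargetseqs)
}

args <- build_blastn_args("blast.fasta", "DB", "blast.out", eval = 1)
# With a real installation, system2(blastn.way, args) would start the search.
```

Building the argument vector separately from the call makes it easy to print and inspect the exact command before running it.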

While running, the function creates a blastn input FASTA file and an output file. Existing files with the same names are overwritten. Set delete.files = TRUE to delete these files when the function finishes.
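As an illustration of the input file, the following sketch writes query sequences to a FASTA file in base R. The helper write_probe_fasta() is hypothetical and only approximates what the function does internally.

```r
# Hypothetical helper, not the package's internal code
write_probe_fasta <- function(probe.var, probe.id.var, fasta.way) {
  # Interleave ">id" header lines with the sequence lines
  lines <- as.vector(rbind(paste0(">", probe.id.var), probe.var))
  writeLines(lines, fasta.way)  # an existing file is overwritten
  invisible(fasta.way)
}

fasta <- tempfile(fileext = ".fasta")
write_probe_fasta(c("acgt", "ttga"), 1:2, fasta)
fasta_lines <- readLines(fasta)
unlink(fasta)  # clean up, comparable to delete.files = TRUE
```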

If no probe.id.var is provided, query sequences are numbered in order, starting with 1.

Query cover is the query coverage per HSP (high-scoring segment pair), expressed as a percentage.
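Query coverage per HSP can be approximated from the alignment coordinates in the query and the query length. The function and variable names below are hypothetical, and blastn may round the reported value differently.

```r
# Hypothetical helper: share of the query covered by one HSP, in percent
query_cover_hsp <- function(q.start, q.end, q.length) {
  100 * (q.end - q.start + 1) / q.length
}

# An HSP spanning positions 1 to 20 of a 24-nucleotide probe
cover <- query_cover_hsp(q.start = 1, q.end = 20, q.length = 24)
```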

If add.query.info = TRUE, the function saves data in a temporary SQLite database. The function stops if a database with the same name already exists, so deleting the temporary database (by setting delete.files = TRUE) is highly recommended.
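If an earlier run was interrupted, a leftover database file can be removed before calling the function. A minimal sketch, assuming temp.db points to an ordinary file:

```r
temp.db <- file.path(tempdir(), "temp.db")
file.create(temp.db)  # simulate a leftover from an interrupted run

# Remove the old database so the next run does not stop
if (file.exists(temp.db)) {
  unlink(temp.db)
}
leftover <- file.exists(temp.db)
```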

The error "no lines available in input" is returned when no BLAST results match the specified parameters. In that case, adjust the BLAST parameters (for example, increase eval).
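When the function is called in a loop or pipeline, this error can be caught so that an empty result does not stop the run. The wrapper safe_blast() below is hypothetical. It takes any zero-argument function that performs the search; the demonstration uses read.table() on an empty file as a stand-in that fails with the same message.

```r
# Hypothetical wrapper: returns NULL when the search produced no hits
safe_blast <- function(run_blast) {
  tryCatch(run_blast(), error = function(e) {
    if (grepl("no lines available in input", conditionMessage(e), fixed = TRUE)) {
      message("No BLAST hits for the given parameters")
      return(NULL)
    }
    stop(e)  # re-raise any other error
  })
}

# Stand-in for a search without hits: reading an empty output file
empty <- tempfile()
file.create(empty)
res <- safe_blast(function() read.table(empty))
unlink(empty)
```

In real use the argument would be something like function() blast_local(...), with the arguments of the actual search.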

Value

Data frame with BLAST alignments: query sequence id, start and end of alignment in query, subject GI, accession, title and taxon id, start and end of alignment in subject, length of alignment, number of mismatches and gaps, number of identical matches, raw score, bit score, expect value and query cover. If add.query.info = TRUE, the query sequence and its length are also added to the data frame.

Author(s)

Elena N. Filatova

References

Camacho C., Coulouris G., Avagyan V. et al. (2009). BLAST+: architecture and applications. BMC Bioinformatics 10, 421. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-421.

Examples

## Not run: 
# This function uses BLAST applications. BLAST+ should be installed.
# A local nucleotide database should be created beforehand.
# Here a local database of Chlamydia pneumoniae target sequences was created
# in the temporary directory previously (see make_blast_DB () function)
path <- tempdir()
# tempdir() normally exists already; create it only if it is missing
if (!dir.exists (path)) dir.create (path)
# set probes for local BLAST
probes <- c ("catctctatttcggtagcagctcc", "aaagtcatagaaaagcctgtagtcgc",
            "ccttcttctcgaactctgaagtacact", "aaaaaaaaaaaaaaaaa", "acacacacacacaac")
blast.raw <- blast_local(probe.var = probes, probe.id.var = NULL,
                        fasta.way = paste0 (path, "/blast.fasta"),
                        blastn.way = "D:/Blast/blast-2.11.0+/bin/blastn.exe",
                        db.way = paste0 (path, "/DB"),
                        out.way = paste0 (path, "/blast.out"),
                        mc.cores=1, add.query.info = TRUE, temp.db = paste0 (path, "/temp.db"),
                        delete.files = TRUE, eval = 1, maxtargetseqs = 200)

## End(Not run)


[Package disprose version 0.1.6 Index]