get_seq_for_DB {disprose} | R Documentation |
Get nucleotide sequences from NCBI
Description
Retrieves nucleotide sequences from NCBI for given identification numbers.
Usage
get_seq_for_DB(
ids,
db,
check.result = FALSE,
return = "data.frame",
fasta.file = NULL,
exclude.from.download = FALSE,
exclude.var,
exclude.pattern,
exclude.fixed = TRUE,
verbose = TRUE
)
get_seq_for_DB_fix(res.data, db, verbose = TRUE)
Arguments
ids |
vector of NCBI sequences' identification numbers: GenBank accession numbers, GenInfo identifiers (GI) or Entrez unique identifiers (UID) |
db |
character; NCBI database for search. See entrez_dbs() for possible values |
check.result |
logical; check if download was done correctly |
return |
character; type of the returned object; possible values are "vector", "data.frame" and "fasta" |
fasta.file |
character; FASTA file name and path, only used if return = "fasta" |
exclude.from.download |
logical; ignore some sequences while downloading |
exclude.var |
vector that is used to define which sequences should be ignored, only used if exclude.from.download = TRUE |
exclude.pattern |
value that matches to exclude.var |
exclude.fixed |
logical; match exclude.pattern as is |
verbose |
logical; show messages |
res.data |
data.frame; data frame of nucleotide ids and previously downloaded sequences |
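The exclude.* arguments work together: a variable describing the sequences is checked against a pattern to decide which ids to skip. The following is a minimal base-R sketch of that kind of filtering, not the package's implementation. It assumes that the pattern is matched against the variable with grepl() and that exclude.fixed corresponds to grepl's fixed argument; the ids and titles are hypothetical placeholders.

```r
# Hypothetical ids and a descriptive variable of the same length
# (placeholders, not real NCBI records).
ids <- c("id_1", "id_2", "id_3")
titles <- c("complete genome",
            "whole genome shotgun sequencing project",
            "complete genome")

# Assumed analogue of exclude.var / exclude.pattern / exclude.fixed:
# flag every element of the variable that contains the pattern literally.
exclude.pattern <- "shotgun sequencing project"
to.skip <- grepl(exclude.pattern, titles, fixed = TRUE)

# ids that would still be downloaded under this assumption
ids.to.download <- ids[!to.skip]
print(ids.to.download)
```

With fixed = TRUE the pattern is treated as a literal string, so characters such as "." or "(" need no escaping.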
Details
Master records (for example, in WGS projects) do not contain any nucleotide sequence.
They can be excluded from download with exclude.from.download
and the related exclude parameters.
However, their presence has no effect on the download, so such ids do not have to be excluded.
If the FASTA file already exists, the sequences are appended to it.
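The append behaviour matters when the same file name is reused between calls. The base-R sketch below illustrates what appending FASTA records to an existing file looks like. The helper append_fasta() is hypothetical and is not part of disprose; the ids and sequences are made-up placeholders.

```r
fasta.file <- tempfile(fileext = ".fasta")

# Hypothetical helper: write ids and sequences as FASTA records,
# adding them to the end of the file if it already exists.
append_fasta <- function(ids, seqs, file) {
  records <- paste0(">", ids, "\n", seqs, "\n")
  cat(records, file = file, sep = "", append = TRUE)
}

append_fasta("id_1", "ACGT", fasta.file)  # first call creates the file
append_fasta("id_2", "TTGA", fasta.file)  # second call appends to it

fasta.lines <- readLines(fasta.file)
print(fasta.lines)
file.remove(fasta.file)
```

Because records accumulate in this way, removing a leftover file first (for example with file.remove()) avoids mixing sequences from different runs.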
Value
If return = "vector", the function returns a vector of nucleotide sequences;
if return = "data.frame", a data frame with nucleotide ids and nucleotide sequences;
if return = "fasta", it writes a FASTA file and returns no data.
Functions
- get_seq_for_DB: Retrieves NCBI nucleotide sequences for given identification numbers.
- get_seq_for_DB_fix: Checks the downloads and tries to retrieve the compromised data.
Author(s)
Elena N. Filatova
Examples
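# NCBI identification numbers of the sequences to download
ids <- c(2134240466, 2134240465, 2134240464)
# write the sequences to a temporary FASTA file
fasta.file <- tempfile()
get_seq_for_DB(ids = ids, db = "nucleotide", check.result = TRUE,
               return = "fasta", fasta.file = fasta.file,
               exclude.from.download = FALSE)
# clean up
file.remove(fasta.file)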