getDataPBDB {paleotree} | R Documentation |
Obtaining Data for Taxa or Occurrences From Paleobiology Database API
Description
The Paleobiology Database API is very easy to use, and generally any data
one wishes to collect can be obtained in R in a variety of ways. The simplest
is to wrap a data retrieval request to the API, specified for CSV output,
with the R function read.csv. The functions listed here, however, are simple
helper functions for tasks common to users of this package: downloading
occurrence data or taxonomic information for particular clades or for a
list of specific taxa.
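As a minimal sketch of the "wrap a request with read.csv" approach described above, the following builds a request URL by hand. The helper name buildTaxaURL is hypothetical and not part of paleotree. The endpoint path and the base_name and show parameters are assumptions based on version 1.2 of the Paleobiology Database data service, so check them against the current API documentation before relying on them.

```r
# Hypothetical helper (not part of paleotree): assemble a request URL for
# all taxa within a higher taxon, asking the API for CSV output.
# Assumes the data1.2 'taxa/list' endpoint and its 'base_name' and 'show'
# parameters.
buildTaxaURL <- function(taxon, show = c("class", "parent")) {
  paste0(
    "https://paleobiodb.org/data1.2/taxa/list.csv",
    "?base_name=", utils::URLencode(taxon, reserved = TRUE),
    "&show=", paste(show, collapse = ",")
  )
}

graptURL <- buildTaxaURL("Graptolithina")

# Reading the table needs an internet connection, so the call is
# left commented out in this sketch:
# graptTaxa <- read.csv(graptURL, stringsAsFactors = FALSE)
```

Keeping URL construction separate from the download makes it possible to inspect or log the request before any network access happens.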
Usage
getCladeTaxaPBDB(
taxon,
showTaxa = c("class", "parent", "app", "img", "entname"),
status = "accepted",
urlOnly = FALSE,
stopIfMissing = FALSE,
failIfNoInternet = TRUE
)
getSpecificTaxaPBDB(
taxa,
showTaxa = c("class", "parent", "app", "img", "entname"),
status = "accepted",
urlOnly = FALSE,
stopIfMissing = FALSE,
failIfNoInternet = TRUE
)
getPBDBocc(
taxa,
showOccs = c("class", "classext", "subgenus", "ident", "entname"),
failIfNoInternet = TRUE
)
Arguments
taxon |
A single name of a higher taxon whose taxonomic 'children' (included members, i.e. subtaxa) you wish to retrieve from the Paleobiology Database. |
showTaxa |
Which variables for taxonomic data should be requested
from the Paleobiology Database? The default is to include classification ( |
status |
What taxonomic status should the pulled taxa have?
The default is |
urlOnly |
If |
stopIfMissing |
If some taxa within the requested set appear to be missing from the Paleobiology Database's taxonomy table, should the function halt with an error? |
failIfNoInternet |
If the Paleobiology Database or another
needed internet resource cannot be accessed, perhaps because of
no internet connection, should the function fail (with an error)
or should the function return |
taxa |
A character vector listing taxa of interest that the user
wishes to download information on from the Paleobiology Database.
Multiple taxa can be listed as a single character string, with desired taxa
separated by a comma with no whitespace (ex.
|
showOccs |
Which variables for occurrence data should be requested
from the Paleobiology Database? The default is to include classification ( |
Details
In many cases, it might be easier to write your own query. These
functions exist only to make getting data easier for some very specific
applications in paleotree.
Value
These functions return a data.frame containing
variables pulled for the requested taxon selection.
This behavior can be modified by argument urlOnly.
Author(s)
David W. Bapst
References
Peters, S. E., and M. McClennen. 2015. The Paleobiology Database application programming interface. Paleobiology 42(1):1-7.
See Also
See makePBDBtaxonTree and plotPhyloPicTree
for functions that use taxonomic data.
Occurrence data is sorted by taxon via taxonSortPBDBocc,
and further utilized by occData2timeList and plotOccData.
Examples
# Note that all examples here use argument
# failIfNoInternet = FALSE so that functions do
# not error out but simply return NULL if internet
# connection is not available, and thus
# fail gracefully rather than error out (required by CRAN).
# Remove this argument or set it to TRUE so functions fail
# when internet resources (paleobiodb) are not available.
# graptolites
graptData <- getCladeTaxaPBDB("Graptolithina",
    failIfNoInternet = FALSE)
dim(graptData)
sum(graptData$taxon_rank == "genus")
# so we can see that our call for Graptolithina returned
# a large number of taxa, a large portion of which are
# individual genera
# (554 and 318 respectively, as of 03-18-19)
tetrapodList <- c("Archaeopteryx", "Columba", "Ectopistes",
    "Corvus", "Velociraptor", "Baryonyx", "Bufo",
    "Rhamphorhynchus", "Quetzalcoatlus", "Natator",
    "Tyrannosaurus", "Triceratops", "Gavialis",
    "Brachiosaurus", "Pteranodon", "Crocodylus",
    "Alligator", "Giraffa", "Felis", "Ambystoma",
    "Homo", "Dimetrodon", "Coleonyx", "Equus",
    "Sphenodon", "Amblyrhynchus")
tetrapodData <- getSpecificTaxaPBDB(tetrapodList,
    failIfNoInternet = FALSE)
dim(tetrapodData)
sum(tetrapodData$taxon_rank == "genus")
# should be 26, with all 26 as genera
#############################################
# Now let's try getting occurrence data
# getting occurrence data for a genus, sorting it
# Dicellograptus
dicelloData <- getPBDBocc("Dicellograptus",
    failIfNoInternet = FALSE)
if(!is.null(dicelloData)){
    dicelloOcc2 <- taxonSortPBDBocc(dicelloData,
        rank = "species", onlyFormal = FALSE,
        failIfNoInternet = FALSE)
    names(dicelloOcc2)
}
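
The taxa argument also accepts multiple taxa as a single character string, separated by commas with no whitespace. A minimal sketch of building such a string from a character vector in base R follows. The object names are illustrative only, and the final call is left commented out because it needs paleotree and an internet connection.

```r
# Collapse a character vector of taxon names into one comma-separated
# string with no whitespace, the single-string form accepted by 'taxa'.
taxaVector <- c("Homo", "Felis", "Equus")
taxaString <- paste(taxaVector, collapse = ",")
taxaString

# The string could then be passed on in place of the vector, e.g.:
# felidData <- getSpecificTaxaPBDB(taxaString, failIfNoInternet = FALSE)
```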