| scrapenames {taxize} | R Documentation | 
Resolve names using Global Names Recognition and Discovery.
Description
Uses the Global Names Recognition and Discovery service, see http://gnrd.globalnames.org/
Note: this function sometimes returns data and sometimes does not. The API that this function uses is extremely buggy.
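Because the service can fail intermittently, it may help to wrap calls so that an error does not stop a script. The following is a minimal sketch only; `with_retries` is a hypothetical helper, not part of taxize, and the retry count and wait time are arbitrary assumptions.

```r
# Hypothetical helper (not part of taxize): call `f` up to `times` times and
# return the first result that does not raise an error, or NULL if all fail.
with_retries <- function(f, times = 3, wait = 0) {
  for (i in seq_len(times)) {
    res <- tryCatch(f(), error = function(e) NULL)
    if (!is.null(res)) return(res)
    # Pause between attempts, but not after the last one
    if (i < times) Sys.sleep(wait)
  }
  NULL
}

# Usage sketch (needs taxize and network access, so it is left commented out):
# out <- with_retries(function() scrapenames(text = "A spider named Pardosa moesta"))
```

A NULL return here only means that every attempt raised an error; it says nothing about why the service failed.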
Usage
scrapenames(
  url = NULL,
  file = NULL,
  text = NULL,
  engine = NULL,
  unique = NULL,
  verbatim = NULL,
  detect_language = NULL,
  all_data_sources = NULL,
  data_source_ids = NULL,
  return_content = FALSE,
  ...
)
Arguments
| url | An encoded URL for a web page, PDF, Microsoft Office document, or image file, see examples | 
| file | When using multipart/form-data as the content-type, a file may be sent. This should be a path to your file on your machine. | 
| text | Type: string. Text content; best used with a POST request, see examples | 
| engine | (optional) (integer) Default: 0. Either 1 for TaxonFinder, 2 for NetiNeti, or 0 for both. If absent, both engines are used. | 
| unique | (optional) (logical) If  | 
| verbatim | (optional) Type: boolean, If  | 
| detect_language | (optional) Type: boolean, When  | 
| all_data_sources | (optional) Type: boolean. Resolve found names against all available Data Sources. | 
| data_source_ids | (optional) Type: string. Pipe separated list of data source ids to resolve found names against. See list of Data Sources http://resolver.globalnames.org/data_sources | 
| return_content | (logical) return OCR'ed text. returns text string in  | 
| ... | Further args passed to crul::verb-GET | 
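The `data_source_ids` argument is described above as a pipe-separated string. As a small sketch, a vector of ids can be collapsed into that form with base R; whether `scrapenames` performs this step itself when given a vector is not stated here, so treat this as an illustration of the format only.

```r
# Collapse a vector of data source ids into the pipe-separated form
# described for `data_source_ids`.
ids <- c(1, 169)
ids_string <- paste(ids, collapse = "|")
ids_string
```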
Details
Exactly one of url, file, or text must be specified.
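The input rule can be sketched as a simple check. `check_one_input` below is a hypothetical helper written for illustration; it is not the validation code used inside `scrapenames`.

```r
# Hypothetical helper (not part of taxize): confirm that exactly one of
# url, file, or text is supplied, and report which one.
check_one_input <- function(url = NULL, file = NULL, text = NULL) {
  supplied <- c(url = !is.null(url), file = !is.null(file), text = !is.null(text))
  if (sum(supplied) != 1) {
    stop("Specify exactly one of url, file, or text; got ", sum(supplied),
         call. = FALSE)
  }
  # Name of the single argument that was supplied
  names(supplied)[supplied]
}

check_one_input(text = "A spider named Pardosa moesta Banks, 1892")
```

Calling it with no input, or with more than one, raises an error.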
Value
A list of length two: the first element is metadata, the second is the data as a data.frame.
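As a sketch of working with a return value of this shape, the two elements can be pulled out by position. The object below is a mock built by hand, not real service output; its field and column names are invented for illustration.

```r
# Mock object with the documented shape: a list of length two, metadata
# first and a data.frame second. Field and column names are invented.
res <- list(
  list(status = "mock"),
  data.frame(name = "Pardosa moesta", stringsAsFactors = FALSE)
)

meta  <- res[[1]]  # first element: metadata
found <- res[[2]]  # second element: the data as a data.frame

# Basic shape checks before further processing
stopifnot(length(res) == 2, is.data.frame(found))
```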
Author(s)
Scott Chamberlain
Examples
## Not run: 
# Get data from a website using its URL
scrapenames('https://en.wikipedia.org/wiki/Spider')
scrapenames('https://en.wikipedia.org/wiki/Animal')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0095068')
scrapenames('https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0080498')
scrapenames('http://ucjeps.berkeley.edu/cgi-bin/get_JM_treatment.pl?CARYOPHYLLACEAE')
# Scrape names from a pdf at a URL
url <- 'https://journals.plos.org/plosone/article/file?id=
10.1371/journal.pone.0058268&type=printable'
scrapenames(url = sub('\n', '', url))
# With arguments
scrapenames(url = 'https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  unique=TRUE)
scrapenames(url = 'https://en.wikipedia.org/wiki/Spider',
  data_source_ids=c(1, 169))
# Get data from a file
speciesfile <- system.file("examples", "species.txt", package = "taxize")
scrapenames(file = speciesfile)
nms <- paste0(names_list("species"), collapse="\n")
file <- tempfile(fileext = ".txt")
writeLines(nms, file)
scrapenames(file = file)
# Get data from text string
scrapenames(text='A spider named Pardosa moesta Banks, 1892')
# return OCR content
scrapenames(url='https://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf',
  return_content = TRUE)
## End(Not run)