occ_issues {rgbif} | R Documentation
Parse and examine further GBIF occurrence issues on a dataset.
Description
Parse and examine further GBIF occurrence issues on a dataset.
Usage
occ_issues(.data, ..., mutate = NULL)
Arguments
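.data | Output from a call to occ_search(), occ_data(), or occ_download_import().
... | Issue codes naming the records to keep (e.g. cdround) or, when prefixed with a minus sign, to remove (e.g. -cdround).
mutate | (character) One of "split" (split issues into separate columns), "expand" (expand issue codes to more descriptive names), or "split_expand" (do both). The default, NULL, applies no mutation. For split and split_expand, values in cells become y ("yes") or n ("no").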
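To make the y/n encoding concrete, here is a minimal base R sketch of what a "split" might look like on a toy table. It does not call rgbif and is not the package's implementation; the toy data frame, its name and issues columns, and the comma-separated issue string are assumptions made for illustration.

```r
# Toy data standing in for the `data` element of a result.
# Column names and values are made up for illustration.
toy <- data.frame(
  name = c("Aus bus", "Cus dus", "Eus fus"),
  issues = c("cdround,gass84", "gass84", ""),
  stringsAsFactors = FALSE
)

# Split each row's comma-separated issue codes into a character vector
codes <- strsplit(toy$issues, ",", fixed = TRUE)

# All distinct, non-empty codes seen in the data
all_codes <- sort(unique(unlist(codes)))
all_codes <- all_codes[nzchar(all_codes)]

# One column per code: "y" if the row carries that issue, "n" otherwise
flags <- vapply(
  all_codes,
  function(code) {
    present <- vapply(codes, function(x) code %in% x, logical(1))
    ifelse(present, "y", "n")
  },
  character(nrow(toy))
)

split_toy <- cbind(toy["name"], as.data.frame(flags, stringsAsFactors = FALSE))
print(split_toy)
```

Each issue code becomes its own column, so rows can be compared issue by issue instead of by parsing a single string.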
Details
See also the vignette Cleaning data using GBIF issues
Note that you can also query based on issues, e.g., occ_search(taxonKey=1, issue='DEPTH_UNLIKELY'). However, I imagine it's more likely that you want to search for occurrences based on a taxonomic name or geographic area, not based on issues, so it makes sense to pull data down, then clean as needed using this function.
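As a rough picture of cleaning after download, the sketch below filters a toy table by a single issue code using base R. It is an illustration only, not how occ_issues() is implemented; the toy data frame and the helper has_issue() are hypothetical, and a comma-separated issues column is assumed.

```r
# Toy stand-in for downloaded occurrence data (values are made up)
toy <- data.frame(
  key = c(1, 2, 3),
  issues = c("cdround,gass84", "gass84", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for rows whose issues string contains `code`
has_issue <- function(issues, code) {
  vapply(
    strsplit(issues, ",", fixed = TRUE),
    function(x) code %in% x,
    logical(1)
  )
}

# Analogous to occ_issues(cdround): keep only rows flagged with cdround
kept <- toy[has_issue(toy$issues, "cdround"), ]

# Analogous to occ_issues(-cdround): drop rows flagged with cdround
dropped <- toy[!has_issue(toy$issues, "cdround"), ]

print(kept)
print(dropped)
```

Matching whole codes after splitting, rather than searching the raw string, avoids accidental hits when one code is a substring of another.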
This function only affects the data element in the gbif class that is returned from a call to occ_search(). Maybe in a future version we will remove the associated records from the hierarchy and media elements as they are removed from the data element.
You'll notice that we sort columns to make it easier to glimpse the important parts of your data, namely taxonomic name, taxon key, latitude and longitude, and the issues. The columns are unchanged otherwise.
References
https://gbif.github.io/gbif-api/apidocs/org/gbif/api/vocabulary/OccurrenceIssue.html
Examples
## Not run:
library(rgbif)
library(magrittr) # provides the %>% pipe used below
# what do the issues mean? you can print the whole table
head(gbif_issues())
# or just occurrence related issues
gbif_issues()[which(gbif_issues()$type %in% c("occurrence")),]
# or search for matches
iss <- c('cdround','cudc','gass84','txmathi')
gbif_issues()[ gbif_issues()$code %in% iss, ]
# compare the raw output with the result of occ_issues()
(out <- occ_search(limit=100))
out %>% occ_issues(cdround)
# occ_data
(out <- occ_data(limit=100))
out %>% occ_issues(cdround)
# Parsing output by issue
(res <- occ_data(
geometry='POLYGON((30.1 10.1,40 40,20 40,10 20,30.1 10.1))',
limit = 600))
## or parse issues in various ways
### include only rows with cdround issue
gg <- res %>% occ_issues(cdround)
NROW(res$data)
NROW(gg$data)
head(res$data)[,c(1:5)]
head(gg$data)[,c(1:5)]
### remove data rows with certain issue classes
res %>% occ_issues(-cdround, -cudc)
### split issues into separate columns
res %>% occ_issues(mutate = "split")
res %>% occ_issues(-cudc, -mdatunl, mutate = "split")
res %>% occ_issues(gass84, mutate = "split")
### expand issues to more descriptive names
res %>% occ_issues(mutate = "expand")
### split and expand
res %>% occ_issues(mutate = "split_expand")
### split, expand, and remove an issue class
res %>% occ_issues(-cdround, mutate = "split_expand")
## Or you can use occ_issues without %>%
occ_issues(res, -cdround, mutate = "split_expand")
# from GBIF downloaded data via occ_download_* functions
res <- occ_download_get(key="0000066-140928181241064", overwrite=TRUE)
x <- occ_download_import(res)
occ_issues(x, -txmathi)
occ_issues(x, txmathi)
occ_issues(x, gass84)
occ_issues(x, zerocd)
occ_issues(x, gass84, txmathi)
occ_issues(x, mutate = "split")
occ_issues(x, -gass84, mutate = "split")
occ_issues(x, mutate = "expand")
occ_issues(x, mutate = "split_expand")
# occ_search/occ_data with many inputs - gives a slightly different output
# format than normal 2482598, 2498387
xyz <- occ_data(taxonKey = c(9362842, 2492483, 2435099), limit = 300)
xyz
length(xyz) # length 3
names(xyz) # matches taxonKey values passed in
occ_issues(xyz, -gass84)
occ_issues(xyz, -cdround)
occ_issues(xyz, -cdround, -gass84)
## End(Not run)