galah_select {galah} | R Documentation |
Specify fields for occurrence download
Description
GBIF nodes store content in hundreds of different fields, and users often
require thousands or millions of records at a time. To reduce the time taken
to download data, and to limit the complexity of the resulting tibble, it is
sensible to restrict the fields returned by atlas_occurrences(). This function
allows easy selection of fields, or commonly requested groups of columns,
following syntax shared with dplyr::select().
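As a rough illustration of the dplyr::select() syntax referred to above, the
sketch below selects columns from a small local tibble. The table and its
values are invented stand-ins for a downloaded result; only dplyr is used, so
nothing is requested from an atlas, and the helper shown is standard dplyr
behaviour rather than a statement about what galah_select() accepts.

```r
library(dplyr)

# Hypothetical stand-in for an occurrence download; all values are invented
occurrences <- tibble(
  scientificName   = c("Perameles nasuta", "Perameles gunnii"),
  eventDate        = as.Date(c("2020-01-15", "2021-06-03")),
  decimalLatitude  = c(-33.9, -41.5),
  decimalLongitude = c(151.2, 146.8),
  basisOfRecord    = c("HUMAN_OBSERVATION", "PRESERVED_SPECIMEN")
)

# Bare column names, in the same style as galah_select(scientificName, eventDate)
by_name <- occurrences |>
  select(scientificName, eventDate)

# A tidyselect helper, as supported by dplyr::select()
by_helper <- occurrences |>
  select(starts_with("decimal"))
```

The practical difference is timing: dplyr::select() trims a table that already
exists, whereas galah_select() is applied before the download, so unwanted
fields are never requested.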
The full list of available fields can be viewed with show_all(fields). Note
that select() and galah_select() are supported for all atlases that allow
downloads, with the exception of GBIF, for which all columns are returned.
Usage
galah_select(..., group)
## S3 method for class 'data_request'
select(.data, ..., group)
Arguments
... | zero or more individual column names to include
group | name of a column group to include; the accepted values are listed under Details
.data | An object of class data_request
Details
Calling the argument group = "basic" returns the following columns:
- decimalLatitude
- decimalLongitude
- eventDate
- scientificName
- taxonConceptID
- recordID
- dataResourceName
- occurrenceStatus
Using group = "event" returns the following columns:
- eventRemarks
- eventTime
- eventID
- eventDate
- samplingEffort
- samplingProtocol
Using group = "media" returns the following columns:
- multimedia
- multimediaLicence
- images
- videos
- sounds
Using group = "taxonomy" returns higher taxonomic information for a given
query. It is the only group that is accepted by atlas_species() as well as
atlas_occurrences().
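A minimal sketch of the "taxonomy" group used with atlas_species() might look
like the following. The wrapper function species_taxonomy() and its arguments
are hypothetical names introduced here for illustration; the body is wrapped in
a function so that nothing is requested from an atlas until it is called, and
it sets an email in the same way as the Examples section.

```r
# Hypothetical helper: builds and runs a species query with taxonomy columns.
# Nothing is downloaded until the function is called.
species_taxonomy <- function(taxon, email) {
  # Register the email used for downloads, as in the Examples section
  galah::galah_config(email = email)

  galah::galah_call() |>
    galah::galah_identify(taxon) |>
    galah::galah_select(group = "taxonomy") |>
    galah::atlas_species()
}

# Usage (requires the galah package and network access):
# species_taxonomy("perameles", "your-email@email.com")
```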
Using group = "assertions" returns all quality assertion-related columns. The
list of assertions is shown by show_all_assertions().
For atlas_occurrences(), arguments passed to ... should be valid field names,
which you can check using show_all(fields). For atlas_species(), they should be
one or more of:
- counts to include counts of occurrences per species.
- synonyms to include any synonymous names.
- lists to include authoritative lists that each species is included on.
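Following the description above, a species-level request that adds occurrence
counts and authoritative lists could be sketched as below. This is an
assumption-laden illustration rather than a tested recipe: species_with_extras()
is a hypothetical wrapper, the bare names counts and lists are taken directly
from the list above, and the call is deferred inside a function so that no
request is made until the reader runs it.

```r
# Hypothetical helper: species list with extra columns named in the text above.
# Assumes counts and lists are passed as bare names, like field names.
species_with_extras <- function(taxon, email) {
  # Register the email used for downloads, as in the Examples section
  galah::galah_config(email = email)

  galah::galah_call() |>
    galah::galah_identify(taxon) |>
    galah::galah_select(counts, lists) |>
    galah::atlas_species()
}

# Usage (requires the galah package and network access):
# species_with_extras("perameles", "your-email@email.com")
```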
Value
A tibble specifying the name and type of each column to include in the call to
atlas_counts() or atlas_occurrences().
See Also
search_taxa(), galah_filter() and galah_geolocate() for other ways to restrict
the information returned by atlas_occurrences() and related functions;
atlas_counts() for how to get counts by levels of variables returned by
galah_select(); show_all(fields) to list available fields.
Examples
## Not run:
# Download occurrence records of *Perameles*,
# only return scientificName and eventDate columns
galah_config(email = "your-email@email.com")
galah_call() |>
  galah_identify("perameles") |>
  galah_select(scientificName, eventDate) |>
  atlas_occurrences()

# Only return the "basic" group of columns and the basisOfRecord column
galah_call() |>
  galah_identify("perameles") |>
  galah_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()

# When used in a pipe, `galah_select()` and `select()` are synonymous.
# Hence the previous example can be rewritten as:
request_data() |>
  identify("perameles") |>
  select(basisOfRecord, group = "basic") |>
  collect()
## End(Not run)