galah_filter {galah}    R Documentation
Narrow a query by specifying filters
Description
"Filters" are arguments of the form field logical value that are used
to narrow down the number of records returned by a specific query.
For example, it is common for users to request records from a particular year
(year == 2020), or to return all records except for fossils
(basisOfRecord != "FossilSpecimen").
The result of galah_filter() can be passed to the filter
argument in atlas_occurrences(), atlas_species(),
atlas_counts() or atlas_media().
Usage
galah_filter(..., profile = NULL)
## S3 method for class 'data_request'
filter(.data, ...)
## S3 method for class 'metadata_request'
filter(.data, ...)
## S3 method for class 'files_request'
filter(.data, ...)
Arguments
...       filters, in the form field logical value
profile
.data     An object of class
Details
galah_filter uses non-standard evaluation (NSE),
and is designed to be as compatible as possible with dplyr::filter()
syntax.
All statements passed to galah_filter() (except the profile
argument) take the form of field - logical - value. Permissible examples include:
- = or == (e.g. year = 2020)
- != (e.g. year != 2020)
- > or >= (e.g. year >= 2020)
- < or <= (e.g. year <= 2020)
- OR statements (e.g. year == 2018 | year == 2020)
- AND statements (e.g. year >= 2000 & year <= 2020)
In some cases R will fail to parse inputs with a single equals sign
(=), particularly where statements are separated by & or
|. This problem can be avoided by using a double-equals (==) instead.
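The difference between = and == can be seen in plain R, without galah. The sketch below uses a hypothetical helper, capture_filters(), which is not part of galah and is only assumed to resemble how an NSE function sees its arguments. It suggests why a single equals sign is fragile: R reads `year =` as an argument name, and combining it with | is not valid syntax at all.

```r
# Hypothetical helper (not part of galah): capture the arguments of a call
# without evaluating them, roughly as an NSE function would see them.
capture_filters <- function(...) {
  as.list(substitute(list(...)))[-1]
}

single <- capture_filters(year = 2020)
double <- capture_filters(year == 2020)

names(single)         # `year` became an argument name
class(single[[1]])    # only the value 2020 was captured
deparse(double[[1]])  # the whole comparison was captured

# Mixing `=` with `|` is rejected by the R parser before any function runs.
parsed <- try(
  parse(text = "capture_filters(year = 2018 | year = 2020)"),
  silent = TRUE
)
inherits(parsed, "try-error")
```

With ==, the full comparison reaches the function as an expression, so no special handling of argument names is needed.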
Notes on behaviour
Separating statements with a comma is equivalent to an AND statement;
so galah_filter(year >= 2010 & year < 2020) is the same as
galah_filter(year >= 2010, year < 2020).
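As an illustration of the comma rule, the base-R sketch below joins comma-separated statements with & and applies the result to a small data frame. combine_filters() and the records data are hypothetical, made up for this example; this is not galah's implementation.

```r
# Hypothetical helper (not galah's implementation): capture comma-separated
# statements and join them into a single expression with `&`.
combine_filters <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  Reduce(function(a, b) call("&", a, b), exprs)
}

with_comma <- combine_filters(year >= 2010, year < 2020)
with_and   <- quote(year >= 2010 & year < 2020)

deparse(with_comma)

# Made-up records: both forms keep the same rows.
records <- data.frame(year = c(2005, 2012, 2019, 2021))
kept_comma <- records$year[eval(with_comma, records)]
kept_and   <- records$year[eval(with_and, records)]
identical(kept_comma, kept_and)
```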
All statements must include the field name; so
galah_filter(year == 2010 | year == 2021) works, as does
galah_filter(year == c(2010, 2021)), but galah_filter(year == 2010 | 2021)
fails.
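Plain R gives a hint as to why the field name has to be repeated. In the base-R sketch below (which does not call galah, and uses made-up values), the right-hand side of | is evaluated as an expression in its own right, so a bare 2021 is not read as "year == 2021".

```r
# Made-up vector of years, evaluated in plain R rather than by galah.
year <- c(2010, 2015, 2021)

# Each side of `|` names the field, so only matching years are TRUE.
explicit_or <- year == 2010 | year == 2021

# Here the right-hand side is just the number 2021, which R coerces to TRUE,
# so every element is TRUE regardless of its year.
bare_or <- year == 2010 | 2021

explicit_or
bare_or
```

Because the bare form cannot express "2010 or 2021" in ordinary R semantics, writing the field name on both sides keeps the intent unambiguous.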
It is possible to use an object to specify required values, e.g.
year_value <- 2010; galah_filter(year > year_value)
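One way an NSE function can support this is to treat the left-hand side as a field name and evaluate only the right-hand side in the caller's environment. The sketch below shows that idea in base R. resolve_filter() is a hypothetical helper and is not claimed to be how galah does it.

```r
# Hypothetical helper (not galah's implementation): capture one statement,
# keep the field name as written, and replace the right-hand side with its
# value as found in the calling environment.
resolve_filter <- function(expr) {
  captured <- substitute(expr)
  captured[[3]] <- eval(captured[[3]], parent.frame())
  captured
}

year_value <- 2010
resolved <- resolve_filter(year > year_value)

deparse(resolved)  # the object name has been replaced by its value
```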
Solr supports range queries on text as well as numbers; so this is valid:
galah_filter(cl22 >= "Tasmania")
It is possible to filter by 'assertions', which are statements about data
validity, e.g. to remove those lacking critical spatial or taxonomic data:
galah_filter(assertions != c("INVALID_SCIENTIFIC_NAME", "COORDINATE_INVALID"))
Valid assertions can be found using show_all(assertions).
Value
A tibble containing filter values.
See Also
search_taxa() and galah_geolocate() for other ways to restrict
the information returned by atlas_occurrences() and related functions. Use
search_all(fields) to find fields that
you can filter by, and show_values() to find what values
of those filters are available.
Examples
## Not run:
# Filter query results to return records of interest
galah_call() |>
  galah_filter(year >= 2019,
               basisOfRecord == "HumanObservation") |>
  atlas_counts()
# Alternatively, the same call using `dplyr` functions:
request_data() |>
  filter(year >= 2019,
         basisOfRecord == "HumanObservation") |>
  count() |>
  collect()
## End(Not run)