galah_filter {galah} | R Documentation |
Narrow a query by specifying filters
Description
"Filters" are arguments of the form field logical value that are used
to narrow down the number of records returned by a specific query.
For example, it is common for users to request records from a particular year
(year == 2020), or to return all records except for fossils
(basisOfRecord != "FossilSpecimen").
The result of galah_filter() can be passed to the filter argument in
atlas_occurrences(), atlas_species(), atlas_counts() or atlas_media().
Usage
galah_filter(..., profile = NULL)
## S3 method for class 'data_request'
filter(.data, ...)
## S3 method for class 'metadata_request'
filter(.data, ...)
## S3 method for class 'files_request'
filter(.data, ...)
Arguments
... |
filters, in the form field logical value |
profile |
|
.data |
An object of class |
Details
galah_filter uses non-standard evaluation (NSE), and is designed to be as
compatible as possible with dplyr::filter() syntax.
All statements passed to galah_filter() (except the profile argument) take
the form of field - logical - value. Permissible examples include:
- = or == (e.g. year = 2020)
- != (e.g. year != 2020)
- > or >= (e.g. year >= 2020)
- < or <= (e.g. year <= 2020)
- OR statements (e.g. year == 2018 | year == 2020)
- AND statements (e.g. year >= 2000 & year <= 2020)
In some cases R will fail to parse inputs with a single equals sign (=),
particularly where statements are separated by & or |. This problem can be
avoided by using a double-equals (==) instead.
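The parsing problem can be reproduced without contacting an atlas. The
following is a minimal sketch in base R that uses parse() to check whether the
text of a call is syntactically valid. parses_ok() is a hypothetical helper
introduced here for illustration; it is not part of galah, and nothing in it
evaluates galah_filter() itself.

```r
# Hypothetical helper (not part of galah): TRUE if `code` is valid R syntax.
# parse() only builds the expression; it never evaluates the call.
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# Double-equals: each comparison is an ordinary expression.
double_eq <- parses_ok("galah_filter(year == 2018 | year == 2020)")

# Single equals: R reads `year = ...` as a named argument, so the second
# `=` in the same argument is not valid syntax.
single_eq <- parses_ok("galah_filter(year = 2018 | year = 2020)")

print(c(double_eq = double_eq, single_eq = single_eq))
```

Because the failure happens when R parses the call, before galah_filter() runs,
the function has no opportunity to correct it; writing == throughout avoids the
issue.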
Notes on behaviour
- Separating statements with a comma is equivalent to an AND statement;
  ergo galah_filter(year >= 2010 & year < 2020) is the same as
  galah_filter(year >= 2010, year < 2020).
- All statements must include the field name; so
  galah_filter(year == 2010 | year == 2021) works, as does
  galah_filter(year == c(2010, 2021)), but
  galah_filter(year == 2010 | 2021) fails.
- It is possible to use an object to specify required values, e.g.
  year_value <- 2010; galah_filter(year > year_value)
- Solr supports range queries on text as well as numbers; so this is valid:
  galah_filter(cl22 >= "Tasmania")
- It is possible to filter by 'assertions', which are statements about data
  validity, e.g. to remove those lacking critical spatial or taxonomic data:
  galah_filter(assertions != c("INVALID_SCIENTIFIC_NAME", "COORDINATE_INVALID"))
  Valid assertions can be found using show_all(assertions).
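Two of these notes, comma-separated statements and values supplied by an
object, can be combined in one call. The following is a sketch rather than a
definitive recipe. It assumes the galah package is installed and an atlas is
reachable, so the query is guarded and only runs in an interactive session. The
particular combination of filters is an illustrative assumption; the functions
are those used elsewhere on this page.

```r
# Object holding the required value, as described in the notes.
year_value <- 2010

# Guarded so that no request is sent unless galah is installed and the
# session is interactive (assumption: network access to an atlas).
if (interactive() && requireNamespace("galah", quietly = TRUE)) {
  library(galah)
  # The two comma-separated statements are combined as an AND:
  # records after `year_value` that are not fossil specimens.
  galah_call() |>
    galah_filter(year > year_value,
                 basisOfRecord != "FossilSpecimen") |>
    atlas_counts() |>
    print()
}
```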
Value
A tibble containing filter values.
See Also
search_taxa() and galah_geolocate() for other ways to restrict
the information returned by atlas_occurrences() and related functions. Use
search_all(fields) to find fields that you can filter by, and show_values()
to find what values of those filters are available.
Examples
## Not run:
# Filter query results to return records of interest
galah_call() |>
  galah_filter(year >= 2019,
               basisOfRecord == "HumanObservation") |>
  atlas_counts()

# Alternatively, the same call using `dplyr` functions:
request_data() |>
  filter(year >= 2019,
         basisOfRecord == "HumanObservation") |>
  count() |>
  collect()
## End(Not run)