filter_repeat_visits {auk} | R Documentation |
Filter observations to repeat visits for hierarchical modeling
Description
Hierarchical modeling of abundance and occurrence requires repeat visits to sites to estimate detectability. These visits should all be within a period of closure, i.e. a period during which the population can be assumed to be closed. eBird data, and many other data sources, do not explicitly follow this protocol; however, subsets of the data can be extracted that are suitable for hierarchical modeling. This function extracts a subset of observation data that has a desired number of repeat visits within a period of closure.
Usage
filter_repeat_visits(
  x,
  min_obs = 2L,
  max_obs = 10L,
  annual_closure = TRUE,
  n_days = NULL,
  date_var = "observation_date",
  site_vars = c("locality_id", "observer_id"),
  ll_digits = 6L
)
Arguments
x |
|
min_obs |
integer; minimum number of observations required for each site. |
max_obs |
integer; maximum number of observations allowed for each site. |
annual_closure |
logical; whether the entire year should be treated as
the period of closure (the default). This can be useful, for example, if
the data have been subset to a period of closure prior to calling
filter_repeat_visits(). |
n_days |
integer; number of days defining the temporal length of
closure. If |
date_var |
character; column name of the variable in |
site_vars |
character; names of one or more columns in |
ll_digits |
integer; the number of digits to round latitude and longitude
to. If latitude and/or longitude are used as |
Details
In addition to specifying the minimum and maximum number of
observations per site, users must specify the variables in the dataset that
define a "site". This is typically a combination of IDs defining the
geographic site and the unique observer (repeat visits are meant to be
conducted by the same observer). Finally, the closure period must be
defined, which is a period within which the population of the focal species
can reasonably be assumed to be closed. This can be done using a
combination of the n_days and annual_closure arguments.
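The filtering described in the Details can be illustrated with a small base R sketch. This is not the package's implementation; it only mimics the annual-closure case on a made-up data frame (`obs` and its values are hypothetical, and the column names simply mirror the defaults shown in Usage).

```r
# Toy checklist data; all values are invented for illustration.
obs <- data.frame(
  locality_id = c("L1", "L1", "L1", "L2", "L2", "L3"),
  observer_id = c("A", "A", "A", "B", "B", "A"),
  observation_date = as.Date(c("2012-05-01", "2012-05-10", "2013-05-02",
                               "2012-06-01", "2012-06-03", "2012-07-01")),
  stringsAsFactors = FALSE
)

site_vars <- c("locality_id", "observer_id")
min_obs <- 2L
max_obs <- 10L

# Annual closure: treat each calendar year as one closure period.
obs$closure_id <- format(obs$observation_date, "%Y")

# A "site" combines the site variables with the closure period.
obs$site <- do.call(paste, c(obs[c(site_vars, "closure_id")], sep = "_"))

# Count visits per site, then keep sites within the allowed range.
obs$n_observations <- as.integer(
  ave(seq_len(nrow(obs)), obs$site, FUN = length)
)
keep <- obs$n_observations >= min_obs & obs$n_observations <= max_obs
repeat_visits <- obs[keep, ]

# Sort so that sites are together and in chronological order.
repeat_visits <- repeat_visits[order(repeat_visits$site,
                                     repeat_visits$observation_date), ]
repeat_visits
```

In this toy data, the single 2013 visit to L1 and the single visit to L3 are dropped because each falls short of `min_obs` within its own closure period.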
Value
A data.frame filtered to only retain observations from sites with
the allowed number of observations within the period of closure. The
results will be sorted such that sites are together and in chronological
order. The following variables are added to the data frame:
- site: a unique identifier for each "site" corresponding to all the variables in site_vars and closure_id concatenated together with underscore separators.
- closure_id: a unique ID for each closure period. If annual_closure = TRUE this ID will include the year. If n_days is used, an index giving the number of blocks of n_days days since the earliest observation will be included. Note that in this case, there may be gaps in the IDs.
- n_observations: number of observations at each site after all filtering.
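To see why gaps can appear in closure_id when n_days is used, consider a minimal base R sketch of a block index. It assumes the index is the number of whole n_days blocks elapsed since the earliest observation, counted from zero; whether the package numbers blocks from zero or one is not stated here, so treat that detail as an assumption. The dates are made up.

```r
# Hypothetical observation dates.
dates <- as.Date(c("2012-01-01", "2012-01-15", "2012-02-05", "2012-05-20"))
n_days <- 30L

# Days elapsed since the earliest observation.
days_since_first <- as.integer(dates - min(dates))

# Index of the n_days-long block each observation falls in (zero-based here).
block <- days_since_first %/% n_days
block
```

Blocks with no observations never receive an index, so the set of IDs that actually occurs can skip values.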
See Also
Other modeling:
format_unmarked_occu()
Examples
# read and zero-fill the ebd data
f_ebd <- system.file("extdata/zerofill-ex_ebd.txt", package = "auk")
f_smpl <- system.file("extdata/zerofill-ex_sampling.txt", package = "auk")
# data must be for a single species
ebd_zf <- auk_zerofill(x = f_ebd, sampling_events = f_smpl,
                       species = "Collared Kingfisher",
                       collapse = TRUE)
filter_repeat_visits(ebd_zf, n_days = 30)