prepare_data {bbsBayes} R Documentation

## Wrangle data to use for modelling input

### Description

prepare_data subsets raw BBS data by selected species and wrangles stratified data for use as input for models.

### Usage

prepare_data(
  strat_data = NULL,
  species_to_run = NULL,
  model = NULL,
  heavy_tailed = FALSE,
  n_knots = NULL,
  min_year = NULL,
  max_year = NULL,
  min_n_routes = 3,
  min_max_route_years = 3,
  min_mean_route_years = 1,
  strata_rem = NULL,
  quiet = FALSE,
  sampler = "jags",
  basis = "original",
  ...
)


### Arguments

| Argument | Description |
| --- | --- |
| strat_data | Large list of stratified data returned by stratify() |
| species_to_run | Character string of the English name of the species to run |
| model | Character string of model to be used. Options are "slope", "firstdiff", "gam", "gamye". |
| heavy_tailed | Logical indicating whether the extra-Poisson error distribution should be modeled as a t-distribution, with heavier tails than the standard normal distribution. Default is currently FALSE, but recent results suggest users should strongly consider setting this to TRUE, even though it requires much longer convergence times. |
| n_knots | Number of knots to be used in GAM function |
| min_year | Minimum year to keep in analysis |
| max_year | Maximum year to keep in analysis |
| min_n_routes | Minimum routes per stratum where species has been observed. Defaults to 3. |
| min_max_route_years | Minimum number of years with non-zero observations of species on at least 1 route. Defaults to 3. |
| min_mean_route_years | Minimum average of years per route with the species observed. Defaults to 1. |
| strata_rem | Strata to remove from analysis. Defaults to NULL. |
| quiet | Should progress bars be suppressed? |
| sampler | Which MCMC sampling software to use. Currently bbsBayes only supports "jags". |
| basis | Which version of the basis function to use for the GAM smooth. The default, "original", is the same basis used in Smith and Edwards 2020; "mgcv" is an alternative that uses the "tp" basis from the package mgcv (also used in brms and rstanarm). If using the "mgcv" option, the user may want to consider adjusting the prior distributions for the parameters and their precision. |
| ... | Additional arguments |
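To make the three stratum-inclusion thresholds concrete, the following base R sketch applies the documented definitions of min_n_routes, min_max_route_years and min_mean_route_years to a small made-up data set. The data frame `toy`, its column names, and the summary table `summ` are hypothetical and exist only for illustration. This is not the package's internal code, and the exact way bbsBayes computes these summaries may differ.

```r
# Toy survey data: one row per route-year (all values are made up)
toy <- data.frame(
  stratum = c(rep("A", 9), rep("B", 4)),
  route   = c(rep("A1", 3), rep("A2", 3), rep("A3", 3),
              rep("B1", 2), rep("B2", 2)),
  year    = c(2016:2018, 2016:2018, 2016:2018, 2017:2018, 2017:2018),
  count   = c(2, 0, 5,  1, 1, 3,  0, 4, 0,  0, 1,  0, 0)
)

# Number of years with a non-zero count on each route
# (routes where the species was never observed drop out here)
nonzero <- toy[toy$count > 0, ]
route_years <- aggregate(year ~ stratum + route, data = nonzero,
                         FUN = function(y) length(unique(y)))
names(route_years)[3] <- "n_years"

# Per-stratum summaries matching the three documented thresholds
summ <- do.call(rbind, lapply(
  split(route_years, route_years$stratum),
  function(d) {
    data.frame(
      stratum          = as.character(d$stratum[1]),
      n_routes         = nrow(d),          # routes with the species observed
      max_route_years  = max(d$n_years),   # best-covered route
      mean_route_years = mean(d$n_years),  # average years per route
      stringsAsFactors = FALSE
    )
  }
))

# Apply the default thresholds (3, 3, 1) from the Usage section
keep <- summ$n_routes >= 3 &
  summ$max_route_years >= 3 &
  summ$mean_route_years >= 1
kept_strata <- summ$stratum[keep]
print(summ)
print(kept_strata)
```

In this toy data only stratum A passes all three thresholds; stratum B has the species on a single route in a single year, so it fails the first two checks.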

### Value

List of data to be used for modelling, including:

| Element | Description |
| --- | --- |
| model | The model to be used |
| heavy_tailed | Logical indicating whether the extra-Poisson error distribution should be modeled as a t-distribution |
| min_nu | If heavy_tailed is TRUE, minimum value for the truncated gamma on the degrees of freedom (DF) of the t-distribution noise. Default is 0; the user must change it manually after the function is run. |
| ncounts | The number of counts containing useful data for the species |
| nstrata | The number of strata used in the analysis |
| ymin | Minimum year used |
| ymax | Maximum year used |
| nonzeroweight | Proportion of routes in each stratum with species observation |
| count | Vector of counts for the species |
| strat | Vector of strata to be used in the analysis |
| obser | Vector of unique observer-route pairings |
| year | Vector of years for each count |
| firstyr | Vector of indicator variables for whether an observer was in their first year |
| month | Vector of numeric month of observation |
| day | Vector of numeric day of observation |
| nobservers | Total number of observer-route pairings |
| fixedyear | Median of all years (ymin:ymax), included only with slope and firstdiff models |
| nknots | Number of knots to use for smoothing functions, included only with GAM |
| X.basis | Basis function for n smoothing functions, included only with GAM |
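Two of the returned elements are easier to picture with a small example. The base R sketch below shows one way quantities matching the descriptions of nonzeroweight, obser and nobservers could be derived from toy data. The data frames `routes` and `surveys` and their columns are hypothetical, and the integer encoding of the observer-route pairings is an assumption; the package's own construction may differ.

```r
# Toy route table: one row per route, flagging whether the species
# was ever recorded on it (values are made up for illustration)
routes <- data.frame(
  stratum  = c("A", "A", "A", "A", "B", "B"),
  detected = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Proportion of routes in each stratum with at least one observation
nonzeroweight <- tapply(routes$detected, routes$stratum, mean)
print(nonzeroweight)

# Toy survey table: one row per count, with the route and observer ID
surveys <- data.frame(
  route    = c("A1", "A1", "A1", "A2", "A2"),
  observer = c(101, 101, 102, 101, 101)
)

# Assumed encoding: each distinct observer-route pairing gets an
# integer index, in order of first appearance
pairing    <- paste(surveys$route, surveys$observer, sep = "-")
obser      <- as.integer(factor(pairing, levels = unique(pairing)))
nobservers <- length(unique(obser))
print(obser)
print(nobservers)
```

Note that observer 101 counts as two separate pairings here because they surveyed two different routes.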

### Examples

# Toy example with Pacific Wren sample data
# First, stratify the sample data

strat_data <- stratify(by = "bbs_cws", sample_data = TRUE)

# Prepare the stratified data for use in a model. In this
#   toy example, we will set the minimum year as 2009 and
#   maximum year as 2018, effectively only setting up to
#   model 10 years of data. We will use the "first difference"
#   model.
model_data <- prepare_data(strat_data = strat_data,
                           species_to_run = "Pacific Wren",
                           model = "firstdiff",
                           min_year = 2009,
                           max_year = 2018)

# You can also specify the GAM model, with an optional number of
# knots to use for the GAM basis.
# By default, the number of knots is floor(n / 4), where n is
# the number of unique years for the species.
model_data <- prepare_data(strat_data = strat_data,
                           species_to_run = "Pacific Wren",
                           model = "gam",
                           n_knots = 9)
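The default knot rule described in the comment above can be checked by hand. The helper `default_n_knots()` below is hypothetical, not a bbsBayes function; it simply restates the documented rule (floor of the number of unique years divided by 4) in base R, assuming the rule is applied to the years retained for the species.

```r
# Hypothetical helper restating the documented default:
# floor(number of unique years / 4)
default_n_knots <- function(years) {
  floor(length(unique(years)) / 4)
}

# 10 unique years (2009 to 2018) give floor(10 / 4) = 2 knots
knots_short <- default_n_knots(2009:2018)

# 53 unique years (1966 to 2018) give floor(53 / 4) = 13 knots;
# the repeated 2018 does not change the count of unique years
knots_long <- default_n_knots(c(1966:2018, 2018))

print(knots_short)
print(knots_long)
```

With a short window such as 2009 to 2018 the default leaves very few knots, which is one reason a user might prefer to set n_knots explicitly.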



[Package bbsBayes version 2.5.2 Index]