| format_data {popbayes} | R Documentation |
Format count series
Description
This function provides an easy way to get count series ready to be analyzed
by the package popbayes. It must be used prior to all other functions.
This function formats the count series (passed through the argument
data) by selecting and renaming columns, checking columns format and
content, and removing missing data (if na_rm = TRUE). It converts the
original data frame into a list of count series that will be analyzed later
by the function fit_trend() to estimate population trends.
To be usable for the estimation of population trends, counts must be
accompanied by information on precision. The population trend model requires
a 95% confidence interval (CI).
If estimates are total counts or guesstimates, this function will construct
boundaries of the 95% CI by applying the rules set out in
https://frbcesab.github.io/popbayes/articles/popbayes.html.
If counts were estimated by a sampling method, the user needs to specify a
measure of precision. Precision is preferably provided in the form of a 95%
CI by means of two fields: lower_ci and upper_ci. It may also be given
in the form of a standard deviation (sd), a variance (var), or a
coefficient of variation (cv). If the fields lower_ci and upper_ci are
both absent (or NA), fields sd, var, and cv are examined in this
order. When one is found valid (no missing value), a 95% CI is derived
assuming a normal distribution.
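As a rough illustration of the derivation described above, the following sketch builds a normal-approximation 95% CI from whichever precision measure is available, examining sd, var, and cv in that order. The helper name derive_ci() is hypothetical, cv is assumed to be expressed as a proportion of the count, and the package's actual implementation may differ (for instance in how it handles negative lower bounds).

```r
# Hypothetical helper (not part of popbayes): derive a 95% CI from sd, var, or cv.
derive_ci <- function(count, sd = NA, var = NA, cv = NA) {
  # Examine the precision measures in the documented order: sd, var, cv
  if (is.na(sd) && !is.na(var)) sd <- sqrt(var)
  if (is.na(sd) && !is.na(cv))  sd <- cv * count  # assumption: cv is a proportion
  if (is.na(sd)) stop("No valid precision measure supplied.")

  z <- qnorm(0.975)  # normal quantile for a two-sided 95% interval
  c(lower_ci = count - z * sd, upper_ci = count + z * sd)
}

# These three calls describe the same precision (sd = 50),
# so they return the same interval.
derive_ci(count = 500, sd = 50)
derive_ci(count = 500, var = 2500)
derive_ci(count = 500, cv = 0.10)
```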
The field stat_method must be present in data to indicate
whether counts are total counts ('T'), sampling estimates ('S'), or
guesstimates ('X').
If a series mixes aerial and ground counts, a field field_method must
also be present and must contain either 'A' (aerial counts), or 'G'
(ground counts). As all counts must eventually refer to the same field
method for a correct estimation of trend, a conversion will be performed to
homogenize counts. This conversion is based on a preferred field method
and a conversion factor both specific to a species/category.
The preferred field method specifies the conversion direction. The
conversion factor is the multiplicative factor that must be applied to an
aerial count to get an equivalent ground count (note that if the preferred
field method is 'A', ground counts will be divided by the conversion
factor to get the equivalent aerial count).
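The conversion direction described above can be sketched as follows. This is only an illustration: convert_counts() is a hypothetical helper, not a popbayes function, and the conversion factor of 2 is an arbitrary value chosen for the example.

```r
# Hypothetical helper (not part of popbayes): homogenize counts to one field method.
convert_counts <- function(count, field_method, pref_field_method, conversion_A2G) {
  stopifnot(pref_field_method %in% c("A", "G"))
  converted <- count
  if (pref_field_method == "G") {
    # Aerial counts are multiplied by the factor to get ground equivalents
    is_aerial <- field_method == "A"
    converted[is_aerial] <- count[is_aerial] * conversion_A2G
  } else {
    # Ground counts are divided by the factor to get aerial equivalents
    is_ground <- field_method == "G"
    converted[is_ground] <- count[is_ground] / conversion_A2G
  }
  converted
}

counts  <- c(100, 250)
methods <- c("A", "G")
convert_counts(counts, methods, pref_field_method = "G", conversion_A2G = 2)  # 200 250
convert_counts(counts, methods, pref_field_method = "A", conversion_A2G = 2)  # 100 125
```

Counts already obtained with the preferred field method are left unchanged in both directions.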
The argument rmax represents the maximum change in log population size
between two dates (i.e. the relative rate of increase). It will be used
by fit_trend() but must be provided in this function.
These three parameters, named pref_field_method, conversion_A2G, and
rmax, can be present in data or in a second data.frame
(passed through the argument info).
Alternatively, the package popbayes provides their values for some
African large mammals.
Note: If the field field_method is absent in data, counts are
assumed to be obtained with one field method.
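To make the expected input more concrete, here is a minimal sketch of a data frame for a single series, using the default column names shown in the Usage section. The site name, species name, and all values are invented for illustration, and leaving lower_ci and upper_ci as NA for total counts and guesstimates is an assumption based on the description above.

```r
# Invented toy data: one series with four dates and mixed stat methods
counts <- data.frame(
  location    = "Site A",
  species     = "Species 1",
  date        = c(2000, 2004, 2008, 2012),
  count       = c(120, 150, 140, 180),
  stat_method = c("T", "S", "S", "X"),
  lower_ci    = c(NA, 110, 100, NA),  # provided for sampling estimates only
  upper_ci    = c(NA, 190, 180, NA)
)

# Check that stat_method only uses the documented codes
valid_codes <- c("T", "S", "X")
all(counts$stat_method %in% valid_codes)

# Sampling estimates should come with a measure of precision
sampled <- counts$stat_method == "S"
all(!is.na(counts$lower_ci[sampled]) & !is.na(counts$upper_ci[sampled]))
```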
Usage
format_data(
  data,
  info = NULL,
  date = "date",
  count = "count",
  location = "location",
  species = "species",
  stat_method = "stat_method",
  lower_ci = "lower_ci",
  upper_ci = "upper_ci",
  sd = NULL,
  var = NULL,
  cv = NULL,
  field_method = NULL,
  pref_field_method = NULL,
  conversion_A2G = NULL,
  rmax = NULL,
  path = ".",
  na_rm = FALSE
)
Arguments
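data |
a data.frame containing the count series to format. If individual counts were estimated by sampling, additional column(s) providing a measure of precision are also required (e.g. lower_ci and upper_ci, sd, var, or cv). If the individuals were counted by different methods, an additional field field_method is also required.
Other fields (pref_field_method, conversion_A2G, and rmax) can be present either in data or in info. |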
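info |
(optional) a data.frame providing the values of pref_field_method, conversion_A2G, and rmax when they are not present in data. |
date |
a character string. The name of the column in data containing the dates. |
count |
a character string. The name of the column in data containing the counts. |
location |
a character string. The name of the column in data containing the site of each series. |
species |
a character string. The name of the column in data containing the species of each series. |
stat_method |
a character string. The name of the column in data containing the stat method of each count ('T', 'S', or 'X'). |
lower_ci |
(optional) a character string. The name of the column in data containing the lower boundary of the 95% CI. |
upper_ci |
(optional) a character string. The name of the column in data containing the upper boundary of the 95% CI. |
sd |
(optional) a character string. The name of the column in data containing the standard deviation. |
var |
(optional) a character string. The name of the column in data containing the variance. |
cv |
(optional) a character string. The name of the column in data containing the coefficient of variation. |
field_method |
(optional) a character string. The name of the column in data containing the field method of each count ('A' or 'G'). |
pref_field_method |
(optional) a character string. The name of the column containing the preferred field method of the species ('A' or 'G'). |
conversion_A2G |
(optional) a character string. The name of the column containing the conversion factor of the species. |
rmax |
(optional) a character string. The name of the column containing the maximum change in log population size between two dates. |
path |
a character string. The path to a directory (defaults to the current directory). |
na_rm |
a logical. If TRUE, missing data are removed. |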
Value
An n-element list (where n is the number of count series). The
name of each element of this list is a combination of location and
species. Each element of the list is a list with the following content:
- location: a character string. The name of the series site.
- species: a character string. The name of the series species.
- date: a numerical vector. The sequence of dates of the series.
- n_dates: an integer. The number of unique dates.
- stat_methods: a character vector. The different stat methods of the series.
- field_methods: (optional) a character vector. The different field methods of the series.
- pref_field_method: (optional) a character string. The preferred field method of the species ('A' or 'G').
- conversion_A2G: (optional) a numeric. The conversion factor of the species used to convert counts to its preferred field method.
- rmax: a numeric. The maximum population growth rate of the species.
- data_original: a data.frame. Original data of the series with renamed columns. Some rows may have been deleted (if na_rm = TRUE).
- data_converted: a data.frame. Data containing computed boundaries of the 95% CI (lower_ci_conv and upper_ci_conv). If counts have been obtained by different field methods, it also contains converted counts (count_conv) based on the preferred field method and conversion factor of the species. This data.frame will be used by the function fit_trend() to fit population models.
Note: Some original series can be discarded if one of these two conditions is met: 1) the series contains only zero counts, or 2) the series contains fewer than 4 dates.
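The two discard conditions in the note can be expressed as a simple check. This is a sketch only: keep_series() is a hypothetical helper, not a popbayes function, and counting unique dates is an assumption about how the threshold of 4 dates is applied.

```r
# Hypothetical helper (not part of popbayes): TRUE if a series would be kept.
keep_series <- function(dates, counts) {
  enough_dates  <- length(unique(dates)) >= 4     # at least 4 dates
  some_non_zero <- any(counts > 0, na.rm = TRUE)  # not only zero counts
  enough_dates && some_non_zero
}

keep_series(c(2000, 2001, 2002), c(5, 8, 3))           # FALSE: fewer than 4 dates
keep_series(c(2000, 2001, 2002, 2003), c(0, 0, 0, 0))  # FALSE: only zero counts
keep_series(c(2000, 2001, 2002, 2003), c(0, 4, 0, 9))  # TRUE
```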
Examples
## Load Garamba raw dataset ----
file_path <- system.file("extdata", "garamba_survey.csv",
                         package = "popbayes")
garamba <- read.csv(file = file_path)
## Create temporary folder ----
temp_path <- tempdir()
## Format dataset ----
garamba_formatted <- popbayes::format_data(
  data = garamba,
  path = temp_path,
  field_method = "field_method",
  pref_field_method = "pref_field_method",
  conversion_A2G = "conversion_A2G",
  rmax = "rmax")
## Number of count series ----
length(garamba_formatted)
## Retrieve count series names ----
popbayes::list_series(path = temp_path)
## Print content of the first count series ----
names(garamba_formatted[[1]])
## Print original data ----
garamba_formatted[[1]]$"data_original"
## Print converted data ----
garamba_formatted[[1]]$"data_converted"