format_data {popbayes} | R Documentation |
Format count series
Description
This function provides an easy way to get count series ready to be analyzed
by the package popbayes. It must be used prior to all other functions.
This function formats the count series (passed through the argument data) by
selecting and renaming columns, checking column formats and content, and
removing missing data (if na_rm = TRUE). It converts the original data frame
into a list of count series that will be analyzed later by the function
fit_trend() to estimate population trends.
To be usable for the estimation of population trends, counts must be
accompanied by information on precision. The population trend model requires
a 95% confidence interval (CI).
If estimates are total counts or guesstimates, this function will construct
boundaries of the 95% CI by applying the rules set out in
https://frbcesab.github.io/popbayes/articles/popbayes.html.
If counts were estimated by a sampling method, the user needs to specify a
measure of precision. Precision is preferably provided in the form of a 95%
CI by means of two fields: lower_ci and upper_ci. It may also be given in the
form of a standard deviation (sd), a variance (var), or a coefficient of
variation (cv). If the fields lower_ci and upper_ci are both absent (or NA),
the fields sd, var, and cv are examined in this order. When one is found
valid (no missing value), a 95% CI is derived assuming a normal distribution.
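The derivation of a 95% CI from sd, var, or cv can be sketched as follows. This is a minimal illustration, not the package's internal code: the helper derive_ci() is hypothetical, cv is assumed to be expressed as a proportion (sd divided by the count), and the package may treat edge cases (for example a negative lower bound) differently.

```r
# Hypothetical helper (not part of popbayes): derive a 95% CI from a
# precision measure, assuming a normal distribution.
derive_ci <- function(count, sd = NA, var = NA, cv = NA) {
  # Examine sd, then var, then cv, and use the first non-missing one
  if (is.na(sd) && !is.na(var)) sd <- sqrt(var)
  if (is.na(sd) && !is.na(cv))  sd <- cv * count  # assumes cv = sd / count
  if (is.na(sd)) stop("No valid precision measure supplied.")

  z <- qnorm(0.975)  # normal quantile for a two-sided 95% interval
  c(lower_ci = count - z * sd, upper_ci = count + z * sd)
}

# The three calls below describe the same precision and give the same CI
derive_ci(count = 1000, sd = 100)
derive_ci(count = 1000, var = 10000)
derive_ci(count = 1000, cv = 0.1)
```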
The field stat_method must be present in data to indicate whether counts are
total counts ('T'), sampling estimates ('S'), or guesstimates ('X').
If a series mixes aerial and ground counts, a field field_method must also be
present and must contain either 'A' (aerial counts) or 'G' (ground counts).
As all counts must eventually refer to the same field method for a correct
estimation of trend, a conversion will be performed to homogenize counts.
This conversion is based on a preferred field method and a conversion factor,
both specific to a species/category.
The preferred field method specifies the conversion direction. The
conversion factor is the multiplicative factor that must be applied to an
aerial count to get an equivalent ground count (note that if the preferred
field method is 'A', ground counts will be divided by the conversion factor
to get the equivalent aerial count).
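The conversion rule described above can be sketched as follows. The helper convert_count() is hypothetical (it is not a popbayes function) and the factor 2.5 is an arbitrary illustrative value, not one taken from the package.

```r
# Hypothetical helper (not part of popbayes): convert a count to the
# preferred field method using the aerial-to-ground conversion factor.
convert_count <- function(count, field_method, pref_field_method,
                          conversion_A2G) {
  stopifnot(field_method %in% c("A", "G"),
            pref_field_method %in% c("A", "G"))

  if (field_method == pref_field_method) {
    return(count)                    # already in the preferred method
  }
  if (pref_field_method == "G") {
    return(count * conversion_A2G)   # aerial -> ground: multiply
  }
  count / conversion_A2G             # ground -> aerial: divide
}

convert_count(100, "A", "G", conversion_A2G = 2.5)  # 250
convert_count(250, "G", "A", conversion_A2G = 2.5)  # 100
```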
The argument rmax represents the maximum change in log population size
between two dates (i.e. the relative rate of increase). It will be used by
fit_trend() but must be provided in this function.
These three parameters, named pref_field_method, conversion_A2G, and rmax,
can be present in data or in a second data.frame (passed through the
argument info). Alternatively, the package popbayes provides their values
for some African large mammals.
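As a rough illustration of the second option, the sketch below builds a per-species table and attaches it to a table of counts. The column layout of info (one row per species, keyed by a species column) is an assumption made for this example, and the species names and numeric values are invented.

```r
# Illustrative values only: species names and numbers are made up.
info <- data.frame(
  species           = c("species_a", "species_b"),
  pref_field_method = c("A", "G"),
  conversion_A2G    = c(2.3, 1.5),
  rmax              = c(0.15, 0.30),
  stringsAsFactors  = FALSE
)

counts <- data.frame(
  species = c("species_a", "species_a", "species_b"),
  date    = c(2001, 2005, 2001),
  count   = c(120, 150, 40),
  stringsAsFactors = FALSE
)

# Attaching the per-species parameters to each count row carries the same
# information as storing the three columns directly in the counts table.
counts_with_info <- merge(counts, info, by = "species")
counts_with_info
```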
Note: If the field field_method is absent in data, counts are assumed to be
obtained with one field method.
Usage
format_data(
  data,
  info = NULL,
  date = "date",
  count = "count",
  location = "location",
  species = "species",
  stat_method = "stat_method",
  lower_ci = "lower_ci",
  upper_ci = "upper_ci",
  sd = NULL,
  var = NULL,
  cv = NULL,
  field_method = NULL,
  pref_field_method = NULL,
  conversion_A2G = NULL,
  rmax = NULL,
  path = ".",
  na_rm = FALSE
)
Arguments
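data | a data.frame containing the count series to format. If individual counts were estimated by sampling, additional column(s) providing a measure of precision are also required (e.g. lower_ci and upper_ci, or sd, var, or cv). If the individuals were counted by different methods, an additional field field_method is also required. Other fields can be present either in data or in info.
info | (optional) a data.frame providing pref_field_method, conversion_A2G, and rmax (see Description). Default: NULL.
date | a character string. The name of the column in data containing the dates. Default: "date".
count | a character string. The name of the column in data containing the counts. Default: "count".
location | a character string. The name of the column in data containing the site of the series. Default: "location".
species | a character string. The name of the column in data containing the species. Default: "species".
stat_method | a character string. The name of the column in data containing the stat method ('T', 'S', or 'X'). Default: "stat_method".
lower_ci | (optional) a character string. The name of the column in data containing the lower boundary of the 95% CI. Default: "lower_ci".
upper_ci | (optional) a character string. The name of the column in data containing the upper boundary of the 95% CI. Default: "upper_ci".
sd | (optional) a character string. The name of the column in data containing the standard deviation. Default: NULL.
var | (optional) a character string. The name of the column in data containing the variance. Default: NULL.
cv | (optional) a character string. The name of the column in data containing the coefficient of variation. Default: NULL.
field_method | (optional) a character string. The name of the column in data containing the field method ('A' or 'G'). Default: NULL.
pref_field_method | (optional) a character string. The name of the column in data containing the preferred field method of the species. Default: NULL.
conversion_A2G | (optional) a character string. The name of the column in data containing the aerial-to-ground conversion factor. Default: NULL.
rmax | (optional) a character string. The name of the column in data containing rmax. Default: NULL.
path | a character string. The directory used to store the formatted count series. Default: "." (the current directory).
na_rm | a logical. If TRUE, missing data are removed. Default: FALSE.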
Value
A list of n elements (where n is the number of count series). The name of
each element of this list is a combination of location and species. Each
element of the list is a list with the following content:
- location: a character string. The name of the series site.
- species: a character string. The name of the series species.
- date: a numerical vector. The sequence of dates of the series.
- n_dates: an integer. The number of unique dates.
- stat_methods: a character vector. The different stat methods of the series.
- field_methods: (optional) a character vector. The different field methods of the series.
- pref_field_method: (optional) a character string. The preferred field method of the species ('A' or 'G').
- conversion_A2G: (optional) a numeric. The conversion factor of the species used to convert counts to its preferred field method.
- rmax: a numeric. The maximum population growth rate of the species.
- data_original: a data.frame. Original data of the series with renamed columns. Some rows may have been deleted (if na_rm = TRUE).
- data_converted: a data.frame. Data containing computed boundaries of the 95% CI (lower_ci_conv and upper_ci_conv). If counts have been obtained by different field methods, it also contains converted counts (count_conv) based on the preferred field method and conversion factor of the species. This data.frame will be used by the function fit_trend() to fit population models.
Note: Some original series can be discarded if either of these two conditions is met: 1) the series contains only zero counts, or 2) the series contains fewer than 4 dates.
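The two discard conditions can be sketched as a simple check. The helper is_discarded() is hypothetical (it is not a popbayes function), and the sketch assumes a series stored as a data.frame with date and count columns and no missing values.

```r
# Hypothetical helper (not part of popbayes): TRUE when a series would be
# discarded under the two conditions stated in the note.
is_discarded <- function(series) {
  only_zeros    <- all(series$count == 0)
  too_few_dates <- length(unique(series$date)) < 4
  only_zeros || too_few_dates
}

short_series <- data.frame(date = c(2000, 2005, 2010), count = c(12, 15, 9))
zero_series  <- data.frame(date = 2000:2004, count = rep(0, 5))
ok_series    <- data.frame(date = 2000:2004, count = c(12, 15, 9, 20, 18))

is_discarded(short_series)  # TRUE: only 3 dates
is_discarded(zero_series)   # TRUE: only zero counts
is_discarded(ok_series)     # FALSE
```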
Examples
## Load Garamba raw dataset ----
file_path <- system.file("extdata", "garamba_survey.csv",
                         package = "popbayes")
garamba <- read.csv(file = file_path)
## Create temporary folder ----
temp_path <- tempdir()
## Format dataset ----
garamba_formatted <- popbayes::format_data(
  data              = garamba,
  path              = temp_path,
  field_method      = "field_method",
  pref_field_method = "pref_field_method",
  conversion_A2G    = "conversion_A2G",
  rmax              = "rmax")
## Number of count series ----
length(garamba_formatted)
## Retrieve count series names ----
popbayes::list_series(path = temp_path)
## Print content of the first count series ----
names(garamba_formatted[[1]])
## Print original data ----
garamba_formatted[[1]]$"data_original"
## Print converted data ----
garamba_formatted[[1]]$"data_converted"