dht2 {Distance} | R Documentation |
Abundance estimation for distance sampling models
Description
Once a detection function is fitted to data, this function can be used to compute abundance estimates over required areas. The function also allows for stratification and variance estimation via various schemes (see below).
Usage
dht2(
ddf,
observations = NULL,
transects = NULL,
geo_strat = NULL,
flatfile = NULL,
strat_formula,
convert_units = 1,
er_est = c("R2", "P2"),
multipliers = NULL,
sample_fraction = 1,
ci_width = 0.95,
innes = FALSE,
stratification = "geographical",
total_area = NULL,
binomial_var = FALSE
)
Arguments
ddf |
model fitted by |
observations |
|
transects |
|
geo_strat |
|
flatfile |
data in the flatfile format, see |
strat_formula |
a formula giving the stratification structure (see "Stratification" below). Currently only one level of stratification is supported. |
convert_units |
conversion factor between units for the distances,
effort and area. See "Units" below. Can supply one per detection function in
|
er_est |
encounter rate variance estimator to be used. See "Variance"
below and |
multipliers |
|
sample_fraction |
proportion of the transect covered (e.g., 0.5 for
one-sided line transects). May be specified as either a single number or a
|
ci_width |
for use with confidence interval calculation (defined as 1-alpha, so the default 0.95 will give a 95% confidence interval). |
innes |
logical flag for computing encounter rate variance using either
the method of Innes et al (2002) where estimated abundance per transect
divided by effort is used as the encounter rate, vs. (when |
stratification |
what do strata represent, see "Stratification" below. |
total_area |
for options |
binomial_var |
if we wish to estimate abundance for the covered area
only (i.e., study area = surveyed area) then this must be set to be
|
Value
a data.frame (of class dht_result for pretty printing) with estimates and attributes containing additional information, see "Output" for information on column names.
Data
The data format allows for complex stratification schemes to be set up. Three objects are always required:
- ddf is the detection function (see ds or ddf for information on the format of their inputs).
- observations has one row per observation and links the observations to the transects. Required columns:
  - object (unique ID for the observation, which must match with the data in the detection function)
  - Sample.Label (unique ID for the transect).
  Additional columns for strata which are not included in the detection function are required (stratification covariates that are included in the detection function do not need to be included here). The important case here is group size, which must have column name size (but does not need to be in the detection function).
- transects has one row per sample (point or line transect). At least one row is required. Required columns: Sample.Label (unique ID for the transect), Effort (line length for line transects, number of visits for point transects), if there is more than one geographical stratum.
With only these three arguments, abundance can only be calculated for the covered area. Including additional information on the area we wish to extrapolate to (i.e., the study area), we can obtain abundance estimates:
- geo_strat has one row for each stratum that we wish to estimate abundance for. For abundance in the study area, at least one row is required. Required columns: Area (the area of that stratum). If there is >1 row, then additional columns, named in strat_formula, are required.
Note that if the Area column is set to all 0, then only density estimates will be returned.
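As a rough sketch of the tables described above, the following base R code builds toy observations, transects and geo_strat data frames and runs two simple consistency checks. Every value, the transect labels and the use of Region.Label as the stratum column are invented for illustration. The commented-out call assumes a fitted detection function stored in a hypothetical object named fit.

```r
# Toy inputs; all values are made up for illustration only.
transects <- data.frame(
  Sample.Label = c("T1", "T2", "T3"),
  Effort       = c(10, 12, 8),                 # line lengths
  Region.Label = c("North", "North", "South")  # assumed stratum column
)

observations <- data.frame(
  object       = 1:5,                          # must match IDs used in the detection function data
  Sample.Label = c("T1", "T1", "T2", "T3", "T3"),
  size         = c(1, 3, 2, 1, 4)              # group size
)

geo_strat <- data.frame(
  Region.Label = c("North", "South"),
  Area         = c(500, 300)
)

# Every observation should link to a known transect,
# and every transect to a known stratum.
stopifnot(all(observations$Sample.Label %in% transects$Sample.Label))
stopifnot(all(transects$Region.Label %in% geo_strat$Region.Label))

# With a fitted detection function `fit` (hypothetical name), a call might look like:
# dht2(fit, observations = observations, transects = transects,
#      geo_strat = geo_strat, strat_formula = ~Region.Label)
```

Checking the links between tables before calling dht2 tends to catch mislabelled transects early, since the tables are joined on these ID columns.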
Multipliers
It is often the case that we cannot measure distances to individuals or groups directly, but instead need to estimate distances to something they produce (e.g., for whales, their blows; for elephants their dung) – this is referred to as indirect sampling. We may need to use estimates of production rate and decay rate for these estimates (in the case of dung or nests) or just production rates (in the case of songbird calls or whale blows). We refer to these conversions between "number of cues" and "number of animals" as "multipliers".
The multipliers argument is a list with 2 possible elements (creation and decay). Each element is a data.frame and must have at least a column named rate, which abundance estimates will be divided by (the term "multiplier" is a misnomer, but kept for compatibility with Distance for Windows). Additional columns can be added to give the standard error and degrees of freedom for the rate, if known, as SE and df, respectively. You can use a multirow data.frame to have different rates for different geographical areas (for example). In this case the rows need to have a column (or columns) to merge with the data (for example Region.Label).
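A minimal sketch of what such a list might look like, together with the division it implies. The rates, standard errors and the cue density below are invented numbers, and the arithmetic only illustrates "divide by the rate"; it is not a reproduction of the function's internals.

```r
# Hypothetical multipliers; all numbers are invented for illustration.
mult <- list(
  creation = data.frame(rate = 25,  SE = 2.5),  # e.g. cues produced per animal per day
  decay    = data.frame(rate = 160, SE = 20)    # e.g. mean days for a cue to decay
)

# Estimates are divided by each rate, so a (made-up) cue density of
# 40000 cues per unit area converts to animals per unit area as:
cue_density    <- 40000
animal_density <- cue_density / (mult$creation$rate * mult$decay$rate)

# A multirow version with region-specific rates; Region.Label is the
# column used to merge with the data.
creation_by_region <- data.frame(
  Region.Label = c("North", "South"),
  rate         = c(25, 30)
)
```

Omitting the SE column corresponds to treating a rate as known without error.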
Stratification
The strat_formula argument is used to specify a column to use to stratify the results, using the form ~column.name where column.name is the column name you wish to use.
The stratification argument is used to specify which of four types of stratification is intended:
- "geographical" if each stratum represents a different geographical area and you want the total over all the areas
- "effort_sum" if your strata are in fact from replicate surveys (perhaps using different designs) but you don't have many replicates and/or want an estimate of "average variance"
- "replicate" if you have replicate surveys but have many of them, this calculates the average abundance and the variance between those many surveys (think of a population of surveys)
- "object" if the stratification is really about the type of object observed, for example sex, species or life stage and what you want is the total number of individuals across all the classes of objects. For example, if you have stratified by sex and have males and females, but also want a total number of animals, you should use this option.
A simple example of using stratification="geographical" is given below. Further examples can be found at http://examples.distancesampling.org/ (see, e.g., the deer pellet survey).
Variance
Variance in the estimated abundance comes from multiple sources. Depending on the data used to fit the model and estimate abundance, different components will be included in the estimated variances. In the simplest case, the detection function and encounter rate variance need to be combined. If group size varies, then this too must be included. Finally, if multipliers are used and have corresponding standard errors given, these are also included. Variances are combined by assuming independence between the measures and adding variances. A brief summary of how each component is calculated is given here, though see references for more details.
- detection function: variance from the detection function parameters is transformed to variance about the abundance via a sandwich estimator (see e.g., Appendix C of Borchers et al (2002)).
- encounter rate: for strata with >1 transect in them, the encounter rate estimators given in Fewster et al (2009) can be specified via the er_est argument. If the argument innes=TRUE then calculations use the estimated number of individuals in the transect (rather than the observed), which was given by Innes et al (2002) as a superior estimator. When there is only one transect in a stratum, Poisson variance is assumed. Information on the Fewster encounter rate variance estimators is given in varn.
- group size: if objects occur in groups (sometimes "clusters"), then the empirical variance of the group sizes is added to the total variance.
- multipliers: if multipliers with standard errors are given, their corresponding variances are added. If no standard errors are supplied, then their contribution to variance is assumed to be 0.
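The base R sketch below illustrates the idea for a single stratum of line transects. It rests on two assumptions that are not spelled out above: that the encounter rate variance takes the form of the R2 estimator from Fewster et al (2009), and that independent components are combined on the relative scale by summing squared coefficients of variation (a delta-method approximation). The counts, line lengths and detection function CV are invented.

```r
# Invented per-transect data for one stratum
n_i <- c(4, 7, 2, 5)       # detections per transect
l_i <- c(10, 12, 8, 10)    # transect lengths
k   <- length(n_i)         # number of transects
n   <- sum(n_i)
L   <- sum(l_i)

# Encounter rate and its variance (assumed R2 form, Fewster et al 2009)
er     <- n / L
var_er <- k / (L^2 * (k - 1)) * sum(l_i^2 * (n_i / l_i - er)^2)
cv_er  <- sqrt(var_er) / er

# Assumed CV contributed by the detection function (made-up value)
cv_detection <- 0.10

# Combine independent components by summing squared CVs (assumption)
cv_total <- sqrt(cv_er^2 + cv_detection^2)

# Percentage of the total variance attributed to each component
pct <- 100 * c(Detection = cv_detection^2, ER = cv_er^2) / cv_total^2
```

Group size and multiplier components, where present, would enter the sum in the same way under these assumptions.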
Units
It is often the case that distances are recorded in one convenient set of units, whereas the study area and effort are recorded in some other units. To ensure that the results from this function are in the expected units, we use the convert_units argument to supply a single number to convert the units of the covered area to those of the study/stratification area (results are always returned in the units of the study area). For line transects, the covered area is calculated as 2 * width * length where width is the effective (half)width of the transect (often referred to as w in the literature) and length is the line length (referred to as L). If width and length are measured in kilometres and the study area in square kilometres, then all is fine and convert_units is 1 (and can be ignored). If, for example, line length and distances were measured in metres, we instead need to convert these to kilometres, by dividing by 1000 for each of distance and length, hence convert_units=1e-6. For point transects, this is slightly easier as we only have the radius and study area to consider, so the conversion is just such that the units of the truncation radius are the square root of the study area units.
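The line transect case can be checked with a few lines of arithmetic. This sketch uses invented values for the width and total line length, both in metres, and a study area assumed to be in square kilometres.

```r
# Hypothetical survey dimensions, both recorded in metres
w_m <- 1500     # transect (half)width
L_m <- 20000    # total line length

# Covered area in square metres: 2 * width * length
covered_m2 <- 2 * w_m * L_m

# Study area is in square kilometres, so each of width and length is
# divided by 1000, giving a combined factor of 1e-6
convert_units <- 1e-6
covered_km2   <- covered_m2 * convert_units

# Same calculation done directly in kilometres, for comparison
covered_km2_direct <- 2 * (w_m / 1000) * (L_m / 1000)
```

Both routes give the same covered area, which is the check that convert_units has been chosen correctly.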
Output
On printing the output from a call to dht2, three tables are produced. Below is a guide to the output column names, per table.
Summary statistics table:
- Region.Label: Stratum name (this first column name depends on the formula supplied)
- Area: Size of stratum
- CoveredArea: Surveyed area in stratum (2 x w x L)
- Effort: Transect length or number of point visits per stratum
- n: Number of detections
- k: Number of replicate transects
- ER: Encounter rate
- se.ER: Standard error of encounter rate
- cv.ER: Coefficient of variation of encounter rate
Abundance or density estimates table:
- Region.Label: As above
- Estimate: Point estimate of abundance or density
- se: Standard error
- cv: Coefficient of variation
- LCI: Lower confidence bound
- UCI: Upper confidence bound
- df: Degrees of freedom used for confidence interval computation
Components percentage of variance:
- Region.Label: As above
- Detection: Percent of variance in abundance/density associated with detection function uncertainty
- ER: Percent of variance in abundance/density associated with variability in encounter rate
- Multipliers: Percent of variance in abundance/density associated with uncertainty in multipliers
References
Borchers, D.L., S.T. Buckland, P.W. Goedhart, E.D. Clarke, and S.L. Hedley. 1998. Horvitz-Thompson estimators for double-platform line transect surveys. Biometrics 54: 1221-1237.
Borchers, D.L., S.T. Buckland, and W. Zucchini. 2002. Estimating Animal Abundance: Closed Populations. Statistics for Biology and Health. Springer London.
Buckland, S.T., E.A. Rexstad, T.A. Marques, and C.S. Oedekoven. 2015. Distance Sampling: Methods and Applications. Methods in Statistical Ecology. Springer International Publishing.
Buckland, S.T., D.R. Anderson, K. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. 2001. Introduction to Distance Sampling: Estimating Abundance of Biological Populations. Oxford University Press.
Innes, S., M.P. Heide-Jorgensen, J.L. Laake, K.L. Laidre, H.J. Cleator, P. Richard, and R.E.A. Stewart. 2002. Surveys of belugas and narwhals in the Canadian high arctic in 1996. NAMMCO Scientific Publications 4: 169-190.
Examples
## Not run:
# example of simple geographical stratification
# minke whale data, with 2 strata: North and South
data(minke)
# first fitting the detection function
minke_df <- ds(minke, truncation=1.5, adjustment=NULL)
# now estimate abundance using dht2
# stratum labels are in the Region.Label column
minke_dht2 <- dht2(minke_df, flatfile=minke, stratification="geographical",
                   strat_formula=~Region.Label)
# could compare this to minke_df$dht and see the same results
minke_dht2
# can alternatively report density
print(minke_dht2, report="density")
## End(Not run)