ds {Distance}  R Documentation 
This function fits detection functions to line or point transect data and
then (provided that survey information is supplied) calculates abundance and
density estimates. The examples below illustrate some basic types of
analysis using ds().
ds(
  data,
  truncation = ifelse(is.null(cutpoints),
                      ifelse(is.null(data$distend),
                             max(data$distance),
                             max(data$distend)),
                      max(cutpoints)),
  transect = "line",
  formula = ~1,
  key = c("hn", "hr", "unif"),
  adjustment = c("cos", "herm", "poly"),
  order = NULL,
  scale = c("width", "scale"),
  cutpoints = NULL,
  dht.group = FALSE,
  monotonicity = ifelse(formula == ~1, "strict", "none"),
  region.table = NULL,
  sample.table = NULL,
  obs.table = NULL,
  convert.units = 1,
  er.var = ifelse(transect == "line", "R2", "P3"),
  method = "nlminb",
  quiet = FALSE,
  debug.level = 0,
  initial.values = NULL,
  max.adjustments = 5,
  er.method = 2,
  dht.se = TRUE
)
data 
a data.frame containing at least a column named distance.
truncation 
either truncation distance (numeric, e.g. 5) or percentage
(as a string, e.g. "15%"). Can be supplied as a list with elements left
and right if left truncation is required.
transect 
indicates transect type "line" (default) or "point". 
formula 
formula for the scale parameter. For a CDS analysis leave
this as its default, ~1.
key 
key function to use; "hn" (half-normal, the default), "hr" (hazard-rate)
or "unif" (uniform).
adjustment 
adjustment terms to use; "cos" (cosine, the default), "herm" (Hermite
polynomial) or "poly" (simple polynomial).
order 
orders of the adjustment terms to fit (as a vector/scalar), the
default value (NULL) selects the adjustment terms by AIC, up to
max.adjustments terms.
scale 
the scale by which the distances in the adjustment terms are
divided. Defaults to "width".
cutpoints 
if the data are binned, this vector gives the cutpoints of
the bins. Ensure that the first element is 0 (or the left truncation
distance) and the last is the distance to the end of the furthest bin.
(Default NULL, no binning.)
dht.group 
should density/abundance estimates consider all groups to
be size 1 (abundance of groups)? The default, FALSE, takes group size into
account (abundance of individuals).
monotonicity 
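should the detection function be constrained for
monotonicity weakly ("weak"), strictly ("strict") or not at all ("none" or
FALSE). The default is "strict" for models without covariates and "none"
otherwise.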
region.table 

sample.table 

obs.table 

convert.units 
conversion between units for abundance estimation, see "Units", below. (Defaults to 1, implying all of the units are "correct" already.) 
er.var 
encounter rate variance estimator to use when abundance
estimates are required. Defaults to "R2" for line transects and "P3" for
point transects. See 
method 
optimization method to use (any method usable by

quiet 
suppress non-essential messages (useful for bootstraps etc).
Default value FALSE.
debug.level 
print debugging output. 
initial.values 
a 
max.adjustments 
maximum number of adjustments to try (default 5) only
used when order=NULL.
er.method 
encounter rate variance calculation: default = 2 gives the method of Innes et al, using expected counts in the encounter rate. Setting to 1 gives observed counts (which matches Distance for Windows) and 0 uses binomial variance (only useful in the rare situation where study area = surveyed area). See 
dht.se 
should uncertainty be calculated when using 
a list with elements:
ddf
a detection function model object.
dht
abundance/density information (if survey region data was supplied,
else NULL)
If abundance estimates are required then the data.frames region.table and
sample.table must be supplied. If data does not contain the columns
Region.Label and Sample.Label then the data.frame obs.table must also be
supplied. Note that stratification only applies to abundance estimates and
not at the detection function level.
For more advanced abundance/density estimation please see the dht and dht2
functions.
Examples of distance sampling analyses are available at http://examples.distancesampling.org/.
Hints and tips on fitting (particularly optimisation issues) are on the
mrds-opt manual page.
Note that if the data contains a column named size, cluster size will be
estimated and density/abundance will be based on a clustered analysis of
the data. Setting this column to be NULL will perform a non-clustered
analysis (for example if "size" means something else in your dataset).
The right truncation point is by default set to be the largest observed
distance or bin end point. This default will not be appropriate for all
data and can often be the cause of model convergence failures. It is
recommended that one plots a histogram of the observed distances prior to
model fitting so as to get a feel for an appropriate truncation distance.
(Similar arguments apply to left truncation, if appropriate.) Buckland et al
(2001) provide guidelines on truncation.
When specified as a percentage, the largest right and smallest left percent
distances are discarded. Percentages cannot be supplied when using binned
data.
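As a rough illustration of percentage truncation, the sketch below uses base R and made-up distances. It assumes the cut-off is taken as a quantile of the observed distances, which may differ in detail from what ds() does internally; the helper name percent_truncate is hypothetical.

```r
# Hypothetical helper: drop the largest `right_pct` percent of distances.
# Assumption: the cut-off is the (1 - right_pct/100) quantile of the data.
percent_truncate <- function(distance, right_pct) {
  cutoff <- quantile(distance, probs = 1 - right_pct / 100,
                     na.rm = TRUE, names = FALSE)
  list(cutoff = cutoff, kept = distance[distance <= cutoff])
}

# Made-up perpendicular distances (not real survey data)
distance <- c(0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.3, 2.9, 3.6, 7.5)
res <- percent_truncate(distance, right_pct = 10)
res$cutoff        # right truncation distance implied by "10%"
length(res$kept)  # number of observations retained
```

Here the single outlying distance is the only observation discarded, which is the typical effect of a small right-truncation percentage.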
For left truncation, there are two options: (1) fit a detection function to
the truncated data as is (this is what happens when you set left). This
does not assume that g(x)=1 at the truncation point. (2) manually remove
data with distances less than the left truncation distance, effectively
moving the centre line out to be the truncation distance (this needs to be
done before calling ds). This then assumes that detection is certain at
the left truncation distance. The former strategy has a weaker assumption,
but will give higher variance as the detection function close to the line
has no data to tell it where to fit; it will be relying on the data from
after the left truncation point and the assumed shape of the detection
function. The latter is most appropriate in the case of aerial surveys,
where some area under the plane is not visible to the observers, but their
probability of detection is certain at the smallest distance.
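A minimal sketch of option (2), using base R and made-up data. It assumes that "moving the centre line out" means dropping observations closer than the left truncation distance and then subtracting that distance from the rest; the value of left is purely illustrative.

```r
# Made-up perpendicular distances (metres) from an aerial survey
obs <- data.frame(distance = c(12, 35, 48, 60, 75, 110, 150, 210))
left <- 40  # hypothetical width of the strip hidden beneath the aircraft

# Remove observations inside the hidden strip, then shift the remaining
# distances so that the left truncation distance becomes zero.
# Assumption: shifting by `left` is what "moving the centre line" implies.
obs_shifted <- subset(obs, distance >= left)
obs_shifted$distance <- obs_shifted$distance - left
obs_shifted$distance
```

The shifted data would then be passed to ds in the usual way, without any left truncation.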
Note that binning is performed such that bin 1 is all distances greater than
or equal to cutpoint 1 (>=0 or left truncation distance) and less than
cutpoint 2. Bin 2 is then distances greater than or equal to cutpoint 2 and
less than cutpoint 3, and so on.
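The binning rule can be reproduced with base R's cut(), as in the sketch below. The cutpoints and distances are made up, and the column name distbegin is an assumed counterpart to the distend column that appears in the usage above.

```r
cutpoints <- c(0, 1, 2, 3, 4)           # hypothetical bin boundaries
distance  <- c(0, 0.4, 1, 1.7, 2, 3.9)  # made-up distances

# right = FALSE makes each bin closed on the left and open on the right,
# i.e. [cutpoint k, cutpoint k+1), matching the rule described above.
bin <- cut(distance, breaks = cutpoints, right = FALSE, labels = FALSE)
data.frame(distance, bin,
           distbegin = cutpoints[bin], distend = cutpoints[bin + 1])
```

A distance exactly equal to a cutpoint therefore falls into the bin that starts at that cutpoint, not the one that ends there.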
When adjustment terms are used, it is possible for the detection function to not always decrease with increasing distance. This is unrealistic and can lead to bias. To avoid this, the detection function can be constrained for monotonicity (and is by default for detection functions without covariates).
Monotonicity constraints are supported in a similar way to that described
in Buckland et al (2001). 20 equally spaced points over the range of the
detection function (left to right truncation) are evaluated at each round
of the optimisation and the function is constrained to be either always
less than its value at zero ("weak") or such that each value is less than
or equal to the previous point (monotonically decreasing; "strict"). See
also check.mono.
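The two conditions can be sketched in base R as follows. This is only an illustration of the point-wise checks described above, not the code used by ds or check.mono; the helper check_points and both example functions are hypothetical.

```r
# Hypothetical helper: evaluate a detection function g at n_pts equally
# spaced points between the left and right truncation distances and
# report whether the "weak" and "strict" conditions hold.
check_points <- function(g, left, right, n_pts = 20) {
  x <- seq(left, right, length.out = n_pts)
  gx <- g(x)
  list(
    weak   = all(gx[-1] <= gx[1]),  # never above the value at the left end
    strict = all(diff(gx) <= 0)     # each value <= the previous one
  )
}

# Half-normal key with an illustrative scale parameter of 1
half_normal <- function(x) exp(-x^2 / 2)
r_hn <- check_points(half_normal, left = 0, right = 4)

# A made-up function with a bump near x = 2.5: it stays below its value
# at zero but is not monotonically decreasing
bumpy <- function(x) exp(-x^2 / 2) + 0.3 * exp(-(x - 2.5)^2 / 0.1)
r_bumpy <- check_points(bumpy, left = 0, right = 4)

str(r_hn)
str(r_bumpy)
```

The half-normal passes both checks, whereas the bumpy function passes only the weak one, which shows why "strict" is the stronger constraint.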
Even with no monotonicity constraints, checks are still made that the
detection function is monotonic, see check.mono.
In extrapolating to the entire survey region it is important that the unit
measurements be consistent or converted for consistency. A conversion
factor can be specified with the convert.units argument. The values of Area
in region.table must be made consistent with the units for Effort in
sample.table and the units of distance in the data.frame that was analyzed.
It is easiest if the units of Area are the square of the units of Effort
and then it is only necessary to convert the units of distance to the units
of Effort. For example, if Effort was entered in kilometres and Area in
square kilometres and distance in metres then using convert.units=0.001
would convert metres to kilometres; density would be expressed per square
kilometre, which would then be consistent with units for Area. However,
they can all be in different units as long as the appropriate composite
value for convert.units is chosen. Abundance for a survey region can be
expressed as: A*N/a where A is Area for the survey region, N is the
abundance in the covered (sampled) region, and a is the area of the sampled
region and is in units of Effort * distance. The sampled region a is
multiplied by convert.units, so it should be chosen such that the result is
in the same units as Area. For example, if Effort was entered in
kilometres, Area in hectares (100m x 100m) and distance in metres, then
1 km x 1 m = 1000 square metres = 0.1 hectares, so using convert.units=0.1
will convert a to units of hectares (0.01 to convert metres to units of
100 metres for distance and 10 to convert kilometres to units of 100
metres).
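The arithmetic behind convert.units and A*N/a can be sketched in base R. The helper conversion_factor is hypothetical, all survey numbers are made up, and the covered area is computed as 2 x Effort x truncation distance, which assumes line transects.

```r
# Hypothetical helper: factor that turns (Effort unit x distance unit)
# into the Area unit, with every unit expressed in metres.
conversion_factor <- function(distance_m, effort_m, area_m2) {
  (distance_m * effort_m) / area_m2
}

# distance in metres, Effort in kilometres, Area in square kilometres
cu <- conversion_factor(distance_m = 1, effort_m = 1000, area_m2 = 1000^2)
cu  # 0.001, as in the first example above

# Abundance for the survey region, A * N / a, using made-up numbers
A      <- 50   # Area of the survey region (km^2)
N      <- 120  # estimated abundance in the covered region
effort <- 40   # total line length (km)
width  <- 100  # truncation distance (m)
a <- 2 * effort * width * cu  # covered area in km^2 (line transects)
N_region <- A * N / a
N_region
```

Working in a single base unit first, then forming the ratio, is one way to avoid inverting a factor by mistake.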
One can supply data only to simply fit a detection function. However, if
abundance/density estimates are necessary further information is required.
Either the region.table, sample.table and obs.table data.frames can be
supplied or all data can be supplied as a "flat file" in the data argument.
In this format each row in data has additional information that would
ordinarily be in the other tables. This usually means that there are
additional columns named: Sample.Label, Region.Label, Effort and Area for
each observation. See flatfile for an example.
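A small made-up flat file, and the separate tables it stands in for, might look like the sketch below. The values are invented and the derivation of the tables only illustrates how the formats relate; it is not the package's own code.

```r
# Made-up flat-file data: one row per observation, with the survey
# information repeated on each row.
flat <- data.frame(
  distance     = c(0.3, 1.2, 0.8, 2.1, 0.5),
  Sample.Label = c("T1", "T1", "T2", "T3", "T3"),
  Region.Label = c("North", "North", "North", "South", "South"),
  Effort       = c(10, 10, 12, 8, 8),      # transect length
  Area         = c(100, 100, 100, 60, 60)  # area of the region
)

# The separate tables carry the same information without repetition.
region.table <- unique(flat[, c("Region.Label", "Area")])
sample.table <- unique(flat[, c("Sample.Label", "Region.Label", "Effort")])
region.table
sample.table

# With the Distance package installed, the flat file could then be passed
# on its own, e.g. ds(flat, truncation = 2.5)
```

Each region and each transect appears once in its own table, whereas the flat file repeats those values on every observation row.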
If column Area is omitted, a density estimate is generated but note that
the degrees of freedom/standard errors/confidence intervals will not match
density estimates made with the Area column present.
David L. Miller
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2001). Distance Sampling. Oxford University Press. Oxford, UK.
Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., and Thomas, L. (2004). Advanced Distance Sampling. Oxford University Press. Oxford, UK.
flatfile, AIC.ds, ds.gof, p_dist_table, plot.ds, add_df_covar_line
# An example from mrds, the golf tee data.
library(Distance)
data(book.tee.data)
tee.data <- subset(book.tee.data$book.tee.dataframe, observer==1)
ds.model <- ds(tee.data, 4)
summary(ds.model)
plot(ds.model)

## Not run:
# same model, but calculating abundance
# need to supply the region, sample and observation tables
region <- book.tee.data$book.tee.region
samples <- book.tee.data$book.tee.samples
obs <- book.tee.data$book.tee.obs

ds.dht.model <- ds(tee.data, 4, region.table=region,
                   sample.table=samples, obs.table=obs)
summary(ds.dht.model)

# specify order 2 cosine adjustments
ds.model.cos2 <- ds(tee.data, 4, adjustment="cos", order=2)
summary(ds.model.cos2)

# specify order 2 and 3 cosine adjustments, turning monotonicity
# constraints off
ds.model.cos23 <- ds(tee.data, 4, adjustment="cos", order=c(2, 3),
                     monotonicity=FALSE)
# check for non-monotonicity - actually no problems
check.mono(ds.model.cos23$ddf, plot=TRUE, n.pts=100)

# include both a covariate and adjustment terms in the model
ds.model.cos2.sex <- ds(tee.data, 4, adjustment="cos", order=2,
                        monotonicity=FALSE, formula=~as.factor(sex))
# check for non-monotonicity - actually no problems
check.mono(ds.model.cos2.sex$ddf, plot=TRUE, n.pts=100)

# truncate the largest 10% of the data and fit only a hazard-rate
# detection function
ds.model.hr.trunc <- ds(tee.data, truncation="10%", key="hr",
                        adjustment=NULL)
summary(ds.model.hr.trunc)

# compare AICs between these models:
AIC(ds.model)
AIC(ds.model.cos2)
AIC(ds.model.cos23)
## End(Not run)