R: DAS sightings

das_sight {swfscDAS}

R Documentation

DAS sightings

Description

Extract sightings and associated information from processed DAS data

Usage

das_sight(x, ...)

## S3 method for class 'data.frame'
das_sight(x, ...)

## S3 method for class 'das_df'
das_sight(
  x,
  return.format = c("default", "wide", "complete"),
  return.events = c("S", "K", "M", "G", "s", "k", "m", "g", "t", "p", "F"),
  ...
)

Arguments

`x`	an object of class `das_df`, or a data frame that can be coerced to class `das_df`
`...`	ignored
`return.format`	character; can be one of "default", "wide", "complete", or any partial match thereof (case-sensitive). The formats are described below
`return.events`	character; event codes included in the output. Must be one or more of: "S", "K", "M", "G", "s", "k", "m", "g", "t", "p", "F" (case-sensitive). The default is all of these event codes

Details

DAS events contain specific information in the 'Data#' columns, with the information depending on the event code for that row. The output data frame contains this information extracted to dedicated columns, as described below. This function recognizes the following types of sightings: marine mammal sightings (event codes "S", "K", or "M"), marine mammal resights (codes "s", "k", "m"), marine mammal subgroup sightings (code "G"), marine mammal subgroup resights (code "g"), turtle sightings (code "t"), pinniped sightings (code "p"), and fishing vessel sightings (code "F"). A warning is printed unless every S, K, M, and G event (and only these events) is followed by an A event and at least one numeric event. See das_format_pdf for more information about events and event formats.

Of specific note: sperm whale sightings (species code 046) often contain additional estimates recorded as "C" events immediately following the S, A, and numeric events. Because these estimates are recorded as "C" events, they are NOT included in the das_sight calculations or output for any return.format.

The return.events argument simply provides a shortcut for filtering the output of das_sight by event codes
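As a rough illustration of the kind of filtering that return.events stands in for, the sketch below subsets a small hand-made data frame by event code. The 'Event' column name, the 'Row' column, and all values are assumptions made for this example; it does not call das_sight itself.

```r
# Toy stand-in for sighting output; the column names and values are
# invented for illustration and are not real DAS data
sight.toy <- data.frame(
  Event = c("S", "s", "t", "F", "G"),
  Row   = 1:5,
  stringsAsFactors = FALSE
)

# Keep only rows with the selected event codes. This is the kind of
# row filter that return.events = c("S", "G") is described as
# providing a shortcut for
events.keep <- c("S", "G")
sight.sub <- sight.toy[sight.toy$Event %in% events.keep, ]
sight.sub
```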

Abbreviations used in output column names: Gs = group size, Sp = species, Nm = nautical mile, Perc = percentage, Prob = probable, GsSchool = school-level group size info

This function makes the following assumptions about, and alterations to, the raw DAS data:

"A" events immediately following an S/K/M/G event have the same sighting number (Data1 value) as the S/K/M/G event
The 'nSp' column is equivalent to the number of non-NA values across the 'Data5', 'Data6', 'Data7', and 'Data8' columns for the pertinent "A" event
The following data are coerced to numeric using as.numeric: Bearing, Reticle, DistNm, Cue, Method, species percentages, and group sizes (including for t, p, and F events). Note that if any of these data contain formatting errors and are not numeric, the function will likely print a warning message
The values for the following columns are capitalized using toupper: 'Birds', 'Photos', 'CalibSchool', 'PhotosAerial', 'Biopsy', 'TurtleAge', and 'TurtleCapt'
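The alterations listed above can be sketched in base R on toy data. This is only an illustration of the described operations, not the package's internal code; the data frame, vectors, and their values are all invented for the example.

```r
# Toy "A" event rows (values invented); NA marks an empty Data# field
a.toy <- data.frame(
  Data5 = c("018", "076"),
  Data6 = c("013", NA),
  Data7 = c(NA, NA),
  Data8 = c(NA, NA),
  stringsAsFactors = FALSE
)

# nSp: number of non-NA values across Data5 to Data8 for each row
nSp <- rowSums(!is.na(a.toy[, c("Data5", "Data6", "Data7", "Data8")]))

# as.numeric() gives NA, with a warning, for text that is not a number;
# the warning is suppressed here only to keep the example quiet
bearing.raw <- c("045", "3x0")
bearing.num <- suppressWarnings(as.numeric(bearing.raw))

# toupper() capitalizes lower-case entries and leaves NA as NA
photos <- toupper(c("y", "N", NA))
```

Here nSp is 2 for the first row and 1 for the second, and the malformed bearing "3x0" becomes NA.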

Value

Data frame with 1) the columns from x, excluding the 'Data#' columns, and 2) columns with sighting information extracted from 'Data#' columns. See das_format_pdf for more information about the sighting information. If return.format is "default", then there is one row for each species of each sighting event; if return.format is "wide", then there is one row for each sighting event; if return.format is "complete", then there is one row for every group size estimate for each sighting event (excluding sperm whale "C" events - see the Details section).

The format-specific columns are described in their respective sections. The following sighting information columns are included in all return formats:

Sighting information	Column name	Notes
Sighting number	SightNo	Character
Subgroup code	Subgroup	Character
Daily sighting number	SightNoDaily	See below
Observer that made the sighting	Obs
Standard observer	ObsStd	Logical; `TRUE` if Obs is one of ObsL, Rec or ObsR, and `FALSE` otherwise
Bearing to the sighting	Bearing	Numeric; degrees, expected range 0 to 360
Number of reticle marks	Reticle	Numeric
Distance (nautical miles)	DistNm	Numeric
Sighting cue	Cue
Sighting method	Method
Photos of school?	Photos
Birds present with school?	Birds
Calibration school?	CalibSchool
Aerial photos taken?	PhotosAerial
Biopsy taken?	Biopsy
Probable sighting	Prob	Logical indicating if sighting has associated ? event; `NA` for non-S/K/M/G events
Number of species in sighting	nSp	`NA` for non-S/K/M/G events
Mixed species sighting	Mixed	Logical; `TRUE` if nSp > 1
Group size of school - best estimate	GsSchoolBest	See below
Group size of school - high estimate	GsSchoolHigh	See below
Group size of school - low estimate	GsSchoolLow	See below
Course (true heading) of school at resight	CourseSchool	`NA` for non-s/k/m events
Presence of associated JFR	TurtleJFR	`NA` for non-"t" events; JFR = jellyfish, floating debris, or red tide
Estimated turtle maturity	TurtleAge	`NA` for non-"t" events
Perpendicular distance (km) to sighting	PerpDistKm	Calculated via `abs(sin(Bearing * pi / 180) * DistNm) * 1.852`

SightNoDaily is a running count of the number of S/K/M/G sightings that occurred on each day. It is formatted as 'YYYYMMDD'_'running count', e.g. "20050101_1".
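One way to build an identifier in this 'YYYYMMDD'_'running count' format is sketched below in base R. The dates are invented and assumed to already be in time order; this is an illustration of the format, not necessarily how the package computes it.

```r
# Toy sighting dates, assumed sorted in time order (values invented)
sight.date <- as.Date(c("2005-01-01", "2005-01-01", "2005-01-02"))

# Day label in YYYYMMDD form
day <- format(sight.date, "%Y%m%d")

# Running count of sightings within each day
day.count <- ave(seq_along(day), day, FUN = seq_along)

# Combine into 'YYYYMMDD'_'running count'
sight.no.daily <- paste(day, day.count, sep = "_")
sight.no.daily
```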

The GsSchoolBest, GsSchoolHigh, and GsSchoolLow columns are either: 1) the arithmetic mean across observer estimates, for the "default" and "wide" formats, or 2) the individual observer estimates, for the "complete" format. Note that for non-"complete" formats, na.rm = TRUE is used when calculating the mean, and thus blank elements of estimates (but not the whole incomplete estimate) are ignored.
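The effect of na.rm = TRUE on an incomplete estimate can be seen in a small base R sketch. The observer codes and estimates are invented, and the short column names are placeholders for this example only.

```r
# Toy estimates from three observers for one sighting (values invented);
# the second observer left the high estimate blank
est.toy <- data.frame(
  Obs  = c("125", "197", "208"),
  Best = c(10, 12, 14),
  High = c(15, NA, 21),
  Low  = c(8, 9, 10),
  stringsAsFactors = FALSE
)

# Averaged formats: one mean per column, with blank (NA) elements dropped.
# The second observer still contributes to the Best and Low means
gs.school <- colMeans(est.toy[, c("Best", "High", "Low")], na.rm = TRUE)
gs.school

# The "complete" format instead corresponds to keeping the individual
# rows of est.toy as they are
```

In this toy case the high estimate is averaged over two observers, while the best and low estimates are averaged over all three.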

To convert the perpendicular distance back to nautical miles, one would divide PerpDistKm by 1.852
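The perpendicular distance calculation and the conversion back to nautical miles can be worked through with a few invented bearings and distances. The variable names below are placeholders for this sketch.

```r
# Toy bearings (degrees) and radial distances (nautical miles)
bearing <- c(0, 30, 90, 270)
dist.nm <- c(2, 2, 1.5, 1)

# Perpendicular distance in km: convert degrees to radians, take the
# absolute cross-track component, then convert nmi to km (1 nmi = 1.852 km)
perp.dist.km <- abs(sin(bearing * pi / 180) * dist.nm) * 1.852

# Back to nautical miles
perp.dist.nm <- perp.dist.km / 1.852

round(perp.dist.km, 3)
round(perp.dist.nm, 3)
```

A sighting dead ahead (bearing 0) has a perpendicular distance of zero, while one directly abeam (bearing 90 or 270) has a perpendicular distance equal to its radial distance.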

The "default" format output

This output data frame contains 'long' sighting data, meaning there is one row for each species of each sighting event. The GsSp... columns are calculated as follows: for each species and for each observer estimate, the best/high/low school size estimate is multiplied by the applicable species percent estimate. The values are grouped by species and then averaged (using mean with na.rm = TRUE) to get single GsSpBest, GsSpHigh, and GsSpLow values for each species.

Sighting information columns/formats present specifically in the "default" format output:

Sighting information	Column name	Notes
Species code	SpCode	Boat type or mammal, turtle, or pinniped species codes
Probable species code	SpCodeProb	Probable mammal species codes; `NA` if none or not applicable
Group size of species - best estimate	GsSpBest	The product of the arithmetic means of GsSchoolBest and the corresponding species percentage
Group size of species - high estimate	GsSpHigh	The product of the arithmetic means of GsSchoolHigh and the corresponding species percentage
Group size of species - low estimate	GsSpLow	The product of the arithmetic means of GsSchoolLow and the corresponding species percentage

Note that for the above calculations, the GsSchoolX value and the corresponding species percentages are each averaged across observers, using na.rm = TRUE, before being multiplied to calculate GsSpX. For example, in the workflow: GsSpBest1 = mean(.data$Data2, na.rm = TRUE) * mean(.data$Data5, na.rm = TRUE)
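A small numeric sketch of the mean-then-multiply calculation follows. All numbers are invented, and treating the percentages as values on a 0 to 100 scale (hence the division by 100) is an assumption of this example rather than something stated above.

```r
# Toy values for one two-species sighting with two observer estimates
# (all numbers invented); the second observer gave no percentages
gs.best  <- c(100, 120)  # school best estimate, one value per observer
sp.perc1 <- c(60, NA)    # percent of species 1, one value per observer
sp.perc2 <- c(40, NA)    # percent of species 2, one value per observer

# Average each quantity across observers first, ignoring blanks
gs.school.best <- mean(gs.best, na.rm = TRUE)

# Then multiply; the division by 100 assumes a 0-100 percentage scale
gs.sp.best1 <- gs.school.best * mean(sp.perc1, na.rm = TRUE) / 100
gs.sp.best2 <- gs.school.best * mean(sp.perc2, na.rm = TRUE) / 100

c(gs.sp.best1, gs.sp.best2)
```

With these toy numbers the two species estimates add up to the averaged school size.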

The "wide" and "complete" format outputs

The "wide" and "complete" options have very similar columns in their output data frames. There are two main differences: 1) the "wide" format has one row for each sighting event, while the "complete" format has a row for every observer estimate for each sighting, and thus 2) in the "wide" format, all numeric information for which there are multiple observer estimates (school group size, species percentage, etc.) is averaged across estimates via an arithmetic mean (using mean with na.rm = TRUE)

With these formats, note that the species/type code and group size for turtle, pinniped, and boat sightings are in their own columns

Sighting information columns present in the "wide" and "complete" format outputs:

Sighting information	Column name	Notes
Observer code - estimate	ObsEstimate	See below
Species 1 code	SpCode1
Species 2 code	SpCode2
Species 3 code	SpCode3
Species 4 code	SpCode4
Species 1 probable code	SpCodeProb1	Extracted from '?' event
Species 2 probable code	SpCodeProb2	Extracted from '?' event
Species 3 probable code	SpCodeProb3	Extracted from '?' event
Species 4 probable code	SpCodeProb4	Extracted from '?' event
Percentage of Sp 1 in school	SpPerc1
Percentage of Sp 2 in school	SpPerc2
Percentage of Sp 3 in school	SpPerc3
Percentage of Sp 4 in school	SpPerc4
Group size of species 1	GsSpBest1	Present in "wide" output only; see below
Group size of species 2	GsSpBest2	Present in "wide" output only; see below
Group size of species 3	GsSpBest3	Present in "wide" output only; see below
Group size of species 4	GsSpBest4	Present in "wide" output only; see below
Turtle species	TurtleSp	`NA` for non-"t" events
Turtle group size	TurtleGs	`NA` for non-"t" events
Was turtle captured?	TurtleCapt	`NA` for non-"t" events
Pinniped species	PinnipedSp	`NA` for non-"p" events
Pinniped group size	PinnipedGs	`NA` for non-"p" events
Boat or gear type	BoatType	`NA` for non-"F" events
Number of boats	BoatGs	`NA` for non-"F" events

ObsEstimate refers to the code of the observer that made the corresponding estimate. For the "wide" format, ObsEstimate is a list-column of all of the observer codes that provided an estimate. Also in the "wide" format, the GsSpBest# columns are the product of the means of GsSchoolBest and the corresponding species percentage (see the "default" format section for calculation details). The numbers 1 to 4 in the column names correspond to the order of the data as it appears in the DAS file.
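For readers less familiar with list-columns, the sketch below builds a toy data frame in which each row holds a vector of observer codes. The sighting numbers and observer codes are invented, and the structure is only meant to suggest what a list-column such as ObsEstimate looks like.

```r
# Toy "wide"-style data frame with one row per sighting (values invented)
wide.toy <- data.frame(
  SightNo = c("001", "002"),
  stringsAsFactors = FALSE
)

# A list-column: each element is a character vector of observer codes
wide.toy$ObsEstimate <- list(c("125", "197"), "208")

# Number of observers that provided an estimate for each sighting
n.obs <- lengths(wide.toy$ObsEstimate)
n.obs

# Observer codes for the first sighting
wide.toy$ObsEstimate[[1]]
```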

Examples

y <- system.file("das_sample.das", package = "swfscDAS")
y.proc <- das_process(y)

das_sight(y.proc)
das_sight(y.proc, return.format = "complete")

[Package swfscDAS version 0.6.3 Index]