flag_ranges {fossilbrush}    R Documentation

flag_ranges

Description

Compares the stratigraphic ranges in x with a set of reference ranges from y and returns a list of two elements. The first is a data.frame summarising, for each unique taxon in the column of x denoted by the first element of xcols, the overall error status, the specific error counts, the FAD and LAD differences, and the 95% density distributions of the FAD and LAD errors. If a taxon in x is not present in y, it is assigned the status 000 and its other entries in the returned data.frame will be NA. The second element of the returned list is the error code for every individual element of that same column of x, so it has the same number of rows as x. If x is a range table rather than an occurrence dataset, the two list elements will have the same number of rows. Ranges for comparison may be supplied directly in y, or y may be another occurrence dataset, in which case the range table for comparison is generated internally (see age_ranges).

Usage

flag_ranges(
  x = NULL,
  y = NULL,
  xcols = c("genus", "max_ma", "min_ma"),
  ycols = NULL,
  flag.diff = 5,
  verbose = TRUE
)

Arguments

x

Stratigraphic range data, either for taxa as a whole (a range table) or for individual fossil occurrences.

y

Data of the same kind as x. This is the reference dataset against which the ranges in x will be compared.

xcols

A character vector of length three specifying, in the following order, the taxonomic name, stratigraphic base (FAD) and stratigraphic top (LAD) columns in x.

ycols

An optional character vector of length three giving the same column types as xcols, but for dataset y. This is useful if the column names differ between the two datasets.

flag.diff

A vector of thresholds, given in millions of years, used to flag discrepancies between occurrence FADs and LADs and the reference range. This is a convenience parameter so that occurrences with large discrepancies can be identified quickly. Multiple thresholds can be supplied.
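As a rough illustration of how several thresholds might be applied, the base R sketch below flags FAD and LAD discrepancies against a reference range. It is not the internal code of flag_ranges: the toy data, the output column names (fad_diff_5 and so on) and the use of absolute differences are all assumptions made for this example.

```r
# Toy occurrence data and reference ranges (hypothetical values, not from fossilbrush)
occ <- data.frame(
  genus  = c("A", "A", "B", "C"),
  max_ma = c(260, 251, 300, 100),
  min_ma = c(252, 240, 290, 90)
)
ref <- data.frame(
  genus  = c("A", "B"),
  max_ma = c(255, 320),
  min_ma = c(245, 295)
)

# Look up the reference range for each occurrence; taxa absent from ref give NA
idx <- match(occ$genus, ref$genus)
fad_diff <- abs(occ$max_ma - ref$max_ma[idx])
lad_diff <- abs(occ$min_ma - ref$min_ma[idx])

# Apply each threshold in turn: 1 where the discrepancy exceeds it, otherwise 0
flag.diff <- c(5, 10)
fad_flags <- sapply(flag.diff, function(th) as.integer(fad_diff > th))
lad_flags <- sapply(flag.diff, function(th) as.integer(lad_diff > th))
colnames(fad_flags) <- paste0("fad_diff_", flag.diff)
colnames(lad_flags) <- paste0("lad_diff_", flag.diff)

# One row per occurrence, one flag column per threshold
flags <- cbind(occ, fad_flags, lad_flags)
print(flags)
```

In this sketch an occurrence whose taxon is missing from the reference table (genus "C") receives NA rather than 0 or 1, mirroring the NA entries described for unmatched taxa.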

verbose

A logical of length one determining whether flagging progress should be reported to the console.

Value

A list of two data.frames. The first records overall error statistics for each taxon; the second records error types for each element of x. In the second data.frame, FAD or LAD differences in excess of the supplied threshold(s) are marked with 1, otherwise 0.

See Also

age_ranges is called internally to generate the range table for comparison.
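To give a sense of what a range table is, the sketch below collapses an occurrence dataset to one row per taxon by taking the oldest stratigraphic base and the youngest stratigraphic top. This is a base R approximation under that assumption, not the implementation of age_ranges, and the toy data are hypothetical.

```r
# Toy occurrence data (hypothetical values)
occ <- data.frame(
  genus  = c("A", "A", "B", "B", "B"),
  max_ma = c(260, 251, 300, 310, 295),
  min_ma = c(252, 240, 290, 305, 285)
)

# Oldest stratigraphic base (FAD) and youngest stratigraphic top (LAD) per genus
fad <- tapply(occ$max_ma, occ$genus, max)
lad <- tapply(occ$min_ma, occ$genus, min)

# Assemble a range table with one row per genus
ranges <- data.frame(
  genus  = names(fad),
  max_ma = as.numeric(fad),
  min_ma = as.numeric(lad),
  stringsAsFactors = FALSE
)
print(ranges)
```

A table of this shape has the same three column types that xcols and ycols describe: taxon name, FAD and LAD.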

Examples

# load the example datasets
data(brachios)
data(sepkoski)
# subsample brachios to make for a short example runtime
set.seed(1)
brachios <- brachios[sample(seq_len(nrow(brachios)), 1000), ]
# update brachios to GTS2020 to match Sepkoski
brachios <- chrono_scale(brachios, srt = "early_interval", end = "late_interval",
                          max_ma = "max_ma", min_ma = "min_ma", verbose = FALSE)
brachios$max_ma <- brachios$newFAD
brachios$min_ma <- brachios$newLAD
# keep only occurrences whose FAD is strictly older than their LAD
brachios <- brachios[brachios$max_ma > brachios$min_ma, ]
# trim the Sepkoski Compendium to the relevant entries
sepkoski <- sepkoski[which(sepkoski$PHYLUM == "Brachiopoda"),]
# run flag_ranges, naming the columns in sepkoski that hold genus, FAD and LAD
flg <- flag_ranges(x = brachios, y = sepkoski,
                   ycols = c("GENUS", "RANGE_BASE", "RANGE_TOP"))

[Package fossilbrush version 1.0.5 Index]