occData2timeList {paleotree} | R Documentation |
Converting Occurrences Data to a timeList Data Object
Description
This function converts occurrence data, given as a list where each element
is a different taxon's occurrence table (containing minimum and maximum ages
for each occurrence), to the timeList format. A timeList is a list composed
of a matrix of lower and upper age bounds for intervals, and a second matrix
recording the intervals in which taxa first and last occur in the given dataset.
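As a rough illustration of the two-matrix layout described above, the following base R sketch builds a toy object by hand. The object names, column names, taxon names and ages are all invented for this example, and the sketch does not call paleotree itself, so treat it as a picture of the general shape rather than the exact output of occData2timeList.

```r
# Toy interval matrix (ages in Ma): column 1 is the older (lower) bound,
# column 2 is the younger (upper) bound. All values are made up.
intTimes <- matrix(
    c(450, 446,
      445, 441,
      439, 436),
    ncol = 2, byrow = TRUE,
    dimnames = list(NULL, c("startTime", "endTime"))
    )

# Toy taxon matrix: each row gives the row numbers in intTimes
# for the intervals where a taxon first and last occurs.
taxonTimes <- matrix(
    c(1, 3,
      2, 3),
    ncol = 2, byrow = TRUE,
    dimnames = list(c("taxonA", "taxonB"), c("firstInt", "lastInt"))
    )

toyTimeList <- list(intTimes = intTimes, taxonTimes = taxonTimes)

# Look up the age bounds on each taxon's first and last appearance
firstBounds <- toyTimeList$intTimes[toyTimeList$taxonTimes[, "firstInt"], , drop = FALSE]
lastBounds <- toyTimeList$intTimes[toyTimeList$taxonTimes[, "lastInt"], , drop = FALSE]
firstBounds
lastBounds
```

Indexing the interval matrix by the taxon matrix, as in the last few lines, is how the age uncertainty on a first or last appearance can be read back out of such an object.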
Usage
occData2timeList(occList, intervalType = "dateRange")
Arguments
occList |
A list where every element is a table of occurrence data for a different taxon,
such as that returned by |
intervalType |
Must be either |
Details
This function translates taxon-sorted occurrence data to a timeList. The input can be
Paleobiology Database datasets sorted by taxonSortPBDBocc, or any data object where
occurrence data (i.e. age bounds for each occurrence) for different taxa are separated
into different elements of a named list.
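A hand-built input of the kind described above might look like the following sketch. The column names max_ma and min_ma, the taxon names and the ages are assumptions made for illustration; check the argument documentation for the column layout that occData2timeList actually expects before passing a hand-built list to it.

```r
# A named list with one occurrence table per taxon.
# Column names and ages (in Ma) are hypothetical.
toyOccList <- list(
    taxonA = data.frame(
        max_ma = c(450, 447, 443),   # older age bound of each occurrence
        min_ma = c(444, 445, 438)    # younger age bound of each occurrence
        ),
    taxonB = data.frame(
        max_ma = c(445, 441),
        min_ma = c(440, 436)
        )
    )

# Sanity check: every occurrence should have an older bound
# at least as old as its younger bound
boundsOrdered <- sapply(toyOccList, function(x) all(x$max_ma >= x$min_ma))
boundsOrdered

# Number of occurrences per taxon
sapply(toyOccList, nrow)
```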
The Usage of the Argument intervalType
The argument intervalType controls the algorithm used to obtain the first and
last interval bounds for each taxon. There are several options to select from:
"dateRange"
The default option. The bounds on the first appearances are the span between the oldest upper and lower bounds of the occurrences, and the bounds on the last appearances are the span between the youngest upper and lower bounds across all occurrences. This is guaranteed to provide the smallest bounds on the first and last appearances, and was originally suggested to the author by J. Marcot.
"occRange"
This option returns the smallest bounds among (a) the oldest occurrences for the first appearance (i.e. all occurrences with their lowest bound at the oldest lower age bound), and (b) the youngest occurrences for the last appearance (i.e. all occurrences with their uppermost bound at the youngest upper age bound).
"zoneOverlap"
This option is an attempt to mimic the stratigraphic range algorithm used by PBDB Classic, which "finds the oldest base that is older than at least part of all the intervals and the youngest that is younger than at least part of all the intervals" (personal communication, J. Alroy). This is a somewhat more complex case, as we are trying to obtain a timeList object.
For calculating the bounds of the first interval a taxon occurs in, the zoneOverlap algorithm looks for all occurrences that overlap with the age range of the earliest-most occurrence and (1) obtains their earliest boundary ages and returns the latest-most earliest age boundary among these overlapping occurrences and (2) obtains their latest boundary ages and returns the earliest-most latest age boundary among these overlapping occurrences.
Similarly, for calculating the bounds of the last interval a taxon occurs in, the zoneOverlap algorithm looks for all occurrences that overlap with the age range of the latest-most occurrence and (1) obtains their earliest boundary ages and returns the latest-most earliest age boundary among these overlapping occurrences and (2) obtains their latest boundary ages and returns the earliest-most latest age boundary among these overlapping occurrences.
On theoretical grounds, one could probably describe the zone-of-overlap algorithm as minimizing taxonomic age ranges by assuming that all overlapping occurrences at the start and end of a taxon's range describe very similar first and last appearances (FADs and LADs), and thus picking the occurrence with bounds that extend the taxonomic range the least. The downside is that if these occurrences are not essentially repeated attempts to capture the same FAD or LAD, then the zone-of-overlap algorithm is not an accurate depiction of the uncertainty in the ages. The true biological range of a taxon might be well outside the bounds obtained using the zone-of-overlap algorithm. A more conservative approach is the "dateRange" algorithm, which finds the smallest possible bounds on the endpoints of a taxon's range without ignoring uncertainty from any particular set of occurrences.
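To make the difference between the "dateRange" and "occRange" options more concrete, here is a minimal base R sketch for a single taxon, written from the verbal descriptions above. It is not the code used inside occData2timeList: the function names dateRangeSketch and occRangeSketch, the column names max_ma and min_ma, and the toy ages are all invented for this example. The "zoneOverlap" option is left out because its handling of overlapping occurrences is harder to reconstruct from the description alone.

```r
# One taxon with four occurrences; ages in Ma, all values made up.
# max_ma is the older bound, min_ma the younger bound of each occurrence.
toyOcc <- data.frame(
    max_ma = c(450, 448, 442, 439),
    min_ma = c(440, 446, 436, 437)
    )

# "dateRange" as described above: the first appearance is bounded by the
# oldest older-bound and the oldest younger-bound across all occurrences;
# the last appearance by the youngest older-bound and youngest younger-bound.
dateRangeSketch <- function(occ){
    c(firstOld = max(occ$max_ma), firstYoung = max(occ$min_ma),
      lastOld = min(occ$max_ma), lastYoung = min(occ$min_ma))
    }

# "occRange" as described above (an interpretation): use only the occurrences
# sharing the oldest older-bound (for the first appearance) or the youngest
# younger-bound (for the last appearance), and keep the narrowest of them.
occRangeSketch <- function(occ){
    oldest <- occ[occ$max_ma == max(occ$max_ma), , drop = FALSE]
    youngest <- occ[occ$min_ma == min(occ$min_ma), , drop = FALSE]
    c(firstOld = max(occ$max_ma), firstYoung = max(oldest$min_ma),
      lastOld = min(youngest$max_ma), lastYoung = min(occ$min_ma))
    }

dateRes <- dateRangeSketch(toyOcc)
occRes <- occRangeSketch(toyOcc)
dateRes
occRes

# Summed width of the first- and last-appearance bounds under each option
dateWidth <- unname((dateRes[1] - dateRes[2]) + (dateRes[3] - dateRes[4]))
occWidth <- unname((occRes[1] - occRes[2]) + (occRes[3] - occRes[4]))
c(dateRange = dateWidth, occRange = occWidth)
```

In this toy case both options agree on the oldest possible first appearance (450) and the youngest possible last appearance (436), but the "dateRange" sketch gives narrower bounds because it also uses occurrences that are not themselves the oldest or youngest.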
Value
Returns a standard timeList data object, as used by many other paleotree
functions, like bin_timePaleoPhy, bin_cal3TimePaleoPhy and taxicDivDisc.
Author(s)
David W. Bapst, with the "dateRange" algorithm suggested by Jon Marcot.
See Also
Occurrence data as commonly used with paleotree functions can be obtained
with getPBDBocc, sorted into taxa by taxonSortPBDBocc, and further explored
with this function and plotOccData. Also, see the example graptolite dataset
at graptPBDB.
Examples
data(graptPBDB)
graptOccSpecies <- taxonSortPBDBocc(
    data = graptOccPBDB,
    rank = "species",
    onlyFormal = FALSE
    )
graptTimeSpecies <- occData2timeList(occList = graptOccSpecies)
head(graptTimeSpecies[[1]])
head(graptTimeSpecies[[2]])
graptOccGenus <- taxonSortPBDBocc(
    data = graptOccPBDB,
    rank = "genus",
    onlyFormal = FALSE
    )
graptTimeGenus <- occData2timeList(occList = graptOccGenus)
layout(1:2)
taxicDivDisc(graptTimeSpecies)
taxicDivDisc(graptTimeGenus)
# the default interval calculation is "dateRange"
# let's compare to the other option, "occRange"
# but now for graptolite *species*
graptOccRange <- occData2timeList(
    occList = graptOccSpecies,
    intervalType = "occRange"
    )
# we would expect no change in the diversity curve
# because there are only changes in the
# earliest bound for the FAD
# latest bound for the LAD
# so if we are depicting ranges within maximal bounds
# dateRange has no effect
layout(1:2)
taxicDivDisc(graptTimeSpecies)
taxicDivDisc(graptOccRange)
#yep, identical!
# so how much uncertainty was removed by using dateRange?
# write a function for getting uncertainty in first and last
# appearance dates from a timeList object
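sumAgeUncert <- function(timeList){
    # convert to a four-column matrix of FAD and LAD age bounds
    fourDate <- timeList2fourDate(timeList)
    # width of the FAD bounds plus width of the LAD bounds, per taxon
    perOcc <- (fourDate[,1] - fourDate[,2]) +
        (fourDate[,3] - fourDate[,4])
    sum(perOcc)
    }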
#total amount of uncertainty in occRange dataset
sumAgeUncert(graptOccRange)
#total amount of uncertainty in dateRange dataset
sumAgeUncert(graptTimeSpecies)
#the difference
sumAgeUncert(graptOccRange) - sumAgeUncert(graptTimeSpecies)
#as a proportion
1 - (sumAgeUncert(graptTimeSpecies) / sumAgeUncert(graptOccRange))
#a different way of doing it
dateChange <- timeList2fourDate(graptTimeSpecies) -
    timeList2fourDate(graptOccRange)
apply(dateChange, 2, sum)
#total amount of uncertainty removed by dateRange algorithm
sum(abs(dateChange))
layout(1)