clean_dup_by_date {tenm} | R Documentation
Function to thin occurrence data
Description
Cleans up duplicated longitude and latitude records by year using a specified distance threshold. The distance can be specified as a geographic distance or, if a raster_mask is provided, as a pixel distance.
Usage
clean_dup_by_date(
  this_species,
  threshold,
  by_mask = FALSE,
  raster_mask = NULL,
  n_ngbs = 0
)
Arguments
this_species |
An object of class sp.temporal.modeling representing species occurrence data organized by date. Such objects are created with sp_temporal_data (see Examples).
threshold |
A numeric value representing the distance threshold between coordinates to be considered duplicates. Units depend on whether by_mask is TRUE (pixel distance) or FALSE (geographic distance).
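by_mask |
Logical. If TRUE, the threshold is interpreted as a pixel distance using raster_mask; if FALSE (the default), it is interpreted as a geographic distance.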
raster_mask |
An object of class SpatRaster that serves as a reference to thin the occurrence data. Required if by_mask is TRUE.
n_ngbs |
Number of pixels used to define the neighborhood matrix that helps determine which occurrences are duplicates. The default is 0.
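To make the pixel-based comparison more concrete, the following base R sketch flags records that fall in the same grid cell within the same year. It is an illustration only and not the code used by tenm: the helper cell_id, the grid parameters (xmin, ymax, res, ncol), and the column names lon, lat, and year are all hypothetical, and the sketch does not model n_ngbs. For a real SpatRaster, terra::cellFromXY returns cell numbers directly.

```r
# Hypothetical helper (not part of tenm): map coordinates to cell numbers on a
# regular grid described by its top-left corner, resolution, and column count.
cell_id <- function(lon, lat, xmin, ymax, res, ncol) {
  col <- floor((lon - xmin) / res) + 1   # column index, counted from the left
  row <- floor((ymax - lat) / res) + 1   # row index, counted from the top
  (row - 1) * ncol + col                 # row-major cell number
}

# Toy occurrence table; column names are assumptions for this sketch.
occs <- data.frame(
  lon  = c(-99.95, -99.90, -99.40, -99.95),
  lat  = c( 19.95,  19.90,  19.40,  19.95),
  year = c(1990, 1990, 1990, 2000)
)

occs$cell <- cell_id(occs$lon, occs$lat,
                     xmin = -100, ymax = 20, res = 0.5, ncol = 4)

# Keep the first record found in each (year, cell) combination.
thinned <- occs[!duplicated(occs[, c("year", "cell")]), ]
print(thinned)
```

In this toy table the first two records share a cell in 1990, so only one of them is kept, while the record from 2000 at the same location survives because duplicates are evaluated per year.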
Details
This function is based on clean_dup. It cleans up duplicated occurrences based on the specified threshold. If by_mask is TRUE, the distance is interpreted as a pixel distance using the provided raster_mask; otherwise, it is interpreted as a geographic distance.
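The geographic-distance case can be sketched in base R as a greedy thinning applied separately to each year. This is a minimal illustration under stated assumptions, not the implementation behind clean_dup or clean_dup_by_date: the helper thin_by_year is hypothetical, distances are plain Euclidean distances in coordinate units, and the column names lon, lat, and year are invented for the example.

```r
# Hypothetical helper (not part of tenm): within each year, keep a record and
# drop every later record that lies within `threshold` of it.
thin_by_year <- function(df, threshold) {
  groups <- split(df, df$year)
  kept <- lapply(groups, function(g) {
    keep <- rep(TRUE, nrow(g))
    for (i in seq_len(nrow(g))) {
      if (!keep[i]) next
      # Euclidean distance (coordinate units) from record i to all records
      d <- sqrt((g$lon - g$lon[i])^2 + (g$lat - g$lat[i])^2)
      # Only records after i can be dropped as duplicates of i
      later <- seq_len(nrow(g)) > i
      keep[later & d <= threshold] <- FALSE
    }
    g[keep, , drop = FALSE]
  })
  out <- do.call(rbind, kept)
  rownames(out) <- NULL
  out
}

# Toy occurrence table; column names are assumptions for this sketch.
occs <- data.frame(
  lon  = c(-99.00, -99.01, -98.50, -99.00),
  lat  = c( 19.00,  19.01,  19.50,  19.00),
  year = c(1990, 1990, 1990, 2000)
)

thinned <- thin_by_year(occs, threshold = 0.1)
print(thinned)
```

Because the data are split by year first, the 2000 record is retained even though it has the same coordinates as the first 1990 record.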
Value
An object of class sp.temporal.modeling containing a temporal data.frame with cleaned occurrence data, including columns for longitude, latitude, date variable, layers_dates, and layers_path.
Examples
library(tenm)
data("abronia")
tempora_layers_dir <- system.file("extdata/bio", package = "tenm")
tenm_mask <- terra::rast(file.path(tempora_layers_dir, "1939/bio_01.tif"))
# Clean duplicates without raster mask (just by distance threshold)
abt <- tenm::sp_temporal_data(occs = abronia,
                              longitude = "decimalLongitude",
                              latitude = "decimalLatitude",
                              sp_date_var = "year",
                              occ_date_format = "y",
                              layers_date_format = "y",
                              layers_by_date_dir = tempora_layers_dir,
                              layers_ext = "*.tif$")
abtc1 <- tenm::clean_dup_by_date(abt, threshold = terra::res(tenm_mask)[1])
# Check number of records
print(nrow(abtc1$temporal_df))
# Clean duplicates using a raster mask
abtc2 <- tenm::clean_dup_by_date(this_species = abt,
                                 by_mask = TRUE,
                                 threshold = terra::res(tenm_mask)[1],
                                 raster_mask = tenm_mask,
                                 n_ngbs = 0)
# Check number of records
print(nrow(abtc2$temporal_df))
# Clean duplicates using a raster mask with n_ngbs = 2
abtc3 <- tenm::clean_dup_by_date(this_species = abt,
                                 by_mask = TRUE,
                                 threshold = terra::res(tenm_mask)[1],
                                 raster_mask = tenm_mask,
                                 n_ngbs = 2)
# Check number of records
print(nrow(abtc3$temporal_df))