R: Split occurrence records into spatial and temporal blocks for...

spatiotemp_block {dynamicSDM}

R Documentation

Split occurrence records into spatial and temporal blocks for model fitting.

Description

Splits occurrence records into spatial and temporal sampling units, then groups these units into multiple blocks with similar sample size and similar mean and range of the environmental explanatory variables.

Usage

spatiotemp_block(
  occ.data,
  vars.to.block.by,
  spatial.layer,
  spatial.split.degrees,
  temporal.block,
  n.blocks = 10,
  iterations = 5000
)

Arguments

`occ.data`	a data frame, with columns for occurrence record co-ordinates and dates named as follows: record longitude as "x", latitude as "y", year as "year", month as "month", and day as "day", plus columns of associated explanatory variable data.
`vars.to.block.by`	a character string or vector, the explanatory variable column names to group sampling units based upon.
`spatial.layer`	optional; a `SpatRaster` object, a categorical spatial layer for sample unit splitting.
`spatial.split.degrees`	a numeric value, the grid cell resolution in degrees to split `spatial.layer` by. Required if `spatial.layer` is given.
`temporal.block`	optional; a character string or vector, the time step for sampling unit splitting. Any combination of `day`, `month`, `year` or `quarter`. See details.
`n.blocks`	optional; a numeric value of two or more, the number of blocks to group occurrence records into. Default: 10.
`iterations`	optional; a numeric value, the number of random block groupings to trial before selecting the optimal grouping. Default: 5000.

Value

Returns the occurrence data frame with an additional column "BLOCK.CATS", assigning each record to a spatiotemporal block.

Blocking for autocorrelation

Blocking is an established method to account for spatial autocorrelation in species distribution models (SDMs). Following Bagchi et al. (2013), the blocking method involves splitting occurrence data into sampling units based upon non-contiguous ecoregions, which are then grouped into spatially disaggregated blocks of approximately equal sample size, within which the mean and range of explanatory variable data are similar. When fitting species distribution models, blocks are left out in turn in a jack-knife approach for model training and testing.
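
The leave-one-block-out step can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `dat`, its `presence` and `temp` columns and the simulated values are hypothetical, and a plain binomial GLM stands in for whichever model is actually fitted. Only the "BLOCK.CATS" column name comes from this help page.

```r
set.seed(42)

# Hypothetical blocked data: presence/absence, one explanatory variable,
# and a block label standing in for the BLOCK.CATS column
dat <- data.frame(
  presence   = rbinom(90, 1, 0.5),
  temp       = rnorm(90, mean = 15, sd = 4),
  BLOCK.CATS = rep(1:3, each = 30)
)

# Leave each block out in turn: train on the rest, test on the held-out block
accuracy <- sapply(sort(unique(dat$BLOCK.CATS)), function(b) {
  train <- dat[dat$BLOCK.CATS != b, ]
  test  <- dat[dat$BLOCK.CATS == b, ]
  fit   <- glm(presence ~ temp, family = binomial, data = train)
  pred  <- predict(fit, newdata = test, type = "response") > 0.5
  mean(pred == (test$presence == 1))
})

accuracy  # one held-out accuracy value per block
```

Because every record in a block is held out together, records that share a sampling unit never appear in both the training and the testing set of the same fold.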

We adapt this approach to account for temporal autocorrelation by enabling users to split records into sampling units based upon spatial and temporal characteristics before blocking occurs.

Spatial splitting

If the spatial.layer has categories that take up large contiguous areas, spatiotemp_block() will split categories into smaller units using grid cells at the specified resolution (spatial.split.degrees).
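
The idea of splitting a category by grid cell can be pictured with a small base R sketch. It is illustrative only and is not the package's internal code: the data frame `recs`, its `category` column and the `grid_id()` helper are hypothetical, and the category values are assumed to have been extracted from spatial.layer already.

```r
# Hypothetical records with a category already extracted from a spatial layer
recs <- data.frame(
  x        = c(10.2, 10.9, 14.1, 14.8, 10.5),
  y        = c(-3.1, -3.4, -3.2, 2.6, 2.9),
  category = c(1, 1, 1, 1, 2)
)

# Hypothetical helper: label the grid cell of a given resolution (in degrees)
# that each co-ordinate pair falls into
grid_id <- function(x, y, res) {
  paste(floor(x / res), floor(y / res), sep = "_")
}

res <- 3  # plays the role of spatial.split.degrees

# A sampling unit is a unique combination of category and grid cell
recs$unit <- paste(recs$category, grid_id(recs$x, recs$y, res), sep = ":")

table(recs$unit)  # number of records in each sampling unit
```

In this toy data, the four records in category 1 fall into three different 3-degree cells, so one large category becomes three smaller sampling units.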

Temporal splitting

If temporal.block is given, then occurrence records that share a value for the given level are considered a single sampling unit. For instance, if temporal.block = "year", then records from the same year are considered a sampling unit to be grouped into blocks.

Note: If spatial splitting is also used, then spatial characteristics may split these further into separate sampling units.

The temporal.block option quarter splits occurrence records into sampling units based on which quarter of the year the record month belongs to: (1) January-March, (2) April-June, (3) July-September and (4) October-December. This could be employed if seasonal biases in occurrence record collection are driving autocorrelation.
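
The month-to-quarter mapping described above can be written in one line of base R. The sketch below uses a hypothetical data frame `recs`. The last step, which combines year and quarter into one unit label, is an assumption about how several temporal levels might be combined and is not taken from the package source.

```r
# Hypothetical record dates
recs <- data.frame(
  year  = c(2005, 2005, 2006, 2006),
  month = c(1, 4, 9, 12)
)

# Quarter of the year: months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
recs$quarter <- (recs$month - 1) %/% 3 + 1

# Assumed combination of levels: one sampling unit per unique
# year-quarter pair
recs$unit <- interaction(recs$year, recs$quarter, drop = TRUE)

recs
```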

Block generation

Once records are split into sampling units based upon temporal and spatial characteristics, these units are assigned to the given number of blocks (n.blocks), so that the mean and range of explanatory variables (vars.to.block.by) and total sample size are similar across blocks. The number of iterations specifies how many random shuffles are trialled to optimise block equalisation.
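
A minimal random-search sketch of this kind of optimisation is given below. It is not the package's implementation: the `units` data frame, the single variable `temp`, and the scoring function (the sum of squared coefficients of variation of block size, block mean and block range) are all assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical sampling units: sample size and mean of one explanatory variable
units <- data.frame(
  unit = 1:12,
  n    = c(5, 8, 3, 10, 6, 4, 9, 7, 2, 5, 6, 8),
  temp = c(12, 15, 9, 20, 14, 11, 18, 16, 8, 13, 14, 17)
)
n_blocks   <- 3    # plays the role of n.blocks
iterations <- 200  # plays the role of iterations

# Squared coefficient of variation: a scale-free measure of spread
cv2 <- function(v) {
  v <- as.numeric(v)
  stats::var(v) / mean(v)^2
}

# Assumed score: lower means blocks are more alike in sample size,
# weighted variable mean and variable range
score_grouping <- function(block) {
  size       <- tapply(units$n, block, sum)
  mean_temp  <- tapply(units$temp * units$n, block, sum) / size
  range_temp <- tapply(units$temp, block, function(v) diff(range(v)))
  cv2(size) + cv2(mean_temp) + cv2(range_temp)
}

best_block <- NULL
best_score <- Inf
for (i in seq_len(iterations)) {
  # Random grouping with an equal number of units in every block
  block <- sample(rep_len(seq_len(n_blocks), nrow(units)))
  s <- score_grouping(block)
  if (s < best_score) {
    best_score <- s
    best_block <- block
  }
}

tapply(units$n, best_block, sum)  # records per block in the best grouping
```

More iterations explore more candidate groupings, so the best score found can only stay the same or improve, at the cost of run time.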

References

Bagchi, R., Crosby, M., Huntley, B., Hole, D. G., Butchart, S. H. M., Collingham, Y., Kalra, M., Rajkumar, J., Rahmani, A. & Pandey, M. 2013. Evaluating the effectiveness of conservation site networks under climate change: accounting for uncertainty. Global Change Biology, 19, 1236-1248.

Examples


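# Load the example occurrence records and study extent shipped with dynamicSDM
data("sample_explan_data")
data("sample_extent_data")

# Build a random categorical raster covering the study extent
random_cat_layer <- terra::rast(sample_extent_data)
random_cat_layer <- terra::setValues(random_cat_layer,
                                    sample(0:10, terra::ncell(random_cat_layer),
                                           replace = TRUE))

# Split records by category, 3-degree grid cell and month, then group the
# resulting sampling units into 3 blocks (few iterations to keep the example fast)
spatiotemp_block(
 occ.data = sample_explan_data,
 spatial.layer = random_cat_layer,
 spatial.split.degrees = 3,
 temporal.block = c("month"),
 vars.to.block.by = colnames(sample_explan_data)[14:16],
 n.blocks = 3,
 iterations = 30
)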

[Package dynamicSDM version 1.3.4 Index]