spatiotemp_block {dynamicSDM} | R Documentation |
Split occurrence records into spatial and temporal blocks for model fitting.
Description
Splits occurrence records into spatial and temporal sampling units, then groups these units into multiple blocks with similar sample sizes and similar means and ranges of the environmental explanatory variables.
Usage
spatiotemp_block(
  occ.data,
  vars.to.block.by,
  spatial.layer,
  spatial.split.degrees,
  temporal.block,
  n.blocks = 10,
  iterations = 5000
)
Arguments
occ.data |
a data frame with columns for occurrence record coordinates and dates, named as follows: record longitude as "x", latitude as "y", year as "year", month as "month", and day as "day"; plus columns of associated explanatory variable data. |
vars.to.block.by |
a character string or vector, the explanatory variable column names to group sampling units based upon. |
spatial.layer |
optional; a |
spatial.split.degrees |
a numeric value, the grid cell resolution in degrees to split
|
temporal.block |
optional; a character string or vector, the time step for sampling unit
splitting. Any combination of |
n.blocks |
optional; a numeric value of two or more, the number of blocks to group occurrence records into. Default: 10. |
iterations |
optional; a numeric value, the number of random block groupings to trial before selecting the optimal grouping. Default: 5000. |
Value
Returns the occurrence data frame with column "BLOCK.CATS", which assigns each record to a spatiotemporal block.
Blocking for autocorrelation
Blocking is an established method to account for spatial autocorrelation in SDMs. Following Bagchi et al. (2013), the blocking method involves splitting occurrence data into sampling units based upon non-contiguous ecoregions, which are then grouped into spatially disaggregated blocks of approximately equal sample size, within which the mean and range of explanatory variable data are similar. When fitting species distribution models, blocks are left out in turn in a jackknife approach for model training and testing.
We adapt this approach to account for temporal autocorrelation by enabling users to split records into sampling units based upon spatial and temporal characteristics before blocking occurs.
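The leave-one-block-out procedure described above can be sketched in base R. This is a minimal illustration only: the toy data frame, the `presence` and `temp` columns, and the use of `glm()` are assumptions made for the example, and only the "BLOCK.CATS" column name comes from this documentation.

```r
# Hypothetical toy data standing in for the output of spatiotemp_block();
# only the "BLOCK.CATS" column name is taken from the documentation.
set.seed(1)
blocked <- data.frame(
  presence   = rbinom(60, 1, 0.5),
  temp       = rnorm(60),
  BLOCK.CATS = rep(1:3, each = 20)
)

# Leave each block out in turn: train on the remaining blocks,
# then predict for the held-out block.
folds <- lapply(sort(unique(blocked$BLOCK.CATS)), function(b) {
  train <- blocked[blocked$BLOCK.CATS != b, ]
  test  <- blocked[blocked$BLOCK.CATS == b, ]
  fit   <- glm(presence ~ temp, family = binomial, data = train)
  pred  <- predict(fit, newdata = test, type = "response")
  data.frame(block = b, n_train = nrow(train), n_test = nrow(test),
             mean_pred = mean(pred))
})
cv_summary <- do.call(rbind, folds)
```

Each row of `cv_summary` corresponds to one held-out block, so a model evaluation metric could be computed per block in place of `mean_pred`.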
Spatial splitting
If the spatial.layer has categories that take up large contiguous areas, spatiotemp_block() will split categories into smaller units using grid cells at the specified resolution (spatial.split.degrees).
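As a rough illustration of grid-based splitting, the sketch below assigns each record to a grid cell of a given resolution and combines that cell with the record's category. It is not the package's implementation: the `occ` data frame, the `category` column, and the rule of splitting every category by grid cell (rather than only large contiguous ones) are assumptions made for the example.

```r
# Hypothetical records: coordinates plus a category extracted from a layer.
occ <- data.frame(
  x        = c(10.2, 11.9, 14.5, 10.7),
  y        = c(-3.1, -3.8, -3.3, 2.4),
  category = c("A", "A", "A", "B")
)
res <- 3  # grid cell resolution in degrees (cf. spatial.split.degrees)

# Index of the grid cell containing each record.
occ$cell_x <- floor(occ$x / res)
occ$cell_y <- floor(occ$y / res)

# Assumption: a sampling unit is a unique category-by-cell combination.
occ$unit <- paste(occ$category, occ$cell_x, occ$cell_y, sep = "_")
```

Here the first two category "A" records fall in the same 3-degree cell and share a unit, while the third falls in a neighbouring cell and becomes a separate unit.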
Temporal splitting
If temporal.block is given, then occurrence records sharing a value at the given time step are treated as a single sampling unit. For instance, if temporal.block = "year", then records from the same year are considered a sampling unit to be grouped into blocks.
Note: If spatial splitting is also used, then spatial characteristics may split these further into separate sampling units.
The temporal.block option "quarter" splits occurrence records into sampling units based on which quarter of the year the record month belongs to: (1) January-March, (2) April-June, (3) July-September, and (4) October-December. This could be employed if seasonal biases in occurrence record collection are driving autocorrelation.
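The month-to-quarter mapping listed above can be written as a one-line calculation. This is a sketch of the arithmetic only, not necessarily how the package computes it, and the `months` and `quarter` names are hypothetical.

```r
# Hypothetical record months (1 = January, ..., 12 = December).
months <- c(1, 3, 4, 6, 7, 9, 10, 12)

# Quarter of the year: months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4.
quarter <- ceiling(months / 3)
```

Records with the same `quarter` value would then belong to the same temporal sampling unit, before any further spatial splitting.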
Block generation
Once split into sampling units based upon temporal and spatial characteristics, these units are then assigned to the given number of blocks (n.blocks), so that the mean and range of explanatory variables (vars.to.block.by) and total sample size are similar across blocks. The number of iterations specifies how many random shuffles are used to optimise block equalisation.
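The random-shuffle idea can be sketched as follows. This is an illustrative approximation under stated assumptions, not the package's algorithm: the `sampling_units` data frame, the single variable `temp`, the equal-count assignment of units to blocks, and the scoring function (summed relative spread of block sample size, mean, and range) are all invented for the example.

```r
set.seed(42)
# Hypothetical sampling units, each with a record count and a variable value.
sampling_units <- data.frame(
  unit = 1:12,
  n    = sample(5:30, 12, replace = TRUE),  # records per unit
  temp = rnorm(12, mean = 15, sd = 4)       # explanatory variable per unit
)
n_blocks   <- 3
iterations <- 200

# Assumed score: lower values mean blocks are more alike in total sample
# size and in the mean and range of the explanatory variable.
score_grouping <- function(block) {
  size <- tapply(sampling_units$n, block, sum)
  avg  <- tapply(sampling_units$temp, block, mean)
  span <- tapply(sampling_units$temp, block, function(v) diff(range(v)))
  sd(size) / mean(size) +
    sd(avg) / sd(sampling_units$temp) +
    sd(span) / sd(sampling_units$temp)
}

best_score <- Inf
best_block <- NULL
for (i in seq_len(iterations)) {
  # Random grouping with the same number of units in every block.
  candidate <- sample(rep_len(seq_len(n_blocks), nrow(sampling_units)))
  s <- score_grouping(candidate)
  if (s < best_score) {
    best_score <- s
    best_block <- candidate
  }
}
sampling_units$BLOCK.CATS <- best_block
```

More iterations explore more candidate groupings, which is why a larger value tends to give more even blocks at the cost of run time.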
References
Bagchi, R., Crosby, M., Huntley, B., Hole, D. G., Butchart, S. H. M., Collingham, Y., Kalra, M., Rajkumar, J., Rahmani, A. & Pandey, M. 2013. Evaluating the effectiveness of conservation site networks under climate change: accounting for uncertainty. Global Change Biology, 19, 1236-1248.
Examples
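library(dynamicSDM)

# Load the example occurrence data and study extent
data("sample_explan_data")
data("sample_extent_data")

# Build a categorical raster with random values 0-10 over the extent
random_cat_layer <- terra::rast(sample_extent_data)
random_cat_layer <- terra::setValues(random_cat_layer,
                                     sample(0:10, terra::ncell(random_cat_layer),
                                            replace = TRUE))

# Split records by category, 3-degree grid cell and month, then group into 3 blocks
spatiotemp_block(
  occ.data = sample_explan_data,
  spatial.layer = random_cat_layer,
  spatial.split.degrees = 3,
  temporal.block = c("month"),
  vars.to.block.by = colnames(sample_explan_data)[14:16],
  n.blocks = 3,
  iterations = 30
)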