extract_buffered_coords {dynamicSDM} | R Documentation |
Extract spatially buffered and temporally dynamic explanatory variable data for occurrence records.
Description
For each species occurrence record co-ordinate and date, spatially buffered and temporally dynamic explanatory data are extracted using Google Earth Engine.
Usage
extract_buffered_coords(
occ.data,
datasetname,
bandname,
spatial.res.metres,
GEE.math.fun,
moving.window.matrix,
user.email,
save.method,
varname,
temporal.res,
temporal.level,
temporal.direction,
categories,
save.directory,
agg.factor,
prj = "+proj=longlat +datum=WGS84",
resume = TRUE
)
Arguments
occ.data |
a data frame, with columns for occurrence record co-ordinates and dates named as follows: longitude as "x", latitude as "y", year as "year", month as "month" and day as "day". |
datasetname |
a character string, the Google Earth Engine dataset to extract data from. |
bandname |
a character string, the Google Earth Engine dataset bandname to extract data for. |
spatial.res.metres |
a numeric value, the spatial resolution in metres for data extraction. |
GEE.math.fun |
a character string, the mathematical function to compute across the specified spatial matrix and period for each record. |
moving.window.matrix |
a matrix of weights with an odd number of sides, representing the spatial neighbourhood of cells (“moving window”) to calculate GEE.math.fun across. |
user.email |
a character string, user email for initialising Google Drive. |
save.method |
a character string, the method used to save extracted variable data. One of "combined" or "split". See details. |
varname |
optional; a character string, a unique name for the explanatory variable. Default varname is "bandname_temporal.res_temporal.direction_GEE.math.fun_buffered". |
temporal.res |
optional; a numeric value, the temporal resolution in days to extract data and calculate GEE.math.fun across. |
temporal.level |
a character string, the temporal resolution of the explanatory variable data. One of "year", "month" or "day". See details. |
temporal.direction |
optional; a character string, the temporal direction for extracting data
across relative to the record date. One of |
categories |
optional; a character string, the categories to use in calculation if data are categorical. See details for more information. |
save.directory |
a character string, path to a local directory to save extracted variable data to. |
agg.factor |
optional; a positive integer, the aggregation factor expressed as number of cells in each direction. See details. |
prj |
a character string, the coordinate reference system of the occ.data co-ordinates. |
resume |
a logical indicating whether to search |
Details
For each individual species occurrence record co-ordinate and date, this function extracts data for a given band within a Google Earth Engine dataset across a user-specified spatial buffer and temporal period and calculates a mathematical function on such data.
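Before calling the function, it can help to confirm that occ.data carries the five columns listed under Arguments. The check below is a minimal base-R sketch; the helper name check_occ_columns and the example records are hypothetical and are not part of dynamicSDM.

```r
# Hypothetical helper (not part of dynamicSDM): return the names of any
# required columns that are missing from an occurrence data frame.
check_occ_columns <- function(occ.data) {
  required <- c("x", "y", "year", "month", "day")
  setdiff(required, colnames(occ.data))
}

# Illustrative records only: "x" is longitude and "y" is latitude.
occ <- data.frame(
  x = c(31.5, 28.1),
  y = c(-24.9, -26.2),
  year = c(2010, 2012),
  month = c(3, 11),
  day = c(14, 2)
)

missing.cols <- check_occ_columns(occ)
if (length(missing.cols) > 0) {
  stop("occ.data is missing columns: ", paste(missing.cols, collapse = ", "))
}
```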
Value
Returns details of successful explanatory variable extractions.
Temporal dimension
If temporal.res and temporal.direction are not given, the function extracts explanatory variable data for all of the cells surrounding and including the cell containing the occurrence record co-ordinates.
If temporal.res and temporal.direction are given, the function extracts explanatory variable data for which GEE.math.fun has first been calculated over this period in relation to the occurrence record date.
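The period implied by temporal.res and temporal.direction can be sketched with base-R date arithmetic. This is an illustration only: the helper temporal_window is hypothetical, the values "prior" and "post" for temporal.direction are an assumption (the option list is not shown on this page), and whether the record date itself falls inside the period is also assumed.

```r
# Hypothetical helper (not part of dynamicSDM): start and end dates of the
# extraction period for one record.
# ASSUMPTION: temporal.direction is either "prior" or "post".
temporal_window <- function(year, month, day, temporal.res, temporal.direction) {
  record.date <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))
  if (temporal.direction == "prior") {
    c(start = record.date - temporal.res, end = record.date)
  } else if (temporal.direction == "post") {
    c(start = record.date, end = record.date + temporal.res)
  } else {
    stop("temporal.direction must be 'prior' or 'post' in this sketch")
  }
}

# A 30 day period before a record dated 14 March 2010
window <- temporal_window(2010, 3, 14, temporal.res = 30, temporal.direction = "prior")
format(window)
```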
Spatial dimension
Using the focal function from the terra R package (Hijmans et al., 2022), GEE.math.fun is calculated across the spatial buffer area from the record co-ordinate. The spatial buffer area used is specified by the argument moving.window.matrix, which dictates the neighbourhood of cells surrounding the cell containing the occurrence record to include in this calculation.
See function get_moving_window() to generate an appropriate moving.window.matrix.
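To make the moving window idea concrete, the sketch below computes a weighted focal sum on a plain matrix in base R. It is not the terra implementation: the helper focal_sum is hypothetical, a matrix stands in for the raster, and ignoring cells that fall outside the grid is a choice made for this sketch.

```r
# Hypothetical base-R illustration of a focal (moving window) sum.
# `values` stands in for a raster; `weights` plays the role of
# moving.window.matrix and must have odd dimensions.
focal_sum <- function(values, weights) {
  stopifnot(nrow(weights) %% 2 == 1, ncol(weights) %% 2 == 1)
  half.r <- (nrow(weights) - 1) / 2
  half.c <- (ncol(weights) - 1) / 2
  out <- matrix(NA_real_, nrow(values), ncol(values))
  for (i in seq_len(nrow(values))) {
    for (j in seq_len(ncol(values))) {
      total <- 0
      for (di in -half.r:half.r) {
        for (dj in -half.c:half.c) {
          rr <- i + di
          cc <- j + dj
          w <- weights[di + half.r + 1, dj + half.c + 1]
          # Skip cells outside the grid and cells excluded from the window (NA weight)
          inside <- rr >= 1 && rr <= nrow(values) && cc >= 1 && cc <= ncol(values)
          if (inside && !is.na(w)) {
            total <- total + w * values[rr, cc]
          }
        }
      }
      out[i, j] <- total
    }
  }
  out
}

values <- matrix(1, nrow = 5, ncol = 5)   # every cell holds the value 1
window <- matrix(1, nrow = 3, ncol = 3)   # 3 x 3 neighbourhood, equal weights
result <- focal_sum(values, window)
```

With every cell equal to 1, each output cell simply counts how many window cells fall inside the grid, so interior cells are larger than edge and corner cells.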
Mathematical function
GEE.math.fun specifies the mathematical function to be calculated over the spatial buffered area and temporal period. Options are limited to Google Earth Engine ImageCollection Reducer functions (https://developers.google.com/earth-engine/apidocs/) for which an analogous R function is available. This includes: "allNonZero", "anyNonZero", "count", "first", "firstNonNull", "last", "lastNonNull", "max", "mean", "median", "min", "mode", "product", "sampleStdDev", "sampleVariance", "stdDev", "sum" and "variance".
Categorical data
When explanatory variable data are categorical (e.g. land cover classes), the argument categories can be used to specify the categories of importance to the calculation. The category or categories given will be converted into a binary representation, with “1” for those listed and “0” for all others in the dataset. Ensure that the GEE.math.fun given is appropriate for such data. For example, "sum" gives the number of cells of suitable land cover classes across the “moving window” around the species occurrence record co-ordinates.
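The binary conversion described above can be sketched in base R as follows. The class codes and the 3 x 3 window are illustrative values only, and the code shows the idea rather than the package's internal implementation.

```r
# Illustrative land cover class codes in a 3 x 3 window around a record
land.cover <- matrix(c(6, 7, 2,
                       6, 1, 7,
                       3, 6, 6), nrow = 3, byrow = TRUE)
categories <- c(6, 7)

# 1 where the cell belongs to a listed category, 0 otherwise
binary <- matrix(as.integer(land.cover %in% categories), nrow = nrow(land.cover))

# With a "sum" function, the result is the number of matching cells
matching.cells <- sum(binary)
```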
Categorical data and temporally dynamic variables
Please be aware that if specific categories are given (argument categories) when extracting categorical data, then temporal buffering cannot be completed. The categorical data closest in time to the occurrence record date will be used for spatial buffering.
If specific categories are not given when extracting from categorical datasets, be careful to choose appropriate mathematical functions for such data. For instance, "first" or "last" may be more relevant than "sum" of land cover classification numbers.
Temporal level to extract data at:
temporal.level states the temporal resolution of the explanatory variable data and improves the speed of extract_buffered_coords() extraction. For example, if the explanatory data represents an annual variable, then all record co-ordinates from the same year can be extracted from the same buffered raster, saving computation time. However, if the explanatory data represents a daily variable, then only records from the exact same day can be extracted from the same raster. For the former, the temporal.level argument should be year and for the latter, temporal.level should be day.
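The effect of temporal.level on the number of rasters needed can be illustrated by counting unique periods among the records. This is a base-R sketch with a hypothetical helper, temporal_key, and made-up record dates; it does not reproduce the package's internal grouping.

```r
# Hypothetical helper (not part of dynamicSDM): a key identifying which
# records fall in the same period and could share one buffered raster.
temporal_key <- function(occ.data, temporal.level) {
  switch(temporal.level,
    year = as.character(occ.data$year),
    month = paste(occ.data$year, occ.data$month, sep = "-"),
    day = paste(occ.data$year, occ.data$month, occ.data$day, sep = "-"),
    stop("temporal.level must be 'year', 'month' or 'day' in this sketch")
  )
}

# Illustrative record dates
occ <- data.frame(
  year = c(2010, 2010, 2010, 2011),
  month = c(3, 3, 7, 3),
  day = c(14, 20, 1, 14)
)

# Number of unique periods, and so of rasters, at each temporal level
n.rasters <- sapply(c("year", "month", "day"), function(level) {
  length(unique(temporal_key(occ, level)))
})
```

Coarser levels group more records per raster, which is where the time saving comes from.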
Aggregation factor
agg.factor represents the factor by which to aggregate RasterLayer data with the function aggregate in the terra R package (Hijmans et al., 2022). Aggregation uses GEE.math.fun as the function. Following aggregation, spatial buffering using the moving window matrix occurs. This is included to minimise computing time if data are of high spatial resolution and a large spatial buffer is needed. Ensure that get_moving_window() is run with the spatial resolution of the data after aggregation by this factor.
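As a worked example of the post-aggregation resolution, the arithmetic below uses the values from the Examples section (500 m cells, aggregation factor 12). The metres-to-degrees conversion assumes roughly 111,320 m per degree, which is an approximation that holds near the equator and is not taken from the package.

```r
spatial.res.metres <- 500
agg.factor <- 12

# Cell size after aggregation, in metres
aggregated.res.metres <- spatial.res.metres * agg.factor

# ASSUMPTION: about 111,320 m per degree (approximate, near the equator)
metres.per.degree <- 111320
aggregated.res.degrees <- aggregated.res.metres / metres.per.degree

round(aggregated.res.degrees, 3)
```

The result is close to the 0.05 degree resolution passed to get_moving_window() in the Examples section.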
Google Earth Engine
extract_buffered_coords() requires users to have installed the R package rgee (Aybar et al., 2020) and initialised Google Earth Engine with valid log-in credentials. Please follow the instructions at https://cran.r-project.org/package=rgee
- datasetname must be in the accepted Google Earth Engine catalogue layout (e.g. “MODIS/006/MCD12Q1” or “UCSB-CHG/CHIRPS/DAILY”)
- bandname must be as specified under the dataset in the Google Earth Engine catalogue (e.g. “LC_Type5”, “precipitation”). For datasets and band names, see https://developers.google.com/earth-engine/datasets.
Google Drive
extract_buffered_coords() also requires users to have installed the R package googledrive (D'Agostino McGowan and Bryan, 2022) and initialised Google Drive with valid log-in credentials, which must be stated using argument user.email. Please follow instructions on https://googledrive.tidyverse.org/ for initialising the googledrive package.
Note: When running this function a folder labelled "dynamicSDM_download_bucket" will be created in your Google Drive. This will be emptied once the function has finished running and output rasters will be found in the save.drive.folder or save.directory specified.
Exporting extracted data
For save.method = combined, the function will save “csv” files containing all occurrence records and associated values for the explanatory variable.
For save.method = split, the function will save individual “csv” files for each record with each unique period of the given temporal.level (e.g. each year, each year and month combination or each unique date).
split protects users if internet connection is lost when extracting data for large occurrence datasets. The argument resume can be used to resume from previous progress if the connection is lost.
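After a split extraction, the individual “csv” files can be read back and stacked into one data frame. The sketch below is base R only; the directory, file names and columns are hypothetical stand-ins written by the sketch itself, since the actual output layout is not described on this page.

```r
# Illustrative stand-in for a save.directory holding "split" output.
# ASSUMPTION: file names and columns here are made up for the example.
save.directory <- file.path(tempdir(), "split_output_sketch")
dir.create(save.directory, showWarnings = FALSE)
write.csv(data.frame(x = 31.5, y = -24.9, year = 2010, value = 12),
          file.path(save.directory, "records_2010.csv"), row.names = FALSE)
write.csv(data.frame(x = 28.1, y = -26.2, year = 2012, value = 7),
          file.path(save.directory, "records_2012.csv"), row.names = FALSE)

# Read every csv file in the directory and stack the rows
csv.files <- list.files(save.directory, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(csv.files, read.csv))
```

This relies on all files sharing the same columns, which rbind requires.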
References
Aybar, C., Wu, Q., Bautista, L., Yali, R. and Barja, A., 2020. rgee: An R package for interacting with Google Earth Engine. Journal of Open Source Software, 5(51), p.2272.
D'Agostino McGowan L., and Bryan J., 2022. googledrive: An Interface to Google Drive. https://googledrive.tidyverse.org, https://github.com/tidyverse/googledrive.
Hijmans, R.J., Bivand, R., Forner, K., Ooms, J., Pebesma, E. and Sumner, M.D., 2022. Package 'terra'. Maintainer: Vienna, Austria.
Examples
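# Load the example occurrence records shipped with dynamicSDM
data(sample_filt_data)

# Email address reported by gargle for the cached Google credentials
user.email <- as.character(gargle::gargle_oauth_sitrep()$email)

# Build the moving window matrix for the spatial buffer
moving.window <- get_moving_window(radial.distance = 10000,
                                   spatial.res.degrees = 0.05,
                                   spatial.ext = sample_extent_data)

# Sum the cells in land cover classes 6 and 7 across the moving window,
# using one buffered raster per record year
extract_buffered_coords(occ.data = sample_filt_data,
                        datasetname = "MODIS/006/MCD12Q1",
                        bandname = "LC_Type5",
                        spatial.res.metres = 500,
                        GEE.math.fun = "sum",
                        moving.window.matrix = moving.window,
                        user.email = user.email,
                        save.method = "split",
                        temporal.level = "year",
                        categories = c(6, 7),
                        agg.factor = 12,
                        varname = "total_grass_crop_lc",
                        save.directory = tempdir()
)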