extract_buffered_coords {dynamicSDM}  R Documentation 
Extract spatially buffered and temporally dynamic explanatory variable data for occurrence records.
Description
For each species occurrence record coordinate and date, spatially buffered and temporally dynamic explanatory data are extracted using Google Earth Engine.
Usage
extract_buffered_coords(
  occ.data,
  datasetname,
  bandname,
  spatial.res.metres,
  GEE.math.fun,
  moving.window.matrix,
  user.email,
  save.method,
  varname,
  temporal.res,
  temporal.level,
  temporal.direction,
  categories,
  save.directory,
  agg.factor,
  prj = "+proj=longlat +datum=WGS84",
  resume = TRUE
)
Arguments
occ.data 
a data frame, with columns for occurrence record coordinates and dates with column names as follows; record longitude as "x", latitude as "y", year as "year", month as "month", and day as "day". 
datasetname 
a character string, the Google Earth Engine dataset to extract data from. 
bandname 
a character string, the Google Earth Engine dataset bandname to extract data for. 
spatial.res.metres 
a numeric value, the spatial resolution in metres for data extraction. 
GEE.math.fun 
a character string, the mathematical function to compute across the specified spatial matrix and period for each record. 
moving.window.matrix 
a matrix of weights with an odd number of sides, representing the
spatial neighbourhood of cells (“moving window”) to calculate 
user.email 
a character string, user email for initialising Google Drive. 
save.method 
a character string, the method used to save extracted variable data. One of combined or split. See details.
varname 
optional; a character string, a unique name for the explanatory variable. Default varname is “bandname_temporal.res_temporal.direction_GEE.math.fun_buffered”.
temporal.res 
optional; a numeric value, the temporal resolution in days to extract data and calculate GEE.math.fun across.
temporal.level 
a character string, the temporal resolution of the explanatory variable data. One of day, month or year. See details.
temporal.direction 
optional; a character string, the temporal direction for extracting data
across relative to the record date. One of 
categories 
optional; a character string, the categories to use in calculation if data are categorical. See details for more information. 
save.directory 
a character string, path to a local directory to save extracted variable data to. 
agg.factor 
optional; a positive integer, the aggregation factor expressed as number of cells in each direction. See details.
prj 
a character string, the coordinate reference system of occ.data coordinates.
resume 
a logical indicating whether to search 
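As a minimal sketch of the occ.data layout described in the arguments above, the following builds a data frame with the five required column names. The object name occ_example and all coordinates and dates are invented for illustration only.

```r
# Hypothetical occurrence records; values are illustrative only.
occ_example <- data.frame(
  x     = c(31.5, 28.2, 35.9),    # longitude
  y     = c(-24.1, -19.8, -3.4),  # latitude
  year  = c(2005, 2010, 2010),
  month = c(3, 7, 11),
  day   = c(14, 2, 30)
)

# Check that every column named in the argument description is present.
required_cols <- c("x", "y", "year", "month", "day")
all(required_cols %in% names(occ_example))
```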
Details
For each individual species occurrence record coordinate and date, this function extracts data for a given band within a Google Earth Engine dataset across a user-specified spatial buffer and temporal period and calculates a mathematical function on such data.
Value
Returns details of successful explanatory variable extractions.
Temporal dimension
If temporal.res and temporal.direction are not given, the function extracts explanatory variable data for all of the cells surrounding and including the cell containing the occurrence record coordinates.
If temporal.res and temporal.direction are given, the function extracts explanatory variable data for which GEE.math.fun has first been calculated over this period in relation to the occurrence record date.
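The period implied by temporal.res for a single record can be sketched with base R date arithmetic. The helper record_window and its backwards flag are hypothetical: the flag is only a stand-in for temporal.direction, and whether the record date itself is included in the period is an assumption of this sketch, not a statement about the package.

```r
# Hypothetical helper: returns the start and end dates of the period.
# `backwards` is a stand-in for temporal.direction (an assumption).
record_window <- function(record_date, temporal_res, backwards = TRUE) {
  record_date <- as.Date(record_date)
  if (backwards) {
    c(record_date - temporal_res, record_date)
  } else {
    c(record_date, record_date + temporal_res)
  }
}

# A 30-day period ending on the record date.
window_30 <- record_window("2010-07-02", temporal_res = 30)
window_30
```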
Spatial dimension
Using the focal function from the terra R package (Hijmans et al., 2022), GEE.math.fun is calculated across the spatial buffer area around the record coordinate. The spatial buffer area used is specified by the argument moving.window.matrix, which dictates the neighbourhood of cells surrounding the cell containing the occurrence record to include in this calculation. See function get_moving_window() to generate an appropriate moving.window.matrix.
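To illustrate what a moving window calculation does for one cell, here is a base R sketch. It does not call terra and is not the package's implementation; the grid, the window and the helper focal_value are hypothetical, and the helper assumes the cell is far enough from the grid edge for the full window to fit.

```r
# Toy 5 x 5 grid of explanatory variable values.
grid <- matrix(1:25, nrow = 5, ncol = 5)

# 3 x 3 window of weights (odd number of rows and columns), all equal to 1.
window <- matrix(1, nrow = 3, ncol = 3)

# Hypothetical helper: apply `fun` to the weighted neighbourhood of one cell.
focal_value <- function(grid, window, row, col, fun = mean) {
  half <- (nrow(window) - 1) / 2
  block <- grid[(row - half):(row + half), (col - half):(col + half)]
  fun(block * window)
}

# Mean of the 3 x 3 neighbourhood centred on row 3, column 3.
focal_value(grid, window, row = 3, col = 3, fun = mean)
```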
Mathematical function
GEE.math.fun specifies the mathematical function to be calculated over the spatial buffered area and temporal period. Options are limited to Google Earth Engine ImageCollection Reducer functions (https://developers.google.com/earth-engine/apidocs/) for which an analogous R function is available. This includes: "allNonZero", "anyNonZero", "count", "first", "firstNonNull", "last", "lastNonNull", "max", "mean", "median", "min", "mode", "product", "sampleStdDev", "sampleVariance", "stdDev", "sum" and "variance".
Categorical data
When explanatory variable data are categorical (e.g. land cover classes), argument categories can be used to specify the categories of importance to the calculation. The category or categories given will be converted to a binary representation, with “1” for those listed, and “0” for all others in the dataset. Ensure that the GEE.math.fun given is appropriate for such data. For example, "sum" gives the number of suitable land cover classified cells across the “moving window” from the species occurrence record coordinates.
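The binary conversion described above can be sketched in base R as follows. The land cover grid and its class codes are invented for illustration, and this is not the package's own code.

```r
# Toy land cover grid with class codes; values are illustrative only.
land_cover <- matrix(c(6, 7, 1,
                       2, 6, 6,
                       7, 3, 1), nrow = 3, byrow = TRUE)

categories <- c(6, 7)

# 1 where the cell belongs to a listed category, 0 otherwise.
binary <- matrix(as.integer(land_cover %in% categories),
                 nrow = nrow(land_cover))

# With "sum", the result is the number of listed-category cells in the window.
sum(binary)
```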
Categorical data and temporally dynamic variables
Please be aware, if specific categories are given (argument categories) when extracting categorical data, then temporal buffering cannot be completed. The most recent categorical data to the occurrence record date will be used for spatial buffering.
If specific categories are not given when extracting from categorical datasets, be careful to choose appropriate mathematical functions for such data. For instance, "first" or "last" may be more relevant than "sum" of land cover classification numbers.
Temporal level to extract data at:
temporal.level states the temporal resolution of the explanatory variable data and improves the speed of extract_buffered_coords() extraction. For example, if the explanatory data represents an annual variable, then all record coordinates from the same year can be extracted from the same buffered raster, saving computation time. However, if the explanatory data represents a daily variable, then only records from the exact same day can be extracted from the same raster. For the former, the temporal.level argument should be year and for the latter, temporal.level should be day.
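The saving can be sketched by counting the unique periods at each level for a set of records. The record dates below are hypothetical, and the counts only illustrate how many distinct rasters each level would imply.

```r
# Hypothetical record dates.
records <- data.frame(
  year  = c(2005, 2005, 2010, 2010),
  month = c(3, 3, 7, 7),
  day   = c(14, 20, 2, 2)
)

# Unique periods at each level; fewer periods means fewer rasters to process.
n_year <- nrow(unique(records["year"]))
n_day  <- nrow(unique(records[c("year", "month", "day")]))
c(year = n_year, day = n_day)
```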
Aggregation factor
agg.factor represents the factor by which to aggregate RasterLayer data using the function aggregate in the terra R package (Hijmans et al., 2022). Aggregation uses GEE.math.fun as the function. Following aggregation, spatial buffering using the moving window matrix occurs. This is included to minimise computing time if data are of high spatial resolution and a large spatial buffer is needed. Ensure that get_moving_window() is run with the spatial resolution of the data post-aggregation by this factor.
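As a rough worked example of the post-aggregation resolution, the arithmetic below uses the values from the Examples section. The conversion to degrees assumes about 111,320 m per degree, which only holds approximately (for latitude, and for longitude near the equator).

```r
spatial.res.metres <- 500
agg.factor <- 12

# Cell size after aggregation, in metres.
res_after_agg <- spatial.res.metres * agg.factor

# Rough conversion to degrees (assumption: ~111,320 m per degree).
res_after_agg_degrees <- res_after_agg / 111320

res_after_agg
round(res_after_agg_degrees, 3)
```

The degree value is the kind of figure that would be passed as spatial.res.degrees when generating the moving window matrix.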
Google Earth Engine
extract_buffered_coords() requires users to have installed the R package rgee (Aybar et al., 2020) and initialised Google Earth Engine with valid login credentials. Please follow the instructions on the following website: https://cran.r-project.org/package=rgee
datasetname must be in the accepted Google Earth Engine catalogue layout (e.g. “MODIS/006/MCD12Q1” or “UCSB-CHG/CHIRPS/DAILY”).
bandname must be as specified under the dataset in the Google Earth Engine catalogue (e.g. “LC_Type5”, “precipitation”). For datasets and band names, see https://developers.google.com/earth-engine/datasets.
Google Drive
extract_buffered_coords() also requires users to have installed the R package googledrive (D'Agostino McGowan and Bryan, 2022) and initialised Google Drive with valid login credentials, which must be stated using argument user.email. Please follow instructions on https://googledrive.tidyverse.org/ for initialising the googledrive package.
Note: When running this function, a folder labelled "dynamicSDM_download_bucket" will be created in your Google Drive. This will be emptied once the function has finished running and output rasters will be found in the save.drive.folder or save.directory specified.
Exporting extracted data
For save.method = combined, the function will save “csv” files containing all occurrence records and associated values for the explanatory variable.
For save.method = split, the function will save individual “csv” files for the records in each unique period of the given temporal.level (e.g. each year, each year and month combination or each unique date).
split protects users if the internet connection is lost when extracting data for large occurrence datasets. The argument resume can be used to resume from previous progress if the connection is lost.
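Files written with the split method can be stacked back into one data frame with base R. The sketch below writes two stand-in files first so that it runs on its own; the directory name, file names and column names are assumptions and may not match what the function actually writes.

```r
# Stand-in directory and files; real file and column names may differ.
out_dir <- file.path(tempdir(), "split_demo")
dir.create(out_dir, showWarnings = FALSE)
write.csv(data.frame(x = 31.5, y = -24.1, year = 2005, value = 4),
          file.path(out_dir, "2005.csv"), row.names = FALSE)
write.csv(data.frame(x = 28.2, y = -19.8, year = 2010, value = 7),
          file.path(out_dir, "2010.csv"), row.names = FALSE)

# Read every csv in the directory and stack the rows into one data frame.
csv_files <- list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(csv_files, read.csv))
nrow(combined)
```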
References
Aybar, C., Wu, Q., Bautista, L., Yali, R. and Barja, A., 2020. rgee: An R package for interacting with Google Earth Engine. Journal of Open Source Software, 5(51), p.2272.
D'Agostino McGowan L., and Bryan J., 2022. googledrive: An Interface to Google Drive. https://googledrive.tidyverse.org, https://github.com/tidyverse/googledrive.
Hijmans, R.J., Bivand, R., Forner, K., Ooms, J., Pebesma, E. and Sumner, M.D., 2022. Package 'terra'. Maintainer: Vienna, Austria.
Examples
data(sample_filt_data)

# Email address of the account already authorised for Google Drive
user.email <- as.character(gargle::gargle_oauth_sitrep()$email)

# Moving window matrix for a 10 km radius at 0.05 degree resolution
matrix <- get_moving_window(radial.distance = 10000,
                            spatial.res.degrees = 0.05,
                            spatial.ext = sample_extent_data)

extract_buffered_coords(occ.data = sample_filt_data,
                        datasetname = "MODIS/006/MCD12Q1",
                        bandname = "LC_Type5",
                        spatial.res.metres = 500,
                        GEE.math.fun = "sum",
                        moving.window.matrix = matrix,
                        user.email = user.email,
                        save.method = "split",
                        temporal.level = "year",
                        categories = c(6, 7),
                        agg.factor = 12,
                        varname = "total_grass_crop_lc",
                        save.directory = tempdir()
)