rfsi {meteo} | R Documentation |
Random Forest Spatial Interpolation (RFSI) model
Description
Creates a Random Forest Spatial Interpolation (RFSI) model (Sekulić et al. 2020). Besides environmental covariates, RFSI uses additional spatial covariates, (1) observations at the n nearest locations and (2) distances to them, to bring spatial context into the random forest.
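To make the two spatial covariates concrete, the following base R sketch builds them by hand for a toy data set. It is only an illustration under stated assumptions, not the package's implementation (the package uses near.obs for this step): the helper name nearest_obs_sketch is hypothetical, distances are plain Euclidean, and each location is excluded from its own neighbour set.

```r
# Hypothetical helper (not part of meteo): for each location, return the
# observed values at its n nearest *other* locations and the distances to them.
nearest_obs_sketch <- function(xy, obs, n = 2) {
  d <- as.matrix(dist(xy))   # pairwise Euclidean distances
  diag(d) <- Inf             # assumption: a point is not its own neighbour
  out <- t(vapply(seq_len(nrow(d)), function(i) {
    idx <- order(d[i, ])[seq_len(n)]      # indices of the n nearest locations
    c(obs[idx], unname(d[i, idx]))        # their values, then their distances
  }, numeric(2 * n)))
  colnames(out) <- c(paste0("obs", seq_len(n)), paste0("dist", seq_len(n)))
  out
}

# Toy data: four stations on a line, with one observation each.
xy  <- cbind(x = c(0, 1, 3, 7), y = c(0, 0, 0, 0))
obs <- c(10, 20, 30, 40)

covs <- nearest_obs_sketch(xy, obs, n = 2)
covs
```

Each row of covs holds the covariates for one station, ordered from nearest to farthest neighbour. Columns of this kind are what a random forest would receive alongside the environmental covariates.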
Usage
rfsi(formula,
     data,
     data.staid.x.y.z = NULL,
     n.obs = 5,
     avg = FALSE,
     increment = 10000,
     range = 50000,
     quadrant = FALSE,
     use.idw = FALSE,
     idw.p = 2,
     s.crs = NA,
     p.crs = NA,
     cpus = detectCores()-1,
     progress = TRUE,
     soil3d = FALSE,
     depth.range = 0.1,
     no.obs = 'increase',
     ...)
Arguments
formula |
formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains the target variable (observations) and covariates used for making an RFSI model. If a data.frame object, it should have the following columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest observations and distances to them as covariates ( |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
n.obs |
numeric; Number of nearest observations to be used as covariates in the RFSI model (see function near.obs). Note that it cannot be larger than the number of observations. Default is 5. |
avg |
boolean; Averages in circles covariate: whether averages in circles with different radii are calculated (see function near.obs). Default is FALSE. |
increment |
numeric; Increment of radii for calculating averages in circles with different radii (see function near.obs). Units depend on the CRS. |
range |
numeric; Maximum radius for calculating averages in circles with different radii (see function near.obs). Units depend on the CRS. |
quadrant |
boolean; Nearest observations in quadrants covariate: whether the nearest observations in quadrants are calculated (see function near.obs). Default is FALSE. |
use.idw |
boolean; IDW prediction as covariate - will IDW predictions from |
idw.p |
numeric; Exponent parameter for IDW weights (see function near.obs). Default is 2. |
s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; Whether a progress bar is shown. Default is TRUE. |
soil3d |
logical; Whether 3D soil modelling is performed and the near.obs.soil function is used for finding the n nearest observations and distances to them. In this case, z position of the |
depth.range |
numeric; Depth range around the location mid depth within which nearest observations are searched (see function near.obs.soil), in the same units as the mid depth. Default is 0.1. |
no.obs |
character; Possible values are |
... |
Further arguments passed to ranger, such as |
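As an illustration of what the idw.p exponent controls, here is a minimal base R sketch of a standard inverse distance weighted (IDW) prediction from a set of neighbouring observations. The function name idw_sketch is hypothetical and this is not the code used by near.obs. It only shows the usual weighting scheme, in which weights are 1 / distance^p.

```r
# Hypothetical helper (not part of meteo): standard IDW prediction from
# neighbouring observations `obs` at distances `dist`, with exponent `p`.
idw_sketch <- function(obs, dist, p = 2) {
  w <- 1 / dist^p          # closer observations get larger weights
  sum(w * obs) / sum(w)    # weighted mean of the observations
}

# Two neighbours: value 10 at distance 1 and value 20 at distance 2.
idw_p2 <- idw_sketch(obs = c(10, 20), dist = c(1, 2), p = 2)
idw_p1 <- idw_sketch(obs = c(10, 20), dist = c(1, 2), p = 1)
c(p2 = idw_p2, p1 = idw_p1)
```

A larger exponent pulls the prediction towards the nearest observation, which is why the result for p = 2 lies closer to 10 than the result for p = 1.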
Value
RFSI model of class ranger.
Note
Observations should be in a projected CRS, because nearest observations are found based on Euclidean distances (see function near.obs). If the CRS is not specified in the data object or through the s.crs parameter, the coordinates will be used as they are, as if they were already projected. Use s.crs and p.crs if the coordinates of the data object are in lon/lat (WGS84).
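The following base R sketch illustrates why unprojected lon/lat coordinates are a poor basis for Euclidean distances. It compares the distance in raw degrees with the great-circle distance for two pairs of points. The haversine_m helper is hypothetical and assumes a spherical Earth with a mean radius of 6371 km. It is not part of the package.

```r
# Hypothetical helper: great-circle (haversine) distance in metres between
# lon/lat points given in degrees, assuming a sphere of radius r.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(a))
}

# Two pairs of points, each one degree of longitude apart.
d_equator <- haversine_m(0, 0, 1, 0)     # at the equator
d_60n     <- haversine_m(0, 60, 1, 60)   # at 60 degrees north

# Euclidean distance on raw degrees is identical for both pairs ...
deg_equator <- sqrt((1 - 0)^2 + (0 - 0)^2)
deg_60n     <- sqrt((1 - 0)^2 + (60 - 60)^2)

# ... while the ground distance at 60N is roughly half that at the equator.
ratio <- d_60n / d_equator
ratio
```

Because a nearest-neighbour search on degrees treats both pairs as equally far apart, supplying s.crs and p.crs so that distances are computed in a projected CRS avoids this distortion.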
Author(s)
Aleksandar Sekulic asekulic@grf.bg.ac.rs
References
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
See Also
near.obs
pred.rfsi
tune.rfsi
cv.rfsi
Examples
library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)
# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")
# fit the RFSI model
rfsi_model <- rfsi(formula = fm.RFSI,
                   data = data, # meuse.df (use data.staid.x.y.z)
                   n.obs = 5, # number of nearest observations
                   cpus = 2, # detectCores()-1,
                   progress = TRUE,
                   # ranger parameters
                   importance = "impurity",
                   seed = 42,
                   num.trees = 250,
                   mtry = 5,
                   splitrule = "variance",
                   min.node.size = 5,
                   sample.fraction = 0.95,
                   quantreg = FALSE)
rfsi_model
# OOB prediction error (MSE): 47758.14
# R squared (OOB): 0.6435869
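# variable importance, sorted in ascending order
sort(rfsi_model$variable.importance)
# count the covariates whose names start with "obs" (nearest observations)
sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3))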