tune.rfsi {meteo} | R Documentation |
Tuning of Random Forest Spatial Interpolation (RFSI) model
Description
Tunes a Random Forest Spatial Interpolation (RFSI) model using k-fold leave-location-out cross-validation (Sekulić et al. 2020).
Usage
tune.rfsi(formula,
data,
data.staid.x.y.z = NULL,
use.idw = FALSE,
s.crs = NA,
p.crs = NA,
tgrid,
tgrid.n = 10,
tune.type = "LLO",
k = 5,
seed = 42,
folds,
acc.metric,
fit.final.model = TRUE,
cpus = detectCores()-1,
progress = TRUE,
soil3d = FALSE,
no.obs = 'increase',
...)
Arguments
formula |
formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains the target variable (observations) and covariates used for making an RFSI model. If it is a data.frame object, it should have the following columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest observations and distances to them as covariates ( |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
use.idw |
boolean; IDW prediction as covariate - will IDW predictions from |
s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
tgrid |
data.frame; Possible tuning parameters. The column names are the same as the tuning parameters. Possible tuning parameters are: |
tgrid.n |
numeric; Number of randomly chosen |
tune.type |
character; Type of cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO") - TO DO, and leave-location-time-out ("LLTO") - TO DO. Default is "LLO". |
k |
numeric; Number of random folds that will be created with CreateSpacetimeFolds function if |
seed |
numeric; Random seed that will be used to generate folds with CreateSpacetimeFolds function. |
folds |
numeric or character vector or value; Showing folds column (if value) or rows (vector) of |
acc.metric |
character; Accuracy metric that will be used as the criterion for choosing an optimal RFSI model. Possible values for regression: "ME", "MAE", "NMAE", "RMSE" (default), "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy", "Kappa" (default), "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue". |
fit.final.model |
boolean; Fit the final RFSI model. Default is TRUE. |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; Whether a progress bar is shown. Default is TRUE. |
soil3d |
logical; Whether 3D soil modelling is performed, with the near.obs.soil function used for finding the n nearest observations and distances to them. In this case, z position of the |
no.obs |
character; Possible values are |
... |
Further arguments passed to ranger. |
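With tune.type = "LLO", all observations from one station must end up in the same fold, so a station never appears in both the training and the validation part of a split. The block below is a minimal base R sketch of that idea only. The helper llo_folds and the station IDs are hypothetical and not part of meteo; according to the argument descriptions above, the function itself creates folds with CreateSpacetimeFolds.

```r
# Hypothetical helper (not part of meteo): give every station one fold label
# and copy that label to all of the station's observations.
llo_folds <- function(staid, k = 5, seed = 42) {
  set.seed(seed)
  stations <- unique(as.character(staid))
  # Deal fold labels 1..k out evenly, then shuffle them across stations
  fold_of_station <- sample(rep_len(seq_len(k), length(stations)))
  names(fold_of_station) <- stations
  unname(fold_of_station[as.character(staid)])
}

# Toy input: 6 made-up stations with 3 observations each
staid <- rep(paste0("st", 1:6), each = 3)
folds <- llo_folds(staid, k = 3)

# Each row (station) has non-zero counts in exactly one column (fold)
table(staid, folds)
```

Keeping the assignment at station level is what distinguishes this from an ordinary random k-fold split over rows.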
Value
A list with elements:
combinations |
data.frame; All tuned parameter combinations with chosen accuracy metric value. |
tuned.parameters |
numeric vector; Tuned parameters with chosen accuracy metric value. |
final.model |
ranger; Final RFSI model (if |
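The relation between combinations and tuned.parameters can be illustrated with a small base R sketch: the tuned parameters correspond to the row of the combinations table with the best value of the chosen metric. The table values and the helper best_row below are made up for illustration, and the direction of optimisation is passed explicitly because it depends on the metric (for example, RMSE is minimised while R2 is maximised).

```r
# Toy table shaped like the `combinations` element; all numbers are made up
combos <- data.frame(
  min.node.size = c(2, 5, 8),
  mtry          = c(3, 4, 5),
  n.obs         = c(2, 4, 6),
  RMSE          = c(250.1, 231.7, 240.3)
)

# Hypothetical helper: return the row with the best metric value.
# `maximise` must be set by the caller to match the metric.
best_row <- function(combos, metric, maximise = FALSE) {
  values <- combos[[metric]]
  idx <- if (maximise) which.max(values) else which.min(values)
  combos[idx, , drop = FALSE]
}

best <- best_row(combos, "RMSE")
best
```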
Author(s)
Aleksandar Sekulic asekulic@grf.bg.ac.rs
References
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
See Also
near.obs
rfsi
pred.rfsi
cv.rfsi
Examples
library(CAST)
library(doParallel)
library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)
# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")
# making tgrid
n.obs <- 1:6
min.node.size <- 2:10
sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement
splitrule <- "variance"
ntree <- 250 # 500
mtry <- 3:(2+2*max(n.obs))
tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree,
mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction)
# Tune RFSI model
rfsi_tuned <- tune.rfsi(formula = fm.RFSI,
data = data,
# data.staid.x.y.z = data.staid.x.y.z, # data.frame
# s.crs = st_crs(data),
# p.crs = st_crs(data),
tgrid = tgrid, # combinations for tuning
tgrid.n = 20, # number of randomly selected combinations from tgrid
tune.type = "LLO", # Leave-Location-Out CV
k = 5, # number of folds
seed = 42,
acc.metric = "RMSE", # R2, CCC, MAE
fit.final.model = TRUE,
cpus = 2, # detectCores()-1,
progress = TRUE,
importance = "impurity") # ranger parameter
rfsi_tuned$combinations
rfsi_tuned$tuned.parameters
#      min.node.size num.trees mtry n.obs sample.fraction     RMSE
# 3701             3       250    6     5            0.75 222.6752
rfsi_tuned$final.model
# OOB prediction error (MSE): 46666.51
# R squared (OOB): 0.6517336
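In the call above, tgrid.n = 20 means that only 20 randomly selected rows of tgrid are evaluated rather than the full grid. The sketch below shows one way such a random subset could be drawn in base R. It uses a deliberately small grid, and the object names tgrid_small, tgrid_n and picked are illustrative; how tune.rfsi draws its sample internally may differ.

```r
# A small illustrative grid (27 combinations), built like tgrid above
tgrid_small <- expand.grid(min.node.size = 2:4,
                           num.trees = 250,
                           mtry = 3:5,
                           n.obs = 1:3)
nrow(tgrid_small)

# Draw a reproducible random subset of rows, without replacement
tgrid_n <- 5
set.seed(42)
picked <- tgrid_small[sample(nrow(tgrid_small), tgrid_n), ]
picked
```

Random subsetting keeps tuning time bounded when the full grid is large, at the cost of possibly missing the best combination.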