dataSplit {DSAM} | R Documentation |
Main function of data splitting algorithm
Description
'DSAM' interface function: The user needs to provide a parameter list before data-splitting.
These parameters have default values, with details given in the par.default
function.
Conditioned on the parameter list, this function carries out the data-splitting based on the algorithm specified by the user.
The available algorithms include the traditional time-consecutive method (TIMECON), DUPLEX, MDUPLEX, SOMPLEX, SBSS.P, and SS.
The algorithm details can be found in Chen et al. (2022). Note that this package deals with datasets that have multiple inputs but a single output,
where this output is used to enable the application of various data-splitting algorithms.
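To give a feel for the simplest of the listed methods, the following is a minimal base-R sketch of a time-consecutive split, in which rows are assigned to subsets purely in their original order. It is not the DSAM implementation: the function name `timecon_sketch`, the element names of the returned list, and the 60/20/20 fractions are all assumptions made for illustration.

```r
# Illustrative time-consecutive split; NOT the DSAM implementation.
# `timecon_sketch`, its argument names and default fractions are hypothetical.
timecon_sketch <- function(data, frac_train = 0.6, frac_test = 0.2) {
  stopifnot(frac_train > 0, frac_test > 0, frac_train + frac_test < 1)
  n <- nrow(data)
  n_train <- floor(frac_train * n)
  n_test  <- floor(frac_test * n)
  # Every subset must receive at least one row
  stopifnot(n_train >= 1, n_test >= 1, n_train + n_test < n)
  list(
    train      = data[seq_len(n_train), , drop = FALSE],           # first block
    test       = data[n_train + seq_len(n_test), , drop = FALSE],  # next block
    validation = data[(n_train + n_test + 1):n, , drop = FALSE]    # remainder
  )
}

# Toy data: index column, one input, one output
demo <- data.frame(idx = 1:10, x = (1:10) / 10, y = (1:10)^2)
parts <- timecon_sketch(demo)
```

The other algorithms named above choose points based on the data rather than on row order; see Chen et al. (2022) for their details.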
Usage
dataSplit(data, control = list(), ...)
Arguments
data |
The dataset should be a matrix or data.frame. The format should be as follows: Column 1 is a subscript vector used to mark each data point (each row is considered as a data point); Columns 2 to N-1 are the input data, and Column N is the output data. |
control |
User-defined parameter list, where each parameter definition refers to the |
... |
A redundant argument list. |
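As an illustration of the layout expected for the data argument, the sketch below builds a small data frame with an index column, two input columns, and one output column, then runs a few basic layout checks. The object name `toy`, the column names, and the synthetic values are all hypothetical; only the column ordering follows the description above.

```r
# Hypothetical toy dataset in the layout described for `data`:
# column 1 = index, columns 2..N-1 = inputs, column N = output.
set.seed(1)
n <- 20
toy <- data.frame(
  idx = seq_len(n),  # subscript marking each data point
  x1  = runif(n),    # input 1
  x2  = rnorm(n)     # input 2
)
# Output goes in the last column (synthetic relationship, for illustration only)
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(n, sd = 0.1)

# Basic layout checks before splitting
stopifnot(is.data.frame(toy) || is.matrix(toy))
stopifnot(ncol(toy) >= 3)                   # index + at least one input + output
stopifnot(anyDuplicated(toy[[1]]) == 0)     # each row has a unique index
```

A data frame shaped like this could then be passed as the first argument, for example dataSplit(toy), following the Usage section.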
Value
Returns the training, test and validation subsets. If the original data only need to be split into two subsets, the training and test subsets can be combined into a single calibration subset.
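The combination of training and test subsets into a calibration subset can be sketched in base R as follows. The list `split_sketch` is a hand-built stand-in for a three-way split, and its element names are assumptions; check the actual structure returned by dataSplit (for example with str()) before adapting this.

```r
# Hypothetical stand-in for a three-way split result; the element names
# are assumptions, not the documented return structure of dataSplit().
split_sketch <- list(
  train      = data.frame(idx = 1:6,  x = (1:6) / 10,  y = (1:6)^2),
  test       = data.frame(idx = 7:8,  x = (7:8) / 10,  y = (7:8)^2),
  validation = data.frame(idx = 9:10, x = (9:10) / 10, y = (9:10)^2)
)

# Stack training and test rows into a single calibration subset
calibration <- rbind(split_sketch$train, split_sketch$test)

# Two-way split: calibration plus the untouched validation subset
two_way <- list(calibration = calibration,
                validation  = split_sketch$validation)
```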
Author(s)
Feifei Zheng feifeizheng@zju.edu.cn
Junyi Chen jun1chen@zju.edu.cn
References
Chen, J., Zheng, F., May, R., Guo, D., Gupta, H., and Maier, H. R. (2022). Improved data splitting methods for data-driven hydrological model development based on a large number of catchment samples, Journal of Hydrology, 613.
Zheng, F., Chen, J., Maier, H. R., and Gupta, H. (2022). Achieving Robust and Transferable Performance for Conservation-Based Models of Dynamical Physical Systems, Water Resources Research, 58(5).
Zheng, F., Chen, J., Ma, Y., Chen, Q., Maier, H. R., and Gupta, H. (2023). A Robust Strategy to Account for Data Sampling Variability in the Development of Hydrological Models, Water Resources Research, 59(3).
Examples
data("DSAM_test_smallData")
res.sml = dataSplit(DSAM_test_smallData)
data("DSAM_test_modData")
res.mod = dataSplit(DSAM_test_modData, list(sel.alg = "SBSS.P"))
data("DSAM_test_largeData")
res.lag = dataSplit(DSAM_test_largeData, list(sel.alg = "SOMPLEX"))