cv_cluster {stressor} | R Documentation |
Spatial Cluster-Based Partitions for Cross-Validation
Description
This function partitions a sample space into clusters using k-means clustering. It includes algorithms that attempt to produce clusters of roughly equal size.
Usage
cv_cluster(features, k, k_mult = 5, ...)
Arguments
features |
A scaled matrix of features to be used in the clustering. Scaling is usually done with scale, and the matrix should not include the response variable. |
k |
The number of partitions for k-fold cross-validation. |
k_mult |
k*k_mult determines the number of subgroups that will be created as part of the balancing algorithm. |
... |
Additional arguments passed to kmeans as needed. |
Details
For more information on spatial cross-validation, see the explanation in Robin Lovelace's textbook.
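The balancing idea described under k_mult can be sketched roughly as follows. This is an illustrative guess rather than the package's actual implementation: the helper name balanced_cluster_folds and the greedy merge rule (largest subgroup first, into the currently smallest fold) are assumptions made for this sketch. It also ignores how close the subgroups are to one another, which a real spatial method would likely take into account.

```r
# Hypothetical helper for illustration only; not the code used by cv_cluster().
balanced_cluster_folds <- function(features, k, k_mult = 5, ...) {
  n_sub <- k * k_mult
  # Step 1: over-cluster the rows into k * k_mult small subgroups.
  sub <- stats::kmeans(features, centers = n_sub, ...)$cluster
  counts <- tabulate(sub, nbins = n_sub)  # number of rows in each subgroup

  # Step 2 (assumed rule): visit subgroups from largest to smallest and
  # add each one to the fold that currently holds the fewest rows.
  fold_of_sub <- integer(n_sub)
  fold_size <- integer(k)
  for (s in order(counts, decreasing = TRUE)) {
    target <- which.min(fold_size)
    fold_of_sub[s] <- target
    fold_size[target] <- fold_size[target] + counts[s]
  }

  # Map every row to the fold of its subgroup.
  fold_of_sub[sub]
}

set.seed(1)
x <- scale(matrix(rnorm(200), ncol = 2))
folds <- balanced_cluster_folds(x, k = 5, k_mult = 4)
table(folds)
```

A larger k_mult gives the merge step smaller pieces to work with, so fold sizes can be evened out more finely, at the cost of folds that may be less compact.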
Value
An integer vector with one element per row of features, giving the index of the group to which each row is assigned.
Examples
# Creating a matrix of predictor variables
x_data <- base::scale(data_gen_lm(30)[, -1])
groups <- cv_cluster(x_data, 5, k_mult = 5)
groups
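A vector of group indices like the one returned above is typically used to hold out one group at a time. The following minimal sketch shows that loop in base R. To stay self-contained it simulates its own data and uses plain k-means labels as a stand-in for the output of cv_cluster; the data, the linear model, and the RMSE summary are all assumptions chosen for illustration.

```r
set.seed(42)
n <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n, sd = 0.5)

# Stand-in for cv_cluster(): plain k-means labels on the scaled features.
feats <- scale(dat[, c("x1", "x2")])
groups <- stats::kmeans(feats, centers = 5)$cluster

# Hold out each group in turn, fit on the rest, predict the held-out rows.
preds <- numeric(n)
for (g in sort(unique(groups))) {
  test <- groups == g
  fit <- lm(y ~ x1 + x2, data = dat[!test, ])
  preds[test] <- predict(fit, newdata = dat[test, ])
}

# Cross-validated root mean squared error.
rmse <- sqrt(mean((dat$y - preds)^2))
rmse
```

Because each held-out group is a cluster in feature space, the model is evaluated on rows unlike those it was trained on, which tends to give a more conservative error estimate than random folds.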