cv_cluster {stressor}    R Documentation

Spatial Cluster-Based Partitions for Cross-Validation

Description

This function partitions a sample space into cross-validation folds using k-means clustering. It includes algorithms that attempt to produce clusters of roughly equal size.

Usage

cv_cluster(features, k, k_mult = 5, ...)

Arguments

features

A scaled matrix of features to be used in the clustering. Scaling is usually done with scale, and the matrix should not include the response variable.

k

The number of partitions for k-fold cross-validation.

k_mult

A multiplier for k. The balancing algorithm creates k * k_mult subgroups.

...

Additional arguments passed to kmeans as needed.

Details

For more information on spatial cross-validation, see Robin Lovelace's explanation in his textbook.
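The balancing step is described here only through k and k_mult, so the following is a hypothetical sketch of one way such a step could work, not the code used by cv_cluster. It assumes the k * k_mult subgroups come from an initial k-means run and are then merged into k folds by placing each subgroup in the nearest fold that still has room. The function name balance_clusters, the capacity rule, and the largest-first ordering are all assumptions made for illustration; only base R's kmeans is used.

```r
# Hypothetical sketch of one balancing strategy; not the stressor source code.
balance_clusters <- function(features, k, k_mult = 5, ...) {
  n <- nrow(features)
  n_sub <- k * k_mult

  # Step 1: over-cluster the rows into k * k_mult small subgroups.
  sub <- kmeans(features, centers = n_sub, ...)
  sizes <- tabulate(sub$cluster, nbins = n_sub)

  # Step 2: cluster the subgroup centers to obtain k fold centers.
  fold_centers <- kmeans(sub$centers, centers = k)$centers

  # Squared distance from every subgroup center to every fold center
  # (rows are subgroups, columns are folds).
  d <- sapply(seq_len(k), function(j) {
    rowSums(sweep(sub$centers, 2, fold_centers[j, ])^2)
  })

  # Step 3: place subgroups, largest first, in the nearest fold that still
  # has room; if no fold has room, use the currently smallest fold.
  cap <- ceiling(n / k)
  fold_size <- integer(k)
  fold_of_sub <- integer(n_sub)
  for (s in order(sizes, decreasing = TRUE)) {
    has_room <- which(fold_size + sizes[s] <= cap)
    if (length(has_room) > 0) {
      j <- has_room[which.min(d[s, has_room])]
    } else {
      j <- which.min(fold_size)
    }
    fold_of_sub[s] <- j
    fold_size[j] <- fold_size[j] + sizes[s]
  }

  # Map each row's subgroup label to its fold label.
  fold_of_sub[sub$cluster]
}

set.seed(1)
features <- scale(matrix(rnorm(400), ncol = 2))  # 200 rows, 2 scaled features
folds <- balance_clusters(features, k = 5, k_mult = 5,
                          nstart = 5, iter.max = 50)
table(folds)  # fold sizes; compare with table(kmeans(features, 5)$cluster)
```

In this sketch a larger k_mult gives smaller subgroups, which makes fold sizes easier to even out but lets each fold be assembled from more, possibly less contiguous, pieces.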

Value

An integer vector whose length is the number of rows of features, giving the index of the group to which each row is assigned.
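As a rough illustration of how a vector of group indices like this can drive k-fold cross-validation, the sketch below holds out one group at a time, fits a model on the remaining rows, and predicts the held-out rows. To stay runnable without the package, it simulates its own data and uses plain kmeans labels as a stand-in for the cv_cluster result; the data, the lm model, and the RMSE summary are assumptions made for the example.

```r
set.seed(1)
n <- 100

# Simulated scaled predictors and a response (illustrative data only).
x <- scale(matrix(rnorm(n * 2), ncol = 2,
                  dimnames = list(NULL, c("x1", "x2"))))
y <- 1 + 2 * x[, "x1"] - x[, "x2"] + rnorm(n, sd = 0.5)
dat <- data.frame(y = y, x)

# Stand-in for `groups <- cv_cluster(x, 5)`: any integer vector of fold
# labels with one entry per row is used the same way.
groups <- kmeans(x, centers = 5)$cluster

# Hold out each group in turn, fit on the rest, predict the held-out rows.
pred <- numeric(n)
for (fold in sort(unique(groups))) {
  test <- groups == fold
  fit <- lm(y ~ x1 + x2, data = dat[!test, ])
  pred[test] <- predict(fit, newdata = dat[test, ])
}

# Cross-validated root mean squared error over all rows.
rmse <- sqrt(mean((dat$y - pred)^2))
rmse
```

Because each held-out group occupies its own region of feature space, the model is evaluated on rows that differ from those it was trained on, which is the motivation for cluster-based rather than random folds.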

Examples

 # Creating a matrix of predictor variables
 x_data <- base::scale(data_gen_lm(30)[, -1])
 groups <- cv_cluster(x_data, 5, k_mult = 5)
 groups

[Package stressor version 0.2.0 Index]