gen_friedman {fastshap}    R Documentation

Friedman benchmark data

Description

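Simulate data from the Friedman 1 benchmark problem, originally described in Friedman (1991) and Breiman (1996). In this problem the features are independent and uniformly distributed on [0, 1], and only the first five affect the response. For details, see sklearn.datasets.make_friedman1 in scikit-learn.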
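The generating process can be sketched in base R as follows. This is a minimal illustration of the standard Friedman 1 formula, not the fastshap source: the helper name sim_friedman1 and its arguments n, p, and sigma are hypothetical, and the column names are chosen only for this example.

```r
# Hypothetical helper, not part of fastshap: a base-R sketch of Friedman 1.
sim_friedman1 <- function(n = 100, p = 10, sigma = 0.1) {
  stopifnot(p >= 5)  # the response uses the first five features
  # Features are independent draws from Uniform(0, 1)
  x <- matrix(runif(n * p), nrow = n, ncol = p)
  colnames(x) <- paste0("x", seq_len(p))
  # Only x1..x5 enter the response; any remaining columns are irrelevant features
  y <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5] +
    rnorm(n, sd = sigma)  # additive Gaussian noise
  data.frame(y = y, x)
}

set.seed(101)
d <- sim_friedman1(n = 200, p = 10, sigma = 0.1)
dim(d)
```

Because features beyond the fifth do not enter the formula, data of this kind are often used to check whether a method assigns them little or no importance.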

Usage

gen_friedman(
  n_samples = 100,
  n_features = 10,
  n_bins = NULL,
  sigma = 0.1,
  seed = NULL
)

Arguments

n_samples

Integer specifying the number of samples (i.e., rows) to generate. Default is 100.

n_features

Integer specifying the number of features to generate. Default is 10.

n_bins

Integer specifying the number of (roughly) equal-sized bins to split the response into. Default is NULL for no binning. Setting n_bins to an integer greater than 1 effectively turns this into a classification problem in which n_bins gives the number of classes.

sigma

Numeric specifying the standard deviation of the noise. Default is 0.1.

seed

Integer specifying the random seed. If NULL (the default), the results will be different each time the function is run.
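To illustrate what binning a response into roughly equal-sized classes and fixing a seed can look like, here is a base-R sketch. It assumes quantile-based cut points, which may differ from how gen_friedman bins internally; the helper bin_response is hypothetical.

```r
# Hypothetical helper, not fastshap's code: quantile-based binning of a
# numeric response into n_bins classes.
bin_response <- function(y, n_bins) {
  stopifnot(n_bins > 1)
  # Cut points at evenly spaced quantiles give roughly equal-sized bins
  breaks <- quantile(y, probs = seq(0, 1, length.out = n_bins + 1))
  # labels = FALSE returns integer class codes 1..n_bins
  cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)
}

set.seed(42)
y <- rnorm(300)
classes <- bin_response(y, n_bins = 3)
table(classes)

# Fixing the seed makes random draws reproducible
set.seed(42); a <- runif(3)
set.seed(42); b <- runif(3)
identical(a, b)
```

Quantile cut points keep the classes balanced whatever the scale of the response, which is usually convenient for a classification benchmark.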

Note

This function is mostly used for internal testing.

References

Breiman, Leo (1996) Bagging predictors. Machine Learning 24 (2), pages 123-140.

Friedman, Jerome H. (1991) Multivariate adaptive regression splines. The Annals of Statistics 19 (1), pages 1-67.

Examples

gen_friedman()
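A slightly fuller usage sketch, using only the arguments listed under Usage. It assumes the fastshap package is installed and is skipped otherwise; the object names reg and clf are illustrative.

```r
# Assumes the fastshap package is installed; otherwise nothing is run.
if (requireNamespace("fastshap", quietly = TRUE)) {
  # Regression version: numeric response, fixed seed for reproducibility
  reg <- fastshap::gen_friedman(n_samples = 500, n_features = 10, seed = 101)
  # Classification version: response split into two roughly equal-sized bins
  clf <- fastshap::gen_friedman(n_samples = 500, n_bins = 2, seed = 101)
  # Inspect the structure of each result
  str(reg)
  str(clf)
}
```

Passing the same seed to both calls means the two data sets should be built from the same random draws, so they differ only in how the response is represented.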

[Package fastshap version 0.1.1 Index]