madelon {sbfc}		R Documentation

Madelon data set: synthetic data from NIPS 2003 feature selection challenge

Description

This is a two-class classification problem. The difficulty is that the problem is multivariate and highly non-linear. Of the 500 features, 20 are real features and 480 are noise features.
The data set comes from the UCI repository and has been discretized using median cutoffs.
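
To illustrate what discretizing with median cutoffs can look like, here is a minimal sketch in base R. It runs on a small synthetic matrix, not on the Madelon data, and it assumes that each feature is cut at its own column median and that values strictly above the median map to 1. The exact procedure used to prepare this data set (for example, which rows the medians were computed on, or how ties were handled) is not stated here, so treat these choices as assumptions.

```r
# Synthetic stand-in for continuous features (NOT the Madelon data itself).
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)

# One cutoff per feature: the median of each column.
cutoffs <- apply(X, 2, median)

# sweep() subtracts each column's cutoff from that column;
# values strictly above the cutoff become 1, all others 0 (assumed convention).
X_bin <- (sweep(X, 2, cutoffs) > 0) * 1L

dim(X_bin)      # same shape as X
colSums(X_bin)  # number of values above the median in each column
```

With an even number of distinct values per column, exactly half of them fall above the median, so each binary feature is balanced.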

Usage

data(madelon)

Format

TrainX

A matrix with 2000 rows and 500 columns.

TrainY

A vector of length 2000.

TestX

A matrix with 600 rows and 500 columns.

TestY

A vector of length 600.
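
The components above can be inspected after loading the data. The sketch below assumes that data(madelon) creates a single list named madelon whose elements are TrainX, TrainY, TestX and TestY; that layout is an assumption, so check it with str() before relying on it. The code is skipped if the sbfc package is not installed.

```r
# Run only when the sbfc package is available.
if (requireNamespace("sbfc", quietly = TRUE)) {
  data(madelon, package = "sbfc")

  # Assumption: `madelon` is a list holding the four components.
  str(madelon, max.level = 1)

  # Dimensions of the training and test feature matrices.
  print(dim(madelon$TrainX))
  print(dim(madelon$TestX))

  # Class labels should line up with the rows of the feature matrices.
  stopifnot(length(madelon$TrainY) == nrow(madelon$TrainX))
  stopifnot(length(madelon$TestY) == nrow(madelon$TestX))

  # Class balance in the training labels.
  print(table(madelon$TrainY))
}
```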

References

UCI madelon data set


[Package sbfc version 1.0.3 Index]