pmlb {pmlbr} | R Documentation
pmlb: R interface to the Penn Machine Learning Benchmarks data repository
Description
The PMLB repository contains a curated collection of data sets for evaluating and comparing machine learning algorithms. These data sets cover a range of applications, and include binary/multi-class classification problems and regression problems, as well as combinations of categorical, ordinal, and continuous features. There are approximately 290 data sets included in the PMLB repository and there are no missing values in these data sets.
Details
This R library includes summaries of the classification and regression data sets but does NOT include any of the PMLB data sets themselves. The data sets can be downloaded using the fetch_data function, which is similar to the corresponding PMLB Python function.
See fetch_data and summary_stats for usage examples and further information.
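As a rough illustration of the workflow described above, the following sketch inspects the bundled summaries and then downloads one data set. It assumes the pmlbr package is installed, that summary_stats is available as a data object once the package is attached, that a network connection is present, and that "iris" is a valid PMLB data set name. The helper fetch_or_null is hypothetical and not part of the package; it only guards against a missing package or a failed download.

```r
# Hypothetical helper (not part of pmlbr): returns the data set,
# or NULL if pmlbr is not installed or the download fails.
fetch_or_null <- function(name) {
  if (!requireNamespace("pmlbr", quietly = TRUE)) {
    return(NULL)
  }
  tryCatch(pmlbr::fetch_data(name), error = function(e) NULL)
}

if (requireNamespace("pmlbr", quietly = TRUE)) {
  library(pmlbr)

  # Bundled summary of the data sets; no download is needed for this step.
  # Assumes summary_stats is exposed as a data object by the package.
  print(head(summary_stats))
}

# Download a single data set by name ("iris" is assumed to exist in PMLB).
iris_pmlb <- fetch_or_null("iris")
if (!is.null(iris_pmlb)) {
  str(iris_pmlb)
}
```

Because the data are fetched on demand rather than shipped with the package, the guard keeps the example from failing outright when run offline.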
If you use PMLB in a scientific publication, please consider citing the following paper:
Randal S. Olson, William La Cava, Patryk Orzechowski, Ryan J. Urbanowicz, and Jason H. Moore (2017).
PMLB: a large benchmark suite for machine learning evaluation and comparison.
BioData Mining 10, page 36.
https://biodatamining.biomedcentral.com/articles/10.1186/s13040-017-0154-4
I have no affiliation with the authors of PMLB or the University of Pennsylvania.