mlr3oml-package {mlr3oml} | R Documentation
mlr3oml: Connector Between 'mlr3' and 'OpenML'
Description
Provides an interface to 'OpenML.org' to list and download machine learning data, tasks and experiments. The 'OpenML' objects can be automatically converted to 'mlr3' objects. For a more sophisticated interface with more upload options, see the 'OpenML' package.
Documentation
Start by reading the Large-Scale Benchmarking chapter from the mlr3book.
mlr3 Integration
This package adds the mlr3::Task "oml" and the mlr3::Resampling "oml" to mlr3::mlr_tasks and mlr3::mlr_resamplings, respectively.
For the former you may pass either a data_id or a task_id; the latter requires a task_id.
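As a minimal sketch of the dictionary entries described above, the helper below retrieves two tasks and one resampling through mlr3's sugar functions tsk() and rsmp(). The helper name fetch_oml_example and the IDs 31 and 59 are illustrative assumptions, not part of this page; the calls are wrapped in a function so that nothing is downloaded until it is invoked.

```r
library(mlr3)
library(mlr3oml)

# Hypothetical helper: contacts OpenML only when it is called.
fetch_oml_example <- function(data_id = 31, task_id = 59) {
  list(
    # Task "oml" built from an OpenML data set
    task_from_data = tsk("oml", data_id = data_id),
    # Task "oml" built from an OpenML task
    task_from_task = tsk("oml", task_id = task_id),
    # Resampling "oml" always needs a task_id
    resampling = rsmp("oml", task_id = task_id)
  )
}
```

Calling fetch_oml_example() requires network access to OpenML.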
Furthermore, OpenML objects can be converted to mlr3 objects using the usual S3 generics
such as mlr3::as_task, mlr3::as_learner, mlr3::as_resampling, mlr3::as_resample_result,
mlr3::as_benchmark_result or mlr3::as_data_backend. This allows for a frictionless
integration of OpenML and mlr3.
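The conversion generics might be used roughly as follows. This sketch assumes a package version that exports the constructors odt() and otsk() for OpenML data sets and tasks; the helper name convert_oml_example and the IDs are hypothetical and chosen only for illustration.

```r
library(mlr3)
library(mlr3oml)

# Hypothetical helper: contacts OpenML only when it is called.
convert_oml_example <- function(data_id = 31, task_id = 59) {
  odata <- odt(id = data_id)   # OpenML data set object (assumed constructor)
  otask <- otsk(id = task_id)  # OpenML task object (assumed constructor)
  list(
    backend = as_data_backend(odata),   # data set -> mlr3 DataBackend
    task = as_task(otask),              # OpenML task -> mlr3 Task
    resampling = as_resampling(otask)   # OpenML task -> mlr3 Resampling
  )
}
```

The returned objects can then be passed to the usual mlr3 functions such as resample() or benchmark().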
Options
- mlr3oml.cache: Enables or disables caching globally. If set to FALSE, caching is disabled. If set to TRUE, the cache directory as reported by R_user_dir() is used. Alternatively, you can specify a path on the local file system here. Default is FALSE.
- mlr3oml.api_key: API key to use. All operations supported by this package work without an API key, but you might get rate limited without one. If not set, defaults to the value of the environment variable OPENMLAPIKEY.
- mlr3oml.arff_parser: ARFF parser to use. Defaults to the internal one, which relies on data.table::fread(). Can also be set to "RWeka" for the parser in RWeka.
- mlr3oml.parquet: Enables or disables parquet as the default file format. If set to TRUE, the parquet version of datasets will be used by default. If set to FALSE, the arff version of datasets will be used by default. Note that the OpenML server is still transitioning from arff to parquet and some features will work better with arff. Default is FALSE.
- mlr3oml.retries: An integer defining the number of retries when downloading data from OpenML. If it is NULL, the number of retries is set to 3.
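These options are ordinary R options, so they can be set with base R's options(). The values below are illustrative only; the sketch keeps the previous settings in old so that they can be restored later.

```r
# Set some mlr3oml options for the current session (illustrative values).
old <- options(
  mlr3oml.cache = TRUE,     # use the default cache directory
  mlr3oml.parquet = FALSE,  # keep arff as the default file format
  mlr3oml.retries = 5L      # retry failed downloads up to 5 times
)

# Read an option back
cache_setting <- getOption("mlr3oml.cache")
```

Running options(old) afterwards restores the previous values. To make a setting permanent, the same options() call could be placed in a startup file such as .Rprofile.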
Relevant for developers
- mlr3oml.test_server: The default value for whether to use the OpenML test server. Default is FALSE.
- mlr3oml.test_api_key: API key to use for the test server. If not set, defaults to the value of the environment variable TESTOPENMLAPIKEY.
Logging
The lgr package is used for logging.
To change the threshold, use lgr::get_logger("mlr3oml")$set_threshold().
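For example, to make the package less verbose, the threshold could be raised to "warn". The level name is one of lgr's standard log levels; choosing "warn" here is only an illustration.

```r
# Fetch the package logger and only show warnings and errors.
logger <- lgr::get_logger("mlr3oml")
logger$set_threshold("warn")
```

Passing "info" or "debug" instead makes the output more detailed again.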
Author(s)
Maintainer: Sebastian Fischer sebf.fischer@gmail.com (ORCID)
Authors:
Michel Lang michellang@gmail.com (ORCID)
See Also
Useful links:
Report bugs at https://github.com/mlr-org/mlr3oml/issues