list_oml {mlr3oml} | R Documentation
List Data from OpenML
Description
These functions query data sets, tasks, flows, setups, runs, and evaluation measures from OpenML (https://www.openml.org/search?type=data&sort=runs&status=active) using simple filter criteria.
To find data sets for a specific task type, use list_oml_tasks(), which supports filtering by task type.
Another heuristic for finding possible regression tasks is to search for data sets with zero classes, i.e. by specifying number_classes = 0.
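The heuristic above can be sketched as follows. This is a minimal illustration rather than an official example: it assumes the mlr3oml package is installed and that the OpenML server is reachable, and the limit of 10 is an arbitrary choice to keep the query small.

```r
library(mlr3oml)

# Data sets reported with zero classes are candidates for regression.
# This is only a heuristic: inspect the target of each hit before using it.
regr_candidates = list_oml_data(number_classes = 0L, limit = 10L)

# The result is a data.table with one row per matching data set.
print(regr_candidates)
```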
Usage
list_oml_data(
data_id = NULL,
data_name = NULL,
number_instances = NULL,
number_features = NULL,
number_classes = NULL,
number_missing_values = NULL,
tag = NULL,
limit = limit_default(),
test_server = test_server_default(),
...
)
list_oml_evaluations(
run_id = NULL,
task_id = NULL,
measures = NULL,
tag = NULL,
limit = limit_default(),
test_server = test_server_default(),
...
)
list_oml_flows(
uploader = NULL,
tag = NULL,
limit = limit_default(),
test_server = test_server_default(),
...
)
list_oml_measures(test_server = test_server_default())
list_oml_runs(
run_id = NULL,
task_id = NULL,
tag = NULL,
flow_id = NULL,
limit = limit_default(),
test_server = test_server_default(),
...
)
list_oml_setups(
flow_id = NULL,
setup_id = NULL,
tag = NULL,
limit = limit_default(),
test_server = test_server_default(),
...
)
list_oml_tasks(
task_id = NULL,
data_id = NULL,
number_instances = NULL,
number_features = NULL,
number_classes = NULL,
number_missing_values = NULL,
tag = NULL,
limit = limit_default(),
test_server = test_server_default(),
type = NULL,
...
)
Arguments
data_id |
( |
data_name |
( |
number_instances |
( |
number_features |
( |
number_classes |
( |
number_missing_values |
( |
tag |
( |
limit |
( |
test_server |
( |
... |
(any) |
run_id |
( |
task_id |
( |
measures |
( |
uploader |
( |
flow_id |
( |
setup_id |
( |
type |
( |
Details
Filter values are usually provided as single atomic values (typically integer or character).
Provide a numeric vector of length 2 (c(l, u)) to find matches in the range [l, u].
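As a rough sketch of both filter styles, the call below combines range filters with a single-value filter. The concrete bounds are arbitrary illustration values, and the snippet assumes an installed mlr3oml and a reachable OpenML server.

```r
library(mlr3oml)

# Range filters: between 100 and 1000 instances and between 5 and 20 features.
# Single-value filter: exactly 2 classes.
small_binary = list_oml_data(
  number_instances = c(100L, 1000L),
  number_features = c(5L, 20L),
  number_classes = 2L,
  limit = 10L
)

print(small_binary)
```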
Note that only a subset of filters is exposed here.
For a more feature-complete package, see OpenML.
Alternatively, you can pass additional filters via ... using the names of the official API, cf. the REST tab of https://www.openml.org/apis.
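A hedged sketch of passing an extra filter through ... is shown below. It assumes that "status" (with the value "active") is a filter name accepted by the OpenML REST API for data listings; check the REST tab linked above before relying on it. It also assumes mlr3oml is installed and the server is reachable.

```r
library(mlr3oml)

# `status` is not a named argument of list_oml_data(), so it is forwarded via `...`.
# Assumption: the official API accepts a "status" filter for data listings.
active_data = list_oml_data(status = "active", limit = 10L)

print(active_data)
```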
Value
(data.table()) of results, or a null data.table if no data set matches the filter criteria.
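Because an empty query yields a null data.table rather than an error, callers may want to check the row count before processing results. The sketch below assumes an installed mlr3oml and a reachable OpenML server; the data set name is a hypothetical placeholder that is assumed not to exist on OpenML.

```r
library(mlr3oml)

# Hypothetical name, chosen so that the query is unlikely to match anything.
res = list_oml_data(data_name = "no_such_data_set_name_xyz", limit = 10L)

# A null data.table has zero rows, so nrow() is a simple emptiness check.
if (nrow(res) == 0L) {
  message("No data sets matched the filter criteria.")
} else {
  print(res)
}
```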
References
Casalicchio G, Bossek J, Lang M, Kirchhoff D, Kerschke P, Hofner B, Seibold H, Vanschoren J, Bischl B (2017). “OpenML: An R Package to Connect to the Machine Learning Platform OpenML.” Computational Statistics, 1–15. doi:10.1007/s00180-017-0742-2.
Vanschoren J, van Rijn JN, Bischl B, Torgo L (2014). “OpenML.” ACM SIGKDD Explorations Newsletter, 15(2), 49–60. doi:10.1145/2641190.2641198.
Examples
# For technical reasons, examples cannot be included in this R package.
# Instead, these are some relevant resources:
#
# Large-Scale Benchmarking chapter in the mlr3book:
# https://mlr3book.mlr-org.com/chapters/chapter11/large-scale_benchmarking.html
#
# Package Article:
# https://mlr3oml.mlr-org.com/articles/tutorial.html