fit_best {tune} | R Documentation |
Fit a model to the numerically optimal configuration
Description
fit_best()
takes the results from model tuning and fits it to the training
set using tuning parameters associated with the best performance.
Usage
fit_best(x, ...)
## Default S3 method:
fit_best(x, ...)
## S3 method for class 'tune_results'
fit_best(
x,
...,
metric = NULL,
eval_time = NULL,
parameters = NULL,
verbose = FALSE,
add_validation_set = NULL
)
Arguments
x |
The results of class |
... |
Not currently used, must be empty. |
metric |
A character string (or |
eval_time |
A single numeric time point where dynamic event time
metrics should be chosen (e.g., the time-dependent ROC curve, etc). The
values should be consistent with the values used to create |
parameters |
An optional 1-row tibble of tuning parameter settings, with
a column for each tuning parameter. This tibble should have columns for each
tuning parameter identifier (e.g. |
verbose |
A logical for printing logging. |
add_validation_set |
When the resamples embedded in |
Details
This function is a shortcut for the manual steps of:
best_param <- select_best(tune_results, metric) # or other `select_*()` wflow <- finalize_workflow(wflow, best_param) # or just `finalize_model()` wflow_fit <- fit(wflow, data_set)
Value
A fitted workflow.
Case Weights
Some models can utilize case weights during training. tidymodels currently supports two types of case weights: importance weights (doubles) and frequency weights (integers). Frequency weights are used during model fitting and evaluation, whereas importance weights are only used during fitting.
To know if your model is capable of using case weights, create a model spec
and test it using parsnip::case_weights_allowed()
.
To use them, you will need a numeric column in your data set that has been
passed through either hardhat:: importance_weights()
or
hardhat::frequency_weights()
.
For functions such as fit_resamples()
and the tune_*()
functions, the
model must be contained inside of a workflows::workflow()
. To declare that
case weights are used, invoke workflows::add_case_weights()
with the
corresponding (unquoted) column name.
From there, the packages will appropriately handle the weights during model fitting and (if appropriate) performance estimation.
See also
last_fit()
is closely related to fit_best()
. They both
give you access to a workflow fitted on the training data but are situated
somewhat differently in the modeling workflow. fit_best()
picks up
after a tuning function like tune_grid()
to take you from tuning results
to fitted workflow, ready for you to predict and assess further. last_fit()
assumes you have made your choice of hyperparameters and finalized your
workflow to then take you from finalized workflow to fitted workflow and
further to performance assessment on the test data. While fit_best()
gives
a fitted workflow, last_fit()
gives you the performance results. If you
want the fitted workflow, you can extract it from the result of last_fit()
via extract_workflow().
Examples
library(recipes)
library(rsample)
library(parsnip)
library(dplyr)
data(meats, package = "modeldata")
meats <- meats %>% select(-water, -fat)
set.seed(1)
meat_split <- initial_split(meats)
meat_train <- training(meat_split)
meat_test <- testing(meat_split)
set.seed(2)
meat_rs <- vfold_cv(meat_train, v = 10)
pca_rec <-
recipe(protein ~ ., data = meat_train) %>%
step_normalize(all_numeric_predictors()) %>%
step_pca(all_numeric_predictors(), num_comp = tune())
knn_mod <- nearest_neighbor(neighbors = tune()) %>% set_mode("regression")
ctrl <- control_grid(save_workflow = TRUE)
set.seed(128)
knn_pca_res <-
tune_grid(knn_mod, pca_rec, resamples = meat_rs, grid = 10, control = ctrl)
knn_fit <- fit_best(knn_pca_res, verbose = TRUE)
predict(knn_fit, meat_test)