train_smooth_data {gcplyr} | R Documentation |
Test efficacy of different smoothing parameters
Description
This function is based on caret::train, which runs models
(in our case different smoothing algorithms) on data across different
parameter values (in our case different smoothness parameters).
Usage
train_smooth_data(
...,
x = NULL,
y = NULL,
sm_method,
preProcess = NULL,
weights = NULL,
metric = ifelse(is.factor(y), "Accuracy", "RMSE"),
maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE", "logLoss"), FALSE, TRUE),
trControl = caret::trainControl(method = "cv"),
tuneGrid = NULL,
tuneLength = ifelse(trControl$method == "none", 1, 3),
return_trainobject = FALSE
)
Arguments
... |
Arguments passed to |
x |
A vector of predictor values to smooth along (e.g. time) |
y |
A vector of response values to be smoothed (e.g. density). |
sm_method |
Argument specifying which smoothing method should be used. Options include "moving-average", "moving-median", "loess", "gam", and "smooth.spline". |
preProcess |
A string vector that defines a pre-processing of the
predictor data. The default is no pre-processing.
See |
weights |
A numeric vector of case weights. This argument currently
does not affect any |
metric |
A string that specifies what summary metric will be
used to select the optimal model. By default, possible
values are "RMSE" and "Rsquared" for regression.
See |
maximize |
A logical: should the metric be maximized or minimized? |
trControl |
A list of values that define how this function acts.
See |
tuneGrid |
A data frame with possible tuning values, or a named list
containing vectors with possible tuning values. If a data
frame, the columns should be named the same as the tuning
parameters. If a list, the elements of the list should be
named the same as the tuning parameters. If a list,
|
tuneLength |
An integer denoting the amount of granularity in
the tuning parameter grid. By default, this argument
is the number of levels for each tuning parameter that
should be generated. If |
return_trainobject |
A logical indicating whether the entire result
of |
Details
See caret::train for more information.
The default method is k-fold cross-validation
(trControl = caret::trainControl(method = "cv")).
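As a rough sketch of the default behaviour, the call below compares a
few moving-average window widths on a simulated growth curve, leaving
trControl at its default (k-fold cross-validation). The simulated data
and the use of window_width_n as the tuned parameter are assumptions
made for illustration; consult the smooth_data documentation for the
parameters each method accepts. The caret package must be installed.

```r
library(gcplyr)

# Simulated noisy logistic growth curve (illustrative data only)
set.seed(1)
time <- seq(0, 24, by = 0.25)
density <- 1 / (1 + exp(-0.5 * (time - 12))) +
  rnorm(length(time), sd = 0.02)

# Assumption: window_width_n is a tunable parameter for "moving-average"
tuning <- train_smooth_data(
  x = time, y = density,
  sm_method = "moving-average",
  tuneGrid = list(window_width_n = c(1, 3, 5, 9, 13))
)

# One row per candidate window width, with cross-validated error metrics
tuning
```

Because fold assignment is random, the reported metrics will vary
between runs unless the seed is fixed.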
For less variable, but more computationally costly, cross-validation,
users may choose to increase the number of folds. This can be
done by altering the number argument in caret::trainControl, or by
setting method = "LOOCV" for leave-one-out cross-validation, where the
number of folds is equal to the number of data points.
For less variable, but more computationally costly, cross-validation,
users may alternatively choose method = "repeatedcv" for
repeated k-fold cross-validation.
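The resampling options described above might be passed as in the
following sketch. The simulated data, the candidate window widths, and
the particular fold and repeat counts are arbitrary choices for
illustration, and window_width_n is assumed to be the tuned parameter
for the moving-average method.

```r
library(gcplyr)

# Simulated noisy logistic growth curve (illustrative data only)
set.seed(1)
time <- seq(0, 24, by = 0.5)
density <- 1 / (1 + exp(-0.5 * (time - 12))) +
  rnorm(length(time), sd = 0.02)
grid <- list(window_width_n = c(1, 3, 5, 9))

# More folds: 20-fold cross-validation
more_folds <- train_smooth_data(
  x = time, y = density, sm_method = "moving-average", tuneGrid = grid,
  trControl = caret::trainControl(method = "cv", number = 20))

# Leave-one-out: as many folds as there are data points
loocv <- train_smooth_data(
  x = time, y = density, sm_method = "moving-average", tuneGrid = grid,
  trControl = caret::trainControl(method = "LOOCV"))

# Repeated k-fold: 10 folds, repeated 5 times
repeated <- train_smooth_data(
  x = time, y = density, sm_method = "moving-average", tuneGrid = grid,
  trControl = caret::trainControl(method = "repeatedcv",
                                  number = 10, repeats = 5))
```

Each call refits the smoother once per fold (and per repeat) for every
candidate value, so run time grows accordingly.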
For more control, advanced users may wish to call caret::train
directly, using makemethod_train_smooth_data to specify the method
argument.
Value
If return_trainobject = FALSE (the default), a data frame
with the values of all tuning parameter combinations and the
training error rate for each combination (i.e. the results
element of the output of caret::train).
If return_trainobject = TRUE, the output of caret::train.
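
To illustrate the two return types, the sketch below requests the full
caret::train object and then extracts its results and bestTune
elements, which are standard components of that object. The simulated
data and the window_width_n tuning parameter are assumptions made for
illustration.

```r
library(gcplyr)

# Simulated noisy logistic growth curve (illustrative data only)
set.seed(1)
time <- seq(0, 24, by = 0.25)
density <- 1 / (1 + exp(-0.5 * (time - 12))) +
  rnorm(length(time), sd = 0.02)

# Request the full caret::train object instead of the results data frame
fit <- train_smooth_data(
  x = time, y = density,
  sm_method = "moving-average",
  tuneGrid = list(window_width_n = c(1, 3, 5, 9, 13)),
  return_trainobject = TRUE
)

# The element that is returned on its own when return_trainobject = FALSE
fit$results

# The tuning value(s) caret selected under the chosen metric
fit$bestTune
```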