cuda_ml_sgd {cuda.ml} | R Documentation |
Train an MBSGD linear model.
Description
Train a linear model using mini-batch stochastic gradient descent.
Usage
cuda_ml_sgd(x, ...)
## Default S3 method:
cuda_ml_sgd(x, ...)
## S3 method for class 'data.frame'
cuda_ml_sgd(
x,
y,
fit_intercept = TRUE,
loss = c("squared_loss", "log", "hinge"),
penalty = c("none", "l1", "l2", "elasticnet"),
alpha = 1e-04,
l1_ratio = 0.5,
epochs = 1000L,
tol = 0.001,
shuffle = TRUE,
learning_rate = c("constant", "invscaling", "adaptive"),
eta0 = 0.001,
power_t = 0.5,
batch_size = 32L,
n_iters_no_change = 5L,
...
)
## S3 method for class 'matrix'
cuda_ml_sgd(
x,
y,
fit_intercept = TRUE,
loss = c("squared_loss", "log", "hinge"),
penalty = c("none", "l1", "l2", "elasticnet"),
alpha = 1e-04,
l1_ratio = 0.5,
epochs = 1000L,
tol = 0.001,
shuffle = TRUE,
learning_rate = c("constant", "invscaling", "adaptive"),
eta0 = 0.001,
power_t = 0.5,
batch_size = 32L,
n_iters_no_change = 5L,
...
)
## S3 method for class 'formula'
cuda_ml_sgd(
formula,
data,
fit_intercept = TRUE,
loss = c("squared_loss", "log", "hinge"),
penalty = c("none", "l1", "l2", "elasticnet"),
alpha = 1e-04,
l1_ratio = 0.5,
epochs = 1000L,
tol = 0.001,
shuffle = TRUE,
learning_rate = c("constant", "invscaling", "adaptive"),
eta0 = 0.001,
power_t = 0.5,
batch_size = 32L,
n_iters_no_change = 5L,
...
)
## S3 method for class 'recipe'
cuda_ml_sgd(
x,
data,
fit_intercept = TRUE,
loss = c("squared_loss", "log", "hinge"),
penalty = c("none", "l1", "l2", "elasticnet"),
alpha = 1e-04,
l1_ratio = 0.5,
epochs = 1000L,
tol = 0.001,
shuffle = TRUE,
learning_rate = c("constant", "invscaling", "adaptive"),
eta0 = 0.001,
power_t = 0.5,
batch_size = 32L,
n_iters_no_change = 5L,
...
)
Arguments
x |
Depending on the context:
* A __data frame__ of predictors.
* A __matrix__ of predictors.
* A __recipe__ specifying a set of preprocessing steps created from [recipes::recipe()].
* A __formula__ specifying the predictors and the outcome. |
... |
Optional arguments; currently unused. |
y |
A numeric vector (for regression) or factor (for classification) of desired responses. |
fit_intercept |
If TRUE, then the model tries to correct for the global mean of the response variable. If FALSE, then the model expects data to be centered. Default: TRUE. |
loss |
Loss function, must be one of "squared_loss", "log", "hinge". |
penalty |
Type of regularization to perform, must be one of "none", "l1", "l2", "elasticnet".
- "none": no regularization.
- "l1": perform regularization based on the L1 norm (LASSO), which tries to minimize the sum of the absolute values of the coefficients.
- "l2": perform regularization based on the L2 norm (Ridge), which tries to minimize the sum of the squares of the coefficients.
- "elasticnet": perform Elastic Net regularization, which is based on a weighted average of the L1 and L2 norms.
Default: "none". |
alpha |
Multiplier of the penalty term. Default: 1e-4. |
l1_ratio |
The ElasticNet mixing parameter, with 0 <= l1_ratio <= 1.
For l1_ratio = 0 the penalty is an L2 penalty. For l1_ratio = 1 it is an L1
penalty.
For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
The penalty term is computed using the following formula:
penalty = |
epochs |
The number of times the model should iterate through the entire dataset during training. Default: 1000L. |
tol |
Threshold for stopping training. Training will stop if
(loss in current epoch) > (loss in previous epoch) - |
shuffle |
Whether to shuffle the training data after each epoch. Default: TRUE. |
learning_rate |
Must be one of "constant", "invscaling", "adaptive". - "constant": the learning rate will be kept constant.
- "invscaling": (learning rate) = (initial learning rate) / pow(t, power_t)
where |
eta0 |
The initial learning rate. Default: 1e-3. |
power_t |
The exponent used in the "invscaling" learning rate calculation. Default: 0.5. |
batch_size |
The number of samples that will be included in each batch. Default: 32L. |
n_iters_no_change |
The maximum number of epochs to train if there is no improvement in the model. Default: 5L. |
formula |
A formula specifying the outcome terms on the left-hand side, and the predictor terms on the right-hand side. |
data |
When a __recipe__ or __formula__ is used, |
Value
A linear model that can be used with the 'predict' S3 generic to make predictions on new data points.
Examples
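library(cuda.ml)

# Fit an L2-penalized linear model of mpg on all other mtcars columns.
model <- cuda_ml_sgd(
  mpg ~ ., mtcars,
  batch_size = 4L, epochs = 50000L,
  learning_rate = "adaptive", eta0 = 1e-5,
  penalty = "l2", alpha = 1e-5, tol = 1e-6,
  n_iters_no_change = 10L
)

# Predict on the training predictors (every column except mpg).
preds <- predict(model, mtcars[names(mtcars) != "mpg"])

# Report whether the predictions match the observed mpg within the tolerance.
print(all.equal(preds$.pred, mtcars$mpg, tolerance = 0.09))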
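
The "invscaling" rule listed under learning_rate divides the initial learning rate by pow(t, power_t). The base R sketch below evaluates that rule for a few values of t. Here invscaling_rate is a hypothetical helper, not part of cuda.ml, and treating t as a step counter that starts at 1 is an assumption, since this page does not define t.

```r
# Hypothetical helper (not part of cuda.ml): evaluates the "invscaling"
# schedule, (initial learning rate) / pow(t, power_t), for each value of t.
# Assumption: t is a step counter that starts at 1.
invscaling_rate <- function(t, eta0 = 0.001, power_t = 0.5) {
  stopifnot(all(t >= 1), eta0 > 0)
  eta0 / t^power_t
}

# Evaluate the schedule with the documented defaults for eta0 and power_t.
steps <- c(1, 4, 100)
rates <- invscaling_rate(steps)
print(data.frame(t = steps, learning_rate = rates))
```

With the default eta0 = 0.001 and power_t = 0.5, the rate is half of eta0 at t = 4 and a tenth of eta0 at t = 100.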