R: Random Machines

randomMachines {randomMachines}

R Documentation

Random Machines

Description

Random Machines is an ensemble model which uses the combination of different kernel functions to improve the diversity in the bagging approach, improving the predictions in general. Random Machines was developed for classification and regression problems by bagging multiple kernel functions in support vector models.

Random Machines uses SVMs (Cortes and Vapnik, 1995) as base learners in the bagging procedure with a random sample of kernel functions to build them.

Let a training sample given by (\boldsymbol{x_{i}},y_i) with i=1,\dots, n observations, where \boldsymbol{x_{i}} is the vector of independent variables and y_{i} the dependent one. The kernel bagging method initializes by training of the r single learner, where r=1,\dots,R and R is the total number of different kernel functions that could be used in support vector models. In this implementation the default value is R=4 (gaussian, polynomial, laplacian and linear). See more details below.

Each single learner is internally validated and the weights \lambda_{r} are calculated proportionally to the strength from the single predictive performance.

Afterwards, B bootstrap samples are sampled from the training set. A support vector machine model g_{b} is trained for each bootstrap sample, b=i,\dots,B and the kernel function that will be used for g_{b} will be determined by a random choice with probability \lambda_{r}. The final weight w_b in the bagging procedure is calculated by out-of-bag samples.

The final model G(\boldsymbol{x}_i) for a new \boldsymbol{x}_i is given by,

The weights \lambda_{r} and w_b are different calculated for each task (classification, probabilistic classification and regression). See more details in the references.

For a binary classification problem \mathbin{{ G(\boldsymbol{x_{i}})= \text{sgn} \left( \sum_{b=1}^{B}w_{b}g_{b}(\boldsymbol{x_{i}})\right)}}, where g_b are single binary classification outputs;
For a probabilistic binary classification problem \mathbin{{ G(\boldsymbol{x_{i}})= \sum_{b=1}^{B}w_{b}g_{b}(\boldsymbol{x_{i}})}}, where g_b are single probabilistic classification outputs;
For a regression problem G(\boldsymbol{x_{i}})= \sum_{b=1}^{B}w_{b}g_{b}(\boldsymbol{x_{i}}), , where g_b are single regression outputs.

Usage

randomMachines(
     formula,
     train,validation,
     B = 25, cost = 1,
     automatic_tuning = FALSE,
     gamma_rbf = 1,
     gamma_lap = 1,
     degree = 2,
     poly_scale = 1,
     offset = 0,
     gamma_cau = 1,
     d_t = 2,
     kernels = c("rbfdot", "polydot", "laplacedot", "vanilladot"),
     prob_model = TRUE,
     loss_function = RMSE,
     epsilon = 0.1,
     beta = 2
)

Arguments

`formula`	an object of class `formula`: it should contain a symbolic description of the model to be fitted, indicating the dependent variable and all predictors that should be included.
`train`	the training data `\left\{\left( \mathbf{x}_{i},y_{i} \right)\right\}_{i=1}^{n}` used to train the model.
`validation`	the validation data `\left\{\left( \mathbf{x}_{i},y_{i}\right) \right\}_{i=1}^{V}` used to calculate probabilities `\lambda_{r}`. If `validation = NULL`,the validation set is going be selected as 0.25 partition from the training data, and the remaining partition is selected as the new training sample.
`B`	number of bootstrap samples. The default value is `B=25`.
`cost`	the `C`-constant term of the regularization on soft margins at support vector models. The default value is `cost=1`.
`automatic_tuning`	boolean to define if the kernel hyperparameters will be selected using the `sigest` from the `ksvm` function. The default value is `FALSE`.
`gamma_rbf`	the hyperparameter `\gamma_{g}` used in the RBF kernel. The default value is `gamma_rbf=1`.
`gamma_lap`	the hyperparameter `\gamma_{l}` used in the Laplacian kernel. The default value is `gamma_lap=1`.
`degree`	the degree used in the Polynomial kernel. The default value is `degree=2`.
`poly_scale`	the scale parameter from the Polynomial kernel. The default value is `poly_scale=1`.
`offset`	the offset parameter from the Polynomial kernel. The default value is `offset=0`.
`gamma_cau`	the hyperparameter `\gamma_{c}` used in the Cauchy kernel. The default value is `gamma_cau=1`.
`d_t`	the `d_{t}`-norm from the t-Student kernel. The default value is `d_t=2`.
`kernels`	a vector with the name of kernel functions that will be used in the Random Machines model. The default include the kernel functions: `c("rbfdot", "polydot", "laplacedot", "vanilladot").` The other kernel functions as `"cauchydot"` and `"tdot"` are exclusive to the binary classification setting.
`prob_model`	a boolean to define if the algorithm will be using a probabilistic approach to the define the predictions (default = `TRUE`).
`loss_function`	Define which loss function is going to be used in the regression approach. The default is the `RMSE` function but others can be added.
`epsilon`	The epsilon in the loss function used from the SVR implementation. The default value is `epsilon=0.1`.
`beta`	The correlation parameter `\beta` which calibrates the penalisation of each kernel performance in regression tasks. The default value is `beta=2`.

Details

The Random Machines is an ensemble method which combines the bagging procedure proposed by Breiman (1996), using Support Vector Machine models as base learners jointly with a random selection of kernel functions that add diversity to the ensemble without harming its predictive performance. The kernel functions k(x,y) are described by the functions below,

Linear Kernel: k(x,y) = (x\cdot y)
Polynomial Kernel: k(x,y) = \left(scale( x\cdot y) + offset\right)^{degree}
Gaussian Kernel: k(x,y) = e^{-\gamma_{g}||x-y||^2}
Laplacian Kernel: k(x,y) = e^{-\gamma_{\ell}||x-y||}
Cauchy Kernel: k(x,y) = \frac{1}{1 + \frac{||x-y||^{2}}{\gamma_{c}}}
Student's t Kernel: k(x,y) = \frac{1}{1 + ||x-y||^{d_{t}}}

Value

randomMachines() returns an object of class "rm_class" for classification tasks or "rm_reg" for if the target variable is a continuous numerical response. See predict.rm_class or predict.rm_reg for more details of how to obtain predictions from each model respectively.

Author(s)

Mateus Maia: mateusmaia11@gmail.com, Gabriel Felipe Ribeiro: brielribeiro08@gmail.com, Anderson Ara: ara@ufpr.br

References

Ara, Anderson, et al. "Regression random machines: An ensemble support vector regression model with free kernel choice." Expert Systems with Applications 202 (2022): 117107.

Ara, Anderson, et al. "Random machines: A bagged-weighted support vector model with free kernel choice." Journal of Data Science 19.3 (2021): 409-428.

Breiman, L. (1996). Bagging predictors. Machine learning, 24, 123-140.

Cortes, C., and Vapnik, V. (1995). Support-vector networks. Machine learning, 20, 273-297.

Maia, Mateus, Arthur R. Azevedo, and Anderson Ara. "Predictive comparison between random machines and random forests." Journal of Data Science 19.4 (2021): 593-614.

Examples

library(randomMachines)

# Simulation from a binary output context
sim_data <- sim_class(n = 75)

## Setting the training and validation set
sim_new <- sim_class(n = 75)

# Modelling Random Machines (probabilistic output)
rm_mod_prob <- randomMachines(y~., train = sim_data)

## Modelling Random Machines (binary class output)
rm_mod_label <- randomMachines(y~., train = sim_data,prob_model = FALSE)

## Predicting for new data
y_hat <- predict(rm_mod_label,sim_new)

[Package randomMachines version 0.1.0 Index]