gensvm-package {gensvm} | R Documentation
GenSVM: A Generalized Multiclass Support Vector Machine
Description
The GenSVM classifier is a generalized multiclass support vector machine (SVM). This classifier aims to find decision boundaries that separate the classes with as wide a margin as possible. In GenSVM, the loss function that measures how misclassifications are counted is very flexible. This allows the user to tune the classifier to the dataset at hand and potentially obtain higher classification accuracy. Moreover, this flexibility means that GenSVM has a number of alternative multiclass SVMs as special cases. Another advantage of GenSVM is that it is trained in the primal space, which allows the use of warm starts during optimization. As a result, GenSVM can be trained very quickly for common tasks such as cross-validation or repeated model fitting.
Details
This package provides functions for training the GenSVM model either as a separate model or through a cross-validated parameter grid search. In both cases the GenSVM C library is used for speed. Auxiliary functions for evaluating and using the model are also provided.
GenSVM functions
The main GenSVM functions are:
gensvm
Fit a GenSVM model for specific model parameters.
gensvm.grid
Run a cross-validated grid search for GenSVM.
For the GenSVM and GenSVMGrid models the following two functions are available. When applied to a GenSVMGrid object, the function is applied to the best GenSVM model.
plot
Plot the low-dimensional simplex space where the decision boundaries are fixed (for problems with 3 classes).
predict
Predict the class labels of new data using the GenSVM model.
Moreover, for the GenSVM and GenSVMGrid models a coef function is defined:
coef.gensvm
Get the coefficients of the fitted GenSVM model.
coef.gensvm.grid
Get the parameter grid of the GenSVM grid search.
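As an illustration of how the functions above fit together, the following is a minimal sketch rather than a definitive workflow. It assumes the gensvm package is installed (the block is skipped otherwise), uses the built-in iris data purely as an example, and relies on the default model parameters. The variable names (fit, pred, train_acc, V) are chosen for this sketch only.

```r
# Minimal sketch: fit a GenSVM model, predict, and inspect the coefficients.
# Assumption: the gensvm package is installed; otherwise nothing is run.
if (requireNamespace("gensvm", quietly = TRUE)) {
  library(gensvm)

  x <- iris[, -5]   # four numeric features
  y <- iris[, 5]    # class labels (three classes)

  # Fit a model with the default parameters
  fit <- gensvm(x, y)

  # Predict class labels for the training data
  pred <- predict(fit, x)

  # Share of training observations that are predicted correctly
  train_acc <- mean(as.character(pred) == as.character(y))

  # Coefficients of the fitted model
  V <- coef(fit)
}
```

Because iris has three classes, calling plot(fit) on this model should also be possible.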
The following utility functions are also included:
gensvm.accuracy
Compute the accuracy score between true and predicted class labels.
gensvm.maxabs.scale
Scale each column of the dataset by its maximum absolute value, preserving sparsity and mapping the data to [-1, 1].
gensvm.train.test.split
Split a dataset into a training and testing sample.
gensvm.refit
Refit a fitted GenSVM model with slightly different parameters or on a different dataset.
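To make the descriptions of max-abs scaling and the accuracy score concrete, here is a small base R sketch of both computations. It is not the package's implementation: maxabs_scale_sketch and accuracy_sketch are hypothetical helpers written for illustration, and leaving all-zero columns unchanged is an assumption of this sketch.

```r
# Hypothetical helper: divide each column by its maximum absolute value.
# Zeros stay zero (sparsity is preserved) and all values end up in [-1, 1].
maxabs_scale_sketch <- function(x) {
  x <- as.matrix(x)
  col_max <- apply(abs(x), 2, max)
  col_max[col_max == 0] <- 1  # assumption: leave all-zero columns unchanged
  sweep(x, 2, col_max, "/")
}

# Hypothetical helper: fraction of labels that are predicted correctly.
accuracy_sketch <- function(y_true, y_pred) {
  mean(as.character(y_true) == as.character(y_pred))
}

# Small example: two columns with maximum absolute values 4 and 8
x <- matrix(c(1, -4, 0, 2, 0, 8), nrow = 3)
scaled <- maxabs_scale_sketch(x)

# Two of the three labels match
acc <- accuracy_sketch(c("a", "b", "c"), c("a", "b", "a"))
```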
Kernels in GenSVM
GenSVM can be used for both linear and nonlinear multiclass support vector machine classification. In general, linear classification will be faster, but depending on the dataset, higher classification performance can be achieved using a nonlinear kernel.
The following nonlinear kernels are implemented in the GenSVM package:
- RBF
The Radial Basis Function kernel is a well-known kernel function based on the Euclidean distance between objects. It is defined as
k(x_i, x_j) = exp( -\gamma || x_i - x_j ||^2 )
- Polynomial
A polynomial kernel can also be used in GenSVM. This kernel function is implemented very generally and therefore takes three parameters (coef, gamma, and degree). It is defined as:
k(x_i, x_j) = ( \gamma x_i' x_j + coef)^{degree}
- Sigmoid
The sigmoid kernel is the final kernel implemented in GenSVM. This kernel has two parameters (gamma and coef) and is implemented as follows:
k(x_i, x_j) = \tanh( \gamma x_i' x_j + coef)
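The three kernel definitions above can be written out directly in base R. The functions below are a sketch for checking the formulas on small vectors, not the code used by the GenSVM C library; the function names and the example values are chosen for illustration only.

```r
# RBF kernel: exp(-gamma * ||x_i - x_j||^2)
rbf_kernel <- function(xi, xj, gamma) {
  exp(-gamma * sum((xi - xj)^2))
}

# Polynomial kernel: (gamma * x_i' x_j + coef)^degree
poly_kernel <- function(xi, xj, gamma, coef, degree) {
  (gamma * sum(xi * xj) + coef)^degree
}

# Sigmoid kernel: tanh(gamma * x_i' x_j + coef)
sigmoid_kernel <- function(xi, xj, gamma, coef) {
  tanh(gamma * sum(xi * xj) + coef)
}

# Example vectors: squared distance is 8, inner product is 11
xi <- c(1, 2)
xj <- c(3, 4)

k_rbf <- rbf_kernel(xi, xj, gamma = 0.5)                        # exp(-4)
k_poly <- poly_kernel(xi, xj, gamma = 1, coef = 1, degree = 2)  # (11 + 1)^2
k_sig <- sigmoid_kernel(xi, xj, gamma = 0.1, coef = 0)          # tanh(1.1)
```

Note that the RBF kernel depends only on the distance between the two objects, whereas the polynomial and sigmoid kernels depend on their inner product.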
Author(s)
Gerrit J.J. van den Burg, Patrick J.F. Groenen
Maintainer: Gerrit J.J. van den Burg <gertjanvandenburg@gmail.com>
References
Van den Burg, G.J.J. and Groenen, P.J.F. (2016). GenSVM: A Generalized Multiclass Support Vector Machine, Journal of Machine Learning Research, 17(225):1–42. URL https://jmlr.org/papers/v17/14-526.html.