cv.irsvm_fit {mpath}    R Documentation

Internal function of cross-validation for irsvm

Description

Internal function to conduct k-fold cross-validation for irsvm

Usage

cv.irsvm_fit(x, y, weights, cfun="ccave", s=c(1, 5), type=NULL, 
             kernel="radial", gamma=2^(-4:10), cost=2^(-4:4), 
             epsilon=0.1, balance=TRUE, nfolds=10, foldid, 
             trim_ratio=0.9, n.cores=2, ...)

Arguments

x

a data matrix, a vector, or a sparse 'design matrix' (object of class Matrix provided by the Matrix package, or of class matrix.csr provided by the SparseM package, or of class simple_triplet_matrix provided by the slam package).

y

a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).

weights

the weight of each subject. It should have the same length as y.

cfun

character, type of convex cap (concave) function.
Valid options are:

  • "hcave"

  • "acave"

  • "bcave"

  • "ccave"

  • "dcave"

  • "ecave"

  • "gcave"

  • "tcave"

s

tuning parameter of cfun. s must be positive, except that s = 0 is also allowed for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the calculated weights can become 0 for all observations and crash the program.

type

irsvm can be used as a classification machine or as a regression machine. Depending on whether y is a factor or not, the default setting for type is C-classification or eps-regression, respectively, but it may be overridden by setting an explicit value.
Valid options are:

  • C-classification

  • nu-classification

  • eps-regression

  • nu-regression

kernel, gamma

the kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type.

linear:

u'v

polynomial:

(\gamma u'v + coef0)^{degree}

radial basis:

e^(-\gamma |u-v|^2)

sigmoid:

tanh(\gamma u'v + coef0)
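
As an illustration of the four kernel formulas above, here is a minimal base R sketch that evaluates them for two numeric vectors. The helper names (linear_kernel, poly_kernel, radial_kernel, sigmoid_kernel) and the example values of gamma, coef0 and degree are assumptions made for this sketch; they are not functions or defaults of mpath.

```r
# Illustrative kernel functions for two numeric vectors u and v.
# Helper names and parameter values are hypothetical, chosen for this sketch only.
linear_kernel  <- function(u, v) sum(u * v)
poly_kernel    <- function(u, v, gamma, coef0 = 0, degree = 3) {
  (gamma * sum(u * v) + coef0)^degree
}
radial_kernel  <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))
sigmoid_kernel <- function(u, v, gamma, coef0 = 0) tanh(gamma * sum(u * v) + coef0)

u <- c(1, 2, 3)
v <- c(0, 1, 0)

linear_value  <- linear_kernel(u, v)                   # u'v
poly_value    <- poly_kernel(u, v, gamma = 0.5)        # (gamma u'v + coef0)^degree
radial_value  <- radial_kernel(u, v, gamma = 2^(-4))   # exp(-gamma |u - v|^2)
sigmoid_value <- sigmoid_kernel(u, v, gamma = 0.5)     # tanh(gamma u'v + coef0)
```

Larger values of gamma make the radial kernel decay faster with the distance between u and v, which is why gamma is searched over a wide range on a log scale.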

cost

cost of constraints violation. In the Usage above the default is 2^(-4:4), a set of candidate values for cross-validation. It is the 'C'-constant of the regularization term in the Lagrange formulation, and it is proportional to the inverse of lambda in irglmreg.

epsilon

epsilon in the insensitive-loss function (default: 0.1)

balance

for type="C-classification", "nu-classification" only

nfolds

number of folds, at least 3. Default is 10.

foldid

an optional vector of values between 1 and nfolds identifying which fold each observation is in. If supplied, nfolds can be missing and will be ignored.
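
One way to build such a fold vector by hand is sketched below in base R. The sample size n and the seed are hypothetical values for illustration, and this is not necessarily how the package assigns folds internally.

```r
set.seed(1)     # fixed seed so the fold assignment is reproducible
n <- 25         # hypothetical number of observations
nfolds <- 10

# Repeat the fold labels 1..nfolds until there are n of them, then shuffle.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

fold_sizes <- table(foldid)   # fold sizes differ by at most one
```

Supplying the same foldid across several runs keeps the fold assignment fixed, so differences in cross-validated error are not caused by different random splits.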

trim_ratio

a number between 0 and 1 for trimmed least squares, useful if type="eps-regression" or "nu-regression".
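
The general idea of a trimmed least squares criterion can be sketched as follows: only the smallest fraction trim_ratio of the squared residuals contributes to the error, so a few gross outliers do not dominate it. The helper trimmed_mse and the data are hypothetical, and the exact trimming rule used inside the package is an assumption here.

```r
# Hypothetical helper: mean of the smallest trim_ratio fraction of squared residuals.
trimmed_mse <- function(y, yhat, trim_ratio = 0.9) {
  sq_res <- sort((y - yhat)^2)                    # squared residuals, ascending
  keep <- ceiling(trim_ratio * length(sq_res))    # number of residuals retained
  mean(sq_res[seq_len(keep)])
}

y    <- c(1, 2, 3, 4, 100)        # the last response is an outlier
yhat <- c(1.1, 1.9, 3.2, 3.8, 5)  # made-up predictions

untrimmed <- mean((y - yhat)^2)                    # dominated by the outlier
trimmed   <- trimmed_mse(y, yhat, trim_ratio = 0.8)  # the largest residual is dropped
```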

n.cores

The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores.

...

Other arguments that can be passed to irsvm.

Details

This function is the driving force behind cv.irsvm. It performs K-fold cross-validation to determine the optimal SVM tuning parameters: cost, and also gamma if the kernel is nonlinear. It can also choose the value of s used in cfun.
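
To give a sense of the size of the search, the sketch below enumerates the default candidate values from the Usage section with base R. It assumes that every combination of s, gamma and cost is evaluated, which is a simplification for illustration and not a statement about the package's internal loop.

```r
# Default candidate values, copied from the Usage section.
s_grid     <- c(1, 5)
gamma_grid <- 2^(-4:10)
cost_grid  <- 2^(-4:4)

# Nonlinear kernel: all combinations of s, gamma and cost.
grid_nonlinear <- expand.grid(s = s_grid, gamma = gamma_grid, cost = cost_grid)

# Linear kernel: gamma is not tuned, so only s and cost vary.
grid_linear <- expand.grid(s = s_grid, cost = cost_grid)

n_nonlinear <- nrow(grid_nonlinear)   # 2 * 15 * 9 combinations
n_linear    <- nrow(grid_linear)      # 2 * 9 combinations
```

Under this assumption, a nonlinear kernel with the defaults means 270 parameter combinations, each fitted once per fold, so narrowing the gamma or cost candidates is a direct way to reduce run time.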

Value

an object of class "cv.irsvm" is returned, which is a list with the ingredients of the cross-validation fit.

residmat

a matrix whose row values for kernel="linear" are s, cost, error, k, where k is the number of cross-validation fold. For nonlinear kernels, the row values are s, gamma, cost, error, k.

cost

the value of cost that gives the minimum cross-validated error in irsvm.

gamma

the value of gamma that gives the minimum cross-validated error in irsvm.

s

the value of s for cfun that gives the minimum cross-validated error in irsvm.

Author(s)

Zhu Wang <zwang145@uthsc.edu>

References

Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.

See Also

cv.irsvm and irsvm


[Package mpath version 0.4-2.26 Index]