linearity {ECoL} R Documentation

## Measures of linearity

### Description

The linearity measures quantify whether the labels can be separated by a hyperplane or linear function. The underlying assumption is that a linearly separable problem is simpler than one requiring a non-linear decision boundary.

### Usage

linearity(...)

## Default S3 method:
linearity(x, y, measures = "all", summary = c("mean", "sd"), ...)

## S3 method for class 'formula'
linearity(formula, data, measures = "all", summary = c("mean", "sd"), ...)


### Arguments

...

Not used.

x

A data.frame containing only the input attributes.

y

A response vector with one value for each row/component of x.

measures

A list of measure names, or "all" to include all of them.

summary

A list of summarization functions, or empty to return all values. See the summarization method for more information. (Default: c("mean", "sd"))

formula

A formula that defines the output column.

data

A data.frame containing the input attributes and the class.

### Details

The following classification measures are allowed for this method:

"L1"

Sum of the error distance by linear programming (L1) computes the sum of the distances of incorrectly classified examples to a linear boundary used in their classification.

"L2"

Error rate of linear classifier (L2) computes the error rate of a linear SVM classifier induced from the dataset.

"L3"

Non-linearity of a linear classifier (L3) creates a new dataset by randomly interpolating pairs of training examples of the same class, then induces a linear SVM on the original data and measures its error rate on the new data points.
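The interpolation step described for L3 can be sketched in base R. This is only an illustration under stated assumptions, not the package's own code: `interpolate_same_class()` is a hypothetical helper, and the choice of a uniform random mixing weight and a random same-class partner (possibly the point itself) is an assumption.

```r
# Hypothetical helper: build one synthetic point per training example by
# interpolating it with a randomly chosen example of the same class.
interpolate_same_class <- function(x, y) {
  x <- as.matrix(x)
  new_x <- matrix(NA_real_, nrow = nrow(x), ncol = ncol(x),
                  dimnames = list(NULL, colnames(x)))
  for (i in seq_len(nrow(x))) {
    # Candidate partners: all examples sharing the label of example i.
    same <- which(y == y[i])
    j <- same[sample.int(length(same), 1)]
    # Random convex combination of the two examples (assumed uniform weight).
    alpha <- runif(1)
    new_x[i, ] <- alpha * x[i, ] + (1 - alpha) * x[j, ]
  }
  list(x = as.data.frame(new_x), y = y)
}

set.seed(1)
synthetic <- interpolate_same_class(iris[, 1:4], iris$Species)
```

A linear classifier trained on the original data would then be evaluated on `synthetic$x` against `synthetic$y`. The error rate on these points is the quantity L3 reports.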

The following regression measures are allowed for this method:

"L1"

Mean absolute error (L1) averages the absolute values of the residuals of a multiple linear regressor.

"L2"

Residuals variance (L2) averages the square of the residuals from a multiple linear regression.

"L3"

Non-linearity of a linear regressor (L3) measures how sensitive the regressor is to the new randomly interpolated points.
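The regression L1 and L2 definitions above can be illustrated with `lm()` from base R. This is a minimal sketch of the idea, not the package's implementation: it skips any preprocessing or normalisation that ECoL may apply, so the numbers need not match the output of `linearity()`. The names `l1_sketch` and `l2_sketch` are hypothetical.

```r
# Fit a multiple linear regressor on the built-in cars data.
data(cars)
fit <- lm(speed ~ ., data = cars)
res <- residuals(fit)

# L1 idea: average of the absolute residuals.
l1_sketch <- mean(abs(res))

# L2 idea: average of the squared residuals.
l2_sketch <- mean(res^2)
```

Because a least-squares fit with an intercept has residuals that average to zero, the mean of the squared residuals equals their variance (with denominator n), which is why L2 is called the residuals variance.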

### Value

A list named by the requested linearity measure.
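A short sketch of how the returned list might be inspected, using the default method from the Usage section. The call is guarded so that it does nothing when ECoL is not installed. The exact shape of each element depends on the `summary` functions requested, so the comments below are assumptions rather than guaranteed output.

```r
# Run only if the ECoL package is available.
if (requireNamespace("ECoL", quietly = TRUE)) {
  res <- ECoL::linearity(iris[, 1:4], iris$Species,
                         measures = c("L1", "L2"),
                         summary = c("mean", "sd"))
  print(names(res))  # expected: one entry per requested measure
  print(res$L1)      # summarized values for L1
}
```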

### References

Albert Orriols-Puig, Nuria Macia and Tin K Ho. (2010). Documentation for the data complexity library in C++. Technical Report. La Salle - Universitat Ramon Llull.

### See Also

Other complexity-measures: balance, correlation, dimensionality, neighborhood, network, overlapping, smoothness

### Examples

## Extract all linearity measures for a classification task
data(iris)
linearity(Species ~ ., iris)

## Extract all linearity measures for a regression task
data(cars)
linearity(speed ~ ., cars)


[Package ECoL version 0.3.0 Index]