smoothness {ECoL}    R Documentation

Measures of smoothness

Description

Smoothness measures for regression tasks. In regression problems, the smoother the function to be fitted to the data, the simpler the problem tends to be. Larger variations in the inputs and/or outputs, on the other hand, usually indicate more intricate relationships between them.

Usage

smoothness(...)

## Default S3 method:
smoothness(x, y, measures = "all",
  summary = c("mean", "sd"), ...)

## S3 method for class 'formula'
smoothness(formula, data, measures = "all",
  summary = c("mean", "sd"), ...)

Arguments

...

Not used.

x

A data.frame containing only the input attributes.

y

A response vector with one value for each row/component of x.

measures

A list of measure names, or "all" to include all of them.

summary

A list of summarization functions, or empty to return all values. See the summarization method for more information. (Default: c("mean", "sd"))

formula

A formula to define the output column.

data

A data.frame containing the input and output attributes.

Details

The following measures are allowed for this method:

"S1"

Output distribution (S1) checks whether examples joined by an edge in the minimum spanning tree (MST) built over the input space have similar output values. Lower values indicate simpler problems, where examples that are close in the input space also have close output values.

"S2"

Input distribution (S2) measures how close in the input space data items with similar outputs are, based on the distance between them.

"S3"

Error of a nearest neighbor regressor (S3) calculates the mean squared error of a 1-nearest neighbor regressor using leave-one-out.

"S4"

Non-linearity of nearest neighbor regressor (S4) calculates the mean squared error of a 1-nearest neighbor regressor on new, randomly interpolated points.
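The ideas behind S1 and S3 can be illustrated with a minimal base R sketch. This is not ECoL's implementation: the helper names mst_edges, s1_sketch and loo_1nn_mse are hypothetical, Euclidean distance is assumed, and any preprocessing the package may apply (such as rescaling the attributes) is left out, so the numbers need not match the output of smoothness().

```r
# Edges of a minimum spanning tree over a distance matrix (Prim's algorithm).
# Hypothetical helper for illustration only.
mst_edges <- function(d) {
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))  # start the tree at row 1
  best_dist <- d[1, ]                    # cheapest link from the tree to each node
  best_from <- rep(1L, n)                # tree node that provides that link
  edges <- matrix(0L, nrow = n - 1, ncol = 2)
  for (k in seq_len(n - 1)) {
    cand <- which(!in_tree)
    j <- cand[which.min(best_dist[cand])]  # closest node not yet in the tree
    edges[k, ] <- c(best_from[j], j)
    in_tree[j] <- TRUE
    closer <- !in_tree & d[j, ] < best_dist  # links improved by adding node j
    best_dist[closer] <- d[j, closer]
    best_from[closer] <- j
  }
  edges
}

# S1-style value: mean absolute output difference across MST edges (assumed form).
s1_sketch <- function(x, y) {
  e <- mst_edges(as.matrix(dist(x)))
  mean(abs(y[e[, 1]] - y[e[, 2]]))
}

# S3-style value: leave-one-out mean squared error of a 1-nearest neighbor regressor.
loo_1nn_mse <- function(x, y) {
  d <- as.matrix(dist(x))       # pairwise Euclidean distances between rows
  diag(d) <- Inf                # an example may not be its own neighbor
  nn <- apply(d, 1, which.min)  # index of each example's nearest neighbor
  mean((y[nn] - y)^2)
}

# Toy data: three evenly spaced points and one distant point.
x <- data.frame(a = c(0, 1, 2, 10))
y <- c(0, 1, 2, 10)
s1_sketch(x, y)    # 10/3: the MST edges have output gaps 1, 1 and 8
loo_1nn_mse(x, y)  # 16.75: squared errors 1, 1, 1 and 64
```

In the toy data the distant fourth point dominates both values, which matches the reading that lower values correspond to smoother, simpler problems.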

Value

A list with one element per requested smoothness measure, named after the measure.

References

Ana C Lorena, Aron I Maciel, Pericles B C Miranda, Ivan G Costa and Ricardo B C Prudencio. (2018). Data complexity meta-features for regression problems. Machine Learning, 107(1), 209–246.

See Also

Other complexity-measures: balance, correlation, dimensionality, linearity, neighborhood, network, overlapping

Examples

## Extract all smoothness measures for a regression task
data(cars)
smoothness(speed ~ ., cars)

[Package ECoL version 0.3.0 Index]