rfDROP2 {rgnoisefilt}    R Documentation

Decremental Reduction Optimization Procedure for Regression

Description

Applies the rfDROP2 noise filter to a regression dataset.

Usage

## Default S3 method:
rfDROP2(x, y, k = 5, ...)

## S3 method for class 'formula'
rfDROP2(formula, data, ...)

Arguments

x

a data frame of input attributes.

y

a double vector with the output regressand of each sample.

k

an integer with the number of nearest neighbors to be used (default: 5).

...

other options to pass to the function.

formula

a formula with the output regressand and, at least, one input attribute.

data

a data frame in which to interpret the variables in the formula.
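The two interfaces describe the same data in different ways. As an illustration only (the package may process the formula differently), base R can split a data frame into the x and y expected by the default method, here using the rock dataset from the Examples section:

```r
# Illustration with base R only; not the package's internal code.
# model.frame() puts the response in the first column.
mf <- model.frame(perm ~ ., data = rock)

y <- model.response(mf)        # output regressand (perm)
x <- mf[, -1, drop = FALSE]    # remaining columns: input attributes

names(x)
is.double(y)
```

With this split, a call with formula = perm ~ . and data = rock refers to the same inputs and output as a call with x and y.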

Details

rfDROP2 evaluates how well an edited dataset S predicts the original dataset T. The filter removes an instance p only if excluding it does not increase the prediction error of its associates, that is, the instances that have p among their k nearest neighbors. This is measured by comparing the accumulated prediction error over those associates with and without p in the dataset.
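The removal rule can be sketched in base R. This is a simplified illustration, not the package's implementation: it assumes Euclidean distance on standardised inputs, the mean of the k nearest neighbors as predictor, and a single pass over the instances in their original order. The names knn_ids and drop2_sketch are hypothetical.

```r
# Simplified sketch of an error-accumulation removal rule (NOT the package code).

# Indices of the k nearest neighbours of instance i among `pool` (excluding i).
knn_ids <- function(d, i, pool, k) {
  cand <- setdiff(pool, i)
  cand[order(d[i, cand])][seq_len(min(k, length(cand)))]
}

drop2_sketch <- function(x, y, k = 5) {
  d <- as.matrix(dist(scale(x)))   # pairwise distances between all instances
  n <- nrow(d)
  keep <- seq_len(n)               # edited set S, initially the whole of T

  for (p in seq_len(n)) {
    if (length(keep) <= k + 1) break        # keep enough instances for k-NN
    without_p <- setdiff(keep, p)
    err_with <- 0
    err_without <- 0
    for (a in setdiff(seq_len(n), p)) {     # associates are drawn from T
      nn_with <- knn_ids(d, a, keep, k)
      if (!(p %in% nn_with)) next           # a is not an associate of p
      nn_without <- knn_ids(d, a, without_p, k)
      err_with <- err_with + abs(y[a] - mean(y[nn_with]))
      err_without <- err_without + abs(y[a] - mean(y[nn_without]))
    }
    # Remove p only if its associates are predicted no worse without it.
    # An instance with no associates has both sums at 0 and is removed.
    if (err_without <= err_with) keep <- without_p
  }
  list(idclean = keep, idnoise = setdiff(seq_len(n), keep))
}

res <- drop2_sketch(rock[, c("area", "peri", "shape")], rock$perm, k = 5)
str(res)
```

The sketch recomputes neighbor lists on every step for clarity; the indices it returns are not expected to match those produced by rfDROP2.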

Value

Applying the regression filter yields a reduced dataset that contains only the clean samples; samples identified as noisy (those with errors) are removed. This function returns an object of class rfdata, a list that describes the noise filtering process through the following elements:

xclean

a data frame with the input attributes of clean samples (without errors).

yclean

a double vector with the output regressand of clean samples (without errors).

numclean

an integer with the number of clean samples.

idclean

an integer vector with the indices of clean samples.

xnoise

a data frame with the input attributes of noisy samples (with errors).

ynoise

a double vector with the output regressand of noisy samples (with errors).

numnoise

an integer with the number of noisy samples.

idnoise

an integer vector with the indices of noisy samples.

filter

the full name of the noise filter used.

param

a list of the argument values.

call

the function call.

Note that objects of class rfdata support the print.rfdata, summary.rfdata and plot.rfdata methods.
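The element descriptions suggest that the clean and noisy indices partition the rows of the input data and that the counts match the lengths of the index vectors. The helper below is hypothetical (it is not part of rgnoisefilt) and checks those relations on a hand-built stand-in list, so it runs without the package; whether every rfdata object satisfies them is an assumption.

```r
# Hypothetical helper (not part of rgnoisefilt): checks that a result list is
# internally consistent with the element descriptions above.
check_partition <- function(res, x, y) {
  all_ids <- sort(c(res$idclean, res$idnoise))
  stopifnot(
    identical(as.integer(all_ids), seq_len(nrow(x))),  # each sample appears once
    res$numclean == length(res$idclean),
    res$numnoise == length(res$idnoise),
    nrow(res$xclean) == res$numclean,
    length(res$ynoise) == res$numnoise
  )
  invisible(TRUE)
}

# Hand-built stand-in for a filter result: rows 3 and 7 flagged as noisy.
x <- rock[, c("area", "peri", "shape")]
y <- rock$perm
noisy <- c(3L, 7L)
clean <- setdiff(seq_len(nrow(x)), noisy)
mock <- list(
  xclean = x[clean, ], yclean = y[clean],
  numclean = length(clean), idclean = clean,
  xnoise = x[noisy, ], ynoise = y[noisy],
  numnoise = length(noisy), idnoise = noisy
)
ok <- check_partition(mock, x, y)
```

With the package installed, the same helper could be applied to the object returned by rfDROP2 in place of mock.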

References

A. Arnaiz-González, J. Díez-Pastor, J. Rodríguez, C. García-Osorio, Instance selection for regression: Adapting DROP. Neurocomputing, 201:66-81, 2016. doi:10.1016/j.neucom.2016.04.003.

D. R. Wilson, T. R. Martinez, Instance pruning techniques. Machine Learning: Proceedings of the Fourteenth International Conference, 404-411, 1997.

See Also

rfDROP3, regRNN, regCNN, print.rfdata, summary.rfdata

Examples

# load the dataset
data(rock)

# usage of the default method
set.seed(9)
out.def <- rfDROP2(x = rock[,-ncol(rock)], y = rock[,ncol(rock)])

# show results
summary(out.def, showid = TRUE)

# usage of the method for class formula
set.seed(9)
out.frm <- rfDROP2(formula = perm ~ ., data = rock)

# check the match of noisy indices
all(out.def$idnoise == out.frm$idnoise)


[Package rgnoisefilt version 1.1.2 Index]