mlr_filters_correlation {mlr3filters} | R Documentation
Correlation Filter
Description
Simple correlation filter calling stats::cor().
The filter score is the absolute value of the correlation.
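As an illustration of the score described above, the following base-R sketch computes the absolute correlation of each feature with a target column directly via stats::cor(). It is not the package's implementation; the choice of the built-in mtcars data with mpg as the target is an assumption made for this example only.

```r
# Sketch only: reproduces the idea of the score with base R,
# not the internal code of FilterCorrelation.
data = datasets::mtcars

# Assumption for illustration: "mpg" plays the role of the target.
target = data$mpg
features = as.matrix(data[, setdiff(names(data), "mpg")])

# stats::cor() returns a one-column matrix (features x target);
# taking column 1 yields a named vector of correlations.
correlations = stats::cor(features, target)[, 1L]

# The score is the absolute value, so strong negative and strong
# positive correlations rank equally high.
scores = sort(abs(correlations), decreasing = TRUE)
print(scores)
```

Taking the absolute value means the sign of the relationship is discarded: a feature that is strongly negatively correlated with the target ranks as high as one that is strongly positively correlated.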
Super class
mlr3filters::Filter -> FilterCorrelation
Methods
Public methods
Inherited methods
Method new()
Create a FilterCorrelation object.
Usage
FilterCorrelation$new()
Method clone()
The objects of this class are cloneable with this method.
Usage
FilterCorrelation$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
Note
This filter, in its default settings, can handle missing values in the features. However, the resulting filter scores may be misleading or at least difficult to compare if some features have a large proportion of missing values.
If a feature has no non-missing value, the resulting score will be NA.
Missing scores appear in a random, non-deterministic order at the end of the vector of scores.
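The behaviour described in this note can be reproduced with stats::cor() alone. The sketch below assumes that missing values are handled with use = "pairwise.complete.obs"; that setting, the toy vectors, and their names are assumptions made for illustration and are not taken from this page.

```r
# Toy data (hypothetical): one feature with a single missing value,
# one feature with no non-missing value at all.
y = c(2, 4, 6, 8, 10)
features = cbind(
  x_partial = c(1, 2, NA, 4, 5),
  x_empty   = rep(NA_real_, 5)
)

# Assumption: pairwise deletion, so each feature uses only the rows
# where both the feature and the target are observed.
scores = abs(stats::cor(features, y, use = "pairwise.complete.obs")[, 1L])

# x_partial is scored on its 4 complete pairs; x_empty has no complete
# pair, so its score is NA.
print(scores)

# When ranking, NA scores can be kept at the end explicitly.
ranked = sort(scores, decreasing = TRUE, na.last = TRUE)
print(names(ranked))
```

Because each feature may be scored on a different subset of rows, scores of features with many missing values rest on fewer observations and are harder to compare with the others.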
References
For a benchmark of filter methods:
Bommert A, Sun X, Bischl B, Rahnenführer J, Lang M (2020). “Benchmark for filter methods for feature selection in high-dimensional classification data.” Computational Statistics & Data Analysis, 143, 106839. doi:10.1016/j.csda.2019.106839.
See Also
- PipeOpFilter for filter-based feature selection.
Other Filter: Filter, mlr_filters, mlr_filters_anova, mlr_filters_auc, mlr_filters_boruta, mlr_filters_carscore, mlr_filters_carsurvscore, mlr_filters_cmim, mlr_filters_disr, mlr_filters_find_correlation, mlr_filters_importance, mlr_filters_information_gain, mlr_filters_jmi, mlr_filters_jmim, mlr_filters_kruskal_test, mlr_filters_mim, mlr_filters_mrmr, mlr_filters_njmim, mlr_filters_performance, mlr_filters_permutation, mlr_filters_relief, mlr_filters_selected_features, mlr_filters_univariate_cox, mlr_filters_variance
Examples
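# Attach the packages used below: mlr3filters provides flt() and
# FilterCorrelation; mlr3 re-exports as.data.table().
library("mlr3")
library("mlr3filters")

## Pearson (default)
task = mlr3::tsk("mtcars")
filter = flt("correlation")
filter$calculate(task)
as.data.table(filter)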
## Spearman
filter = FilterCorrelation$new()
filter$param_set$values = list("method" = "spearman")
filter$calculate(task)
as.data.table(filter)
if (mlr3misc::require_namespaces(c("mlr3pipelines", "rpart"), quietly = TRUE)) {
  library("mlr3pipelines")
  task = mlr3::tsk("boston_housing")

  # Note: the value of `filter.frac` is chosen arbitrarily here and should be tuned.
  graph = po("filter", filter = flt("correlation"), filter.frac = 0.5) %>>%
    po("learner", mlr3::lrn("regr.rpart"))

  graph$train(task)
}