spFeatureSelection {spFSR}    R Documentation

SPSA-FSR for Feature Selection and Ranking

Description

This function searches for the best performing features and ranks their importance by implementing the simultaneous perturbation stochastic approximation (SPSA) algorithm, given a task and a wrapper. The task and wrapper are defined using the mlr3 package.
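To make the SPSA idea more concrete, the following is a minimal sketch of a generic SPSA minimisation loop in base R. It is not the spFSR implementation: the function name spsa_minimise, the gain constants, the toy quadratic loss, and the 0.5 selection threshold are all assumptions made for illustration. In a feature selection setting the loss would typically be a cross-validated error of the wrapper rather than the toy loss used here.

```r
# Generic SPSA sketch (hypothetical helper, not part of spFSR).
# Each iteration estimates the whole gradient from only two loss evaluations.
spsa_minimise <- function(loss, w0, iters = 300, a0 = 0.2, c0 = 0.1,
                          A = 10, alpha = 0.602, gamma = 0.101) {
  w <- w0
  for (k in seq_len(iters)) {
    a_k <- a0 / (k + A)^alpha   # decaying step size
    c_k <- c0 / k^gamma         # decaying perturbation size
    # random +1/-1 perturbation applied to all coordinates simultaneously
    delta <- sample(c(-1, 1), length(w), replace = TRUE)
    # simultaneous perturbation gradient estimate
    g_hat <- (loss(w + c_k * delta) - loss(w - c_k * delta)) / (2 * c_k * delta)
    w <- w - a_k * g_hat
  }
  w
}

set.seed(1)
target    <- c(1, 0, 1, 0)                       # toy "ideal" feature weights
quad_loss <- function(w) sum((w - target)^2)     # stand-in for a CV error
w_hat     <- spsa_minimise(quad_loss, w0 = rep(0.5, 4))
selected  <- which(w_hat >= 0.5)                 # threshold weights to pick features
```

The weights in w_hat play the role of feature importances in this sketch, and thresholding them gives the selected subset.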

Usage

spFeatureSelection(task, wrapper = NULL, scoring = NULL, ...)

Arguments

task

A task object created using the mlr3 package. It must be either a classification task (TaskClassif) or a regression task (TaskRegr).

wrapper

A Learner object created using the mlr3 package. Objects combining multiple learners are not supported.

scoring

A performance measure from the mlr3 package that is supported by the task.

...

Additional arguments. For more details, see spFSR.default.

Value

spFeatureSelection returns an object of class "spFSR", which consists of the following components:

task.spfs

An mlr3 package tsk object defined on the best performing features.

wrapper

An mlr3 package lrn object; the default is a random forest.

scoring

An mlr3 package msr object as specified by the user.

best.model

An mlr3 package model object trained by the wrapper using task.spfs.

iter.results

A data.frame object containing detailed information on each iteration.

features

Names of the best performing features.

num.features

The number of best performing features.

importance

A vector of importance ranks of the best performing features.

total.iters

The total number of iterations executed.

best.iter

The iteration where the best performing feature subset was encountered.

best.value

The best measure value encountered during execution.

best.std

The standard deviation corresponding to the best measure value encountered.

run.time

Total run time in minutes.

results

A data.frame containing a logical indicator of whether each feature was selected, the feature names, and the measure.

call

The function call that produced the object.

References

David V. Akman et al. (2022) k-best feature selection and ranking via stochastic approximation, Expert Systems with Applications, Vol. 213. See doi:10.1016/j.eswa.2022.118864

G.F.A. Yeo and V. Aksakalli (2021) A stochastic approximation approach to simultaneous feature weighting and selection for nearest neighbour learners, Expert Systems with Applications, Vol. 185. See doi:10.1016/j.eswa.2021.115671

See Also

tsk, lrn, msr and spFSR.default.

Examples

library(spFSR)         # load the spFSR package
library(mlr3)          # load the mlr3 package
library(mlr3learners)  # load the mlr3learners package

task    <- tsk('iris')           # define task
wrapper <- lrn('classif.rpart')  # define wrapper
measure <- msr('classif.acc')    # define performance measure (accuracy)

# run spsa
spsaMod <- spFeatureSelection( task = task,
                               wrapper = wrapper,
                               scoring = measure,
                               num.features.selected = 3,
                               n.jobs = 1,
                               iters.max = 2,
                               num.grad.avg = 1)


# obtain summary
summary(spsaMod)

# plot spsaMod
plot(spsaMod)                                # simplest plot
plot(spsaMod, errorBar = TRUE)               # plot with error bars
plot(spsaMod, errorBar = TRUE, se = TRUE)    # plot with error bars based on se
plot(spsaMod, errorBar = TRUE, annotateBest = TRUE)  # annotate best value
plot(spsaMod, errorBar = TRUE, ylab = 'Acc measure', type = 'o')

# obtain the wrapped model with the best performing features
bestMod <- getBestModel(spsaMod)

# predict using the best model
pred <- bestMod$predict( task = spsaMod$task.spfs )

# Obtain confusion matrix
pred$confusion

# Get the importance ranks of best performing features
getImportance(spsaMod)
plotImportance(spsaMod)



[Package spFSR version 2.0.4 Index]