spFeatureSelection {spFSR}

R Documentation

SPSA-FSR for Feature Selection and Ranking

Description

This function searches for the best-performing features and ranks their importance using the simultaneous perturbation stochastic approximation (SPSA) algorithm, given a task and a wrapper. Both the task and the wrapper are defined using the mlr3 package.
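The core SPSA idea can be sketched in a few lines of base R. This is a generic, constant-gain SPSA loop on a toy problem, not the spFSR implementation: the names `spsa_minimize`, `toy_loss`, `target`, `step` and `pert` are hypothetical, and the toy loss stands in for the cross-validated wrapper error that a real run would evaluate.

```r
# Minimal SPSA sketch (illustrative only; not the spFSR implementation).
set.seed(1)

target   <- c(1, 0, 1, 0)                     # hypothetical "ideal" feature importances
toy_loss <- function(w) sum((w - target)^2)   # stand-in for a wrapper's error measure

spsa_minimize <- function(loss, w0, iters = 200, step = 0.1, pert = 0.05) {
  w <- w0
  for (k in seq_len(iters)) {
    # Perturb all components simultaneously: each entry is +1 or -1
    delta <- sample(c(-1, 1), length(w), replace = TRUE)
    # Two loss evaluations yield a gradient estimate for every component
    g_hat <- (loss(w + pert * delta) - loss(w - pert * delta)) / (2 * pert * delta)
    # Descent step, clipped so the importances stay in [0, 1]
    w <- pmin(pmax(w - step * g_hat, 0), 1)
  }
  w
}

w_hat    <- spsa_minimize(toy_loss, w0 = rep(0.5, 4))
selected <- order(w_hat, decreasing = TRUE)[1:2]   # keep the 2 highest-ranked features
```

Only two loss evaluations are needed per iteration regardless of the number of features, which is what makes the approach attractive when each evaluation is a costly resampling of a learner.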

Usage

spFeatureSelection(task, wrapper = NULL, scoring = NULL, ...)

Arguments

`task`	A `Task` object created using the mlr3 package. It must be either a `TaskClassif` or a `TaskRegr` object.
`wrapper`	A `Learner` object created using the mlr3 package. Multiple learner objects are not supported.
`scoring`	A performance measure within the mlr3 package supported by the `task`.
`...`	Additional arguments. For more details, see spFSR.default.
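As a hedged illustration of these requirements, the sketch below builds the three inputs with mlr3 and checks that they are mutually compatible. mlr3 names its task classes `TaskClassif` and `TaskRegr`. The variables `is_supported_task`, `is_single_learner` and `types_match` are hypothetical helpers for this example; they are not part of spFSR, and the checks are not necessarily the ones `spFeatureSelection` performs internally.

```r
library(mlr3)

task    <- tsk("iris")            # built-in classification task
wrapper <- lrn("classif.rpart")   # a single learner
scoring <- msr("classif.acc")     # classification accuracy

# The task must be a classification or regression task
is_supported_task <- inherits(task, "TaskClassif") || inherits(task, "TaskRegr")

# The wrapper must be one Learner object
is_single_learner <- inherits(wrapper, "Learner")

# The learner and the measure must target the same task type as the task
types_match <- identical(wrapper$task_type, task$task_type) &&
  identical(scoring$task_type, task$task_type)
```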

Value

spFeatureSelection returns an object of class "spFSR", which consists of the following:

`task.spfs`	An mlr3 package `tsk` object defined on the best performing features.
`wrapper`	An mlr3 package `lrn` object, default is random forest.
`scoring`	An mlr3 package `msr` object as specified by the user.
`best.model`	An mlr3 package `model` object trained by the `wrapper` using `task.spfs`.
`iter.results`	A `data.frame` object containing detailed information on each iteration.
`features`	Names of the best performing features.
`num.features`	The number of best performing features.
`importance`	A vector of importance ranks of the best performing features.
`total.iters`	The total number of iterations executed.
`best.iter`	The iteration where the best performing feature subset was encountered.
`best.value`	The best measure value encountered during execution.
`best.std`	The standard deviation corresponding to the best measure value encountered.
`run.time`	Total run time in minutes.
`results`	A `data.frame` containing the feature names, a boolean indicating whether each feature was selected, and the measure.
`call`	The function call.

References

David V. Akman et al. (2022) k-best feature selection and ranking via stochastic approximation, Expert Systems with Applications, Vol. 213. See doi:10.1016/j.eswa.2022.118864

G.F.A. Yeo and V. Aksakalli (2021) A stochastic approximation approach to simultaneous feature weighting and selection for nearest neighbour learners, Expert Systems with Applications, Vol. 185. See doi:10.1016/j.eswa.2021.115671

Examples

library(mlr3)          # load the mlr3 package
library(mlr3learners)  # load the mlr3learners package

task    <- tsk('iris')           # define task
wrapper <- lrn('classif.rpart')  # define wrapper
measure <- msr('classif.acc')    # define performance measure

# run SPSA-FSR
spsaMod <- spFeatureSelection( task = task,
                               wrapper = wrapper,
                               scoring = measure,
                               num.features.selected = 3,
                               n.jobs = 1,
                               iters.max = 2,
                               num.grad.avg = 1)


# obtain summary
summary(spsaMod)

# plot spsaMod
plot(spsaMod)                                # simplest plot
plot(spsaMod, errorBar = TRUE)               # plot with error bars
plot(spsaMod, errorBar = TRUE, se = TRUE)    # plot with error bars based on se
plot(spsaMod, errorBar = TRUE, annotateBest = TRUE)  # annotate best value
plot(spsaMod, errorBar = TRUE, ylab = 'Acc measure', type = 'o')

# obtain the wrapped model with the best performing features
bestMod <- getBestModel(spsaMod)

# predict using the best model
pred <- bestMod$predict( task = spsaMod$task.spfs )

# obtain confusion matrix
pred$confusion

# get the importance ranks of the best performing features
getImportance(spsaMod)
plotImportance(spsaMod)

[Package spFSR version 2.0.4 Index]