spFSR.default {spFSR} | R Documentation |
Default Function of SP-FSR for Feature Selection and Ranking
Description
This is the default function of spFeatureSelection. See spFeatureSelection for an example.
Usage
spFSR.default(
task,
wrapper = NULL,
scoring = NULL,
perturb.amount = 0.05,
gain.min = 0.01,
gain.max = 2,
change.min = 0,
change.max = 0.2,
bb.bottom.threshold = 10^(-8),
mon.gain.A = 100,
mon.gain.a = 0.75,
mon.gain.alpha = 0.6,
hot.start.num.ft.factor = 15,
hot.start.max.auto.num.ft = 150,
use.hot.start = TRUE,
hot.start.range = 0.2,
rf.n.estimators = 50,
gain.type = "bb",
num.features.selected = 0L,
iters.max = 100L,
stall.limit = 35L,
n.samples.max = 5000,
ft.weighting = FALSE,
encoding.type = "encode",
is.debug = FALSE,
stall.tolerance = 10^(-8),
random.state = 1,
rounding = 3,
run.parallel = TRUE,
n.jobs = NULL,
show.info = TRUE,
print.freq = 10L,
num.cv.folds = 5L,
num.cv.reps.eval = 3L,
num.cv.reps.grad = 1L,
num.grad.avg = 4L,
perf.eval.method = "cv"
)
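The gain-related defaults in the signature above (`gain.min`, `gain.max`, `bb.bottom.threshold`, and the three `mon.gain.*` values) are easier to read once plugged into a gain schedule. The sketch below is not taken from the spFSR source. It assumes the textbook stochastic-approximation form `a / (A + k + 1)^alpha` for the monotone sequence and the textbook Barzilai-Borwein ratio for the `"bb"` sequence. The helper names `mon_gain` and `bb_gain`, the fallback to `gain.min`, and the clipping step are hypothetical illustrations.

```r
# Monotone (decaying) gain in the textbook form a / (A + k + 1)^alpha.
# ASSUMPTION: spFSR may index k or combine these parameters differently.
mon_gain <- function(k, A = 100, a = 0.75, alpha = 0.6) {
  a / (A + k + 1)^alpha
}

# Barzilai-Borwein gain in its textbook form <s, s> / <s, y>, where
# s is the change in the importance vector between two iterations and
# y is the change in the estimated gradient.
# ASSUMPTION: returning gain.min when the denominator falls below
# bb.bottom.threshold, and clipping to [gain.min, gain.max], are guesses
# based on the argument descriptions, not confirmed package behaviour.
bb_gain <- function(s, y, gain.min = 0.01, gain.max = 2,
                    bb.bottom.threshold = 10^(-8)) {
  denom <- sum(s * y)
  if (denom < bb.bottom.threshold) {
    return(gain.min)
  }
  min(max(sum(s * s) / denom, gain.min), gain.max)
}

# First five monotone gains under the documented defaults.
gains <- mon_gain(0:4)

# One Barzilai-Borwein step for made-up difference vectors.
step <- bb_gain(s = c(0.1, -0.2), y = c(0.2, -0.4))
```

Under these assumptions the monotone gains shrink slowly with the iteration counter, while the Barzilai-Borwein gain adapts to the most recent change in the gradient and is kept inside the `[gain.min, gain.max]` band.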
Arguments
task |
A task |
wrapper |
A Learner |
scoring |
A performance measure |
perturb.amount |
Perturbation amount for feature importances during gradient approximation. It must be a value between 0.01 and 0.1. Default value is 0.05. |
gain.min |
The minimum gain value. It must be greater than or equal to 0.001. Default value is 0.01. |
gain.max |
The maximum gain value. It must be greater than or equal to |
change.min |
The minimum change value. It must be non-negative. Default value is 0.0. |
change.max |
The maximum change value. It must be greater than |
bb.bottom.threshold |
The threshold value of denominator for the Barzilai-Borwein gain sequence. It must be positive. Default is 1/10^8. |
mon.gain.A |
Parameter for the monotone gain sequence. It must be a positive integer. Default is 100. |
mon.gain.a |
Parameter for the monotone gain sequence. It must be positive. Default is 0.75. |
mon.gain.alpha |
Parameter for the monotone gain sequence. It must be in the interval (0, 1). Default is 0.6. |
hot.start.num.ft.factor |
The factor of features to select for hot start. Must be an integer greater than 1. Default is 15. |
hot.start.max.auto.num.ft |
The maximum initial number of features for automatic hot start. Must be an integer greater than 1. Default is 150. |
use.hot.start |
Logical argument. Whether hot start should be used. Default is TRUE. |
hot.start.range |
Float, the initial range of imputations carried over from hot start. It must be in the interval (0, 1). Default is 0.2. |
rf.n.estimators |
Integer, the number of trees to use in the random forest hot start. Default is 50. |
gain.type |
The gain sequence to use. Accepted methods are 'bb' for Barzilai-Borwein or 'mon' for a monotone gain sequence. Default is 'bb'. |
num.features.selected |
Number of features selected. It must be a nonnegative integer and must not exceed the total number of features in the task. A value of 0 results in automatic feature selection. Default value is 0L. |
iters.max |
Maximum number of iterations to execute. The minimum value is 2L. Default value is 100L. |
stall.limit |
Number of iterations to stall, that is, to continue without at least |
n.samples.max |
The maximum number of samples to select from sampling. It must be a non-negative integer. Default is 5000. |
ft.weighting |
Logical argument. Include simultaneous feature weighting and selection? Default is FALSE. |
encoding.type |
Encoding method applied to factor features when feature weighting is used. Default is 'encode'. |
is.debug |
Logical argument. Print additional debug messages? Default value is FALSE. |
stall.tolerance |
Value of stall tolerance. It must be strictly positive. Default value is 1/10^8. |
random.state |
Random state used. Default is 1. |
rounding |
The number of digits to round results. It must be a positive integer. Default value is 3. |
run.parallel |
Logical argument. Perform cross-validations in parallel? Default value is TRUE. |
n.jobs |
Number of cores to use in case of a parallel run. It must be less than or equal to the total number of cores on the host machine. If set to |
show.info |
If set to |
print.freq |
Iteration information printing frequency. It must be a positive integer. Default value is 10L. |
num.cv.folds |
The number of cross-validation folds when 'cv' is selected as |
num.cv.reps.eval |
The number of cross-validation repetitions for feature subset evaluation. It must be a positive integer. Default value is 3L. |
num.cv.reps.grad |
The number of cross-validation repetitions for gradient averaging. It must be a positive integer. Default value is 1L. |
num.grad.avg |
Number of gradients to average for gradient approximation. It must be a positive integer. Default value is 4L. |
perf.eval.method |
Performance evaluation method. It must be either 'cv' for cross-validation or 'resub' for resubstitution. Default is 'cv'. |
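Several of the arguments above carry explicit range constraints. As a minimal sketch, the helper below collects violations of a few of them before a long run is started. `check_spfsr_args` is a hypothetical name and is not part of spFSR. The bounds are copied from the argument descriptions, and treating "between 0.01 and 0.1" as inclusive is an assumption.

```r
# Hypothetical pre-flight check (not part of spFSR): returns a character
# vector of violated constraints, or character(0) when all checks pass.
check_spfsr_args <- function(perturb.amount = 0.05,
                             gain.min = 0.01,
                             change.min = 0,
                             bb.bottom.threshold = 10^(-8),
                             mon.gain.alpha = 0.6,
                             gain.type = "bb",
                             iters.max = 100L,
                             stall.tolerance = 10^(-8),
                             perf.eval.method = "cv") {
  # Each rule pairs a logical test with the message reported when it fails.
  rules <- list(
    list(ok = perturb.amount >= 0.01 && perturb.amount <= 0.1,  # inclusive: assumption
         msg = "perturb.amount must be between 0.01 and 0.1"),
    list(ok = gain.min >= 0.001,
         msg = "gain.min must be greater than or equal to 0.001"),
    list(ok = change.min >= 0,
         msg = "change.min must be non-negative"),
    list(ok = bb.bottom.threshold > 0,
         msg = "bb.bottom.threshold must be positive"),
    list(ok = mon.gain.alpha > 0 && mon.gain.alpha < 1,
         msg = "mon.gain.alpha must be in (0, 1)"),
    list(ok = gain.type %in% c("bb", "mon"),
         msg = "gain.type must be 'bb' or 'mon'"),
    list(ok = iters.max >= 2,
         msg = "iters.max must be at least 2"),
    list(ok = stall.tolerance > 0,
         msg = "stall.tolerance must be strictly positive"),
    list(ok = perf.eval.method %in% c("cv", "resub"),
         msg = "perf.eval.method must be 'cv' or 'resub'")
  )
  failed <- !vapply(rules, function(r) isTRUE(r$ok), logical(1))
  vapply(rules[failed], function(r) r$msg, character(1))
}

# The documented defaults pass every check.
ok_defaults <- check_spfsr_args()

# Two deliberately invalid settings are both reported.
problems <- check_spfsr_args(perturb.amount = 0.5, gain.type = "xyz")
```

Returning the messages rather than stopping at the first failure lets a caller report every problem at once. Whether spFSR itself performs equivalent checks is not stated here.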
Value
spFSR returns an object of class "spFSR". An object of class "spFSR" consists of the following:
task.spfs |
An mlr3 package |
wrapper |
An mlr3 package |
scoring |
An mlr3 package |
best.model |
An mlr3 package |
iter.results |
A |
features |
Names of the best performing features. |
num.features |
The number of best performing features. |
importance |
A vector of importance ranks of the best performing features. |
total.iters |
The total number of iterations executed. |
best.iter |
The iteration where the best performing feature subset was encountered. |
best.value |
The best measure value encountered during execution. |
best.std |
The standard deviation corresponding to the best measure value encountered. |
run.time |
Total run time in minutes. |
results |
Data frame with a boolean indicator of the selected features, the feature names, and the measure. |
call |
Call. |
References
David V. Akman et al. (2022) k-best feature selection and ranking via stochastic approximation, Expert Systems with Applications, Vol. 213. See doi:10.1016/j.eswa.2022.118864
G.F.A Yeo and V. Aksakalli (2021) A stochastic approximation approach to simultaneous feature weighting and selection for nearest neighbour learners, Expert Systems with Applications, Vol. 185. See doi:10.1016/j.eswa.2021.115671