naiveWrapper {rFerns} | R Documentation |
Naive feature selection method utilising the rFerns shadow importance
Description
Proof-of-concept ensemble of rFerns models, built to stabilise and improve selection based on shadow importance.
It employs a super-ensemble of small rFerns forests, one per iteration (argument iterations), each built on a subspace of attributes whose size is given by the argument size.
The subspace is selected randomly, but with a higher selection probability for attributes claimed important by previous sub-models.
The final selection is the group of attributes which hold a substantial weight at the end of the procedure.
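The weighted-subspace idea described above can be illustrated with a minimal base R sketch. This is not the rFerns implementation: the scoring function score_fun, the helper toy_wrapper, the toy data and the multiplicative weight update are all hypothetical stand-ins, and the way lambda enters the update here is only an assumption made for illustration.

```r
# Illustrative sketch only; NOT the algorithm used by naiveWrapper.
# score_fun, toy_wrapper and the weight update rule are hypothetical.
set.seed(1)

# Toy data: attributes A1 and A2 carry signal, A3..A10 are pure noise
n <- 200
y <- factor(rep(c("a", "b"), each = n / 2))
x <- as.data.frame(matrix(rnorm(n * 10), n, 10))
names(x) <- sprintf("A%d", 1:10)
x$A1 <- x$A1 + 2 * (y == "b")
x$A2 <- x$A2 - 2 * (y == "b")

# Stand-in importance: absolute difference between the two class means
score_fun <- function(col, y) {
  m <- tapply(col, y, mean)
  abs(m[[1]] - m[[2]])
}

toy_wrapper <- function(x, y, iterations = 50, size = 4, lambda = 5) {
  # Every attribute starts with the same weight
  w <- setNames(rep(1, ncol(x)), names(x))
  for (i in seq_len(iterations)) {
    # Draw a subspace; attributes with a larger weight are picked more often
    sub <- sample(names(x), size, prob = w)
    # Importance of the real attributes in this subspace
    real <- vapply(x[sub], score_fun, numeric(1), y = y)
    # Shadows: permuted copies that carry no information about y
    shadow <- vapply(x[sub], function(col) score_fun(sample(col), y), numeric(1))
    # An attribute scores a hit when it beats the best shadow
    hit <- real > max(shadow)
    # Assumed update: boost the weight on a hit, shrink it on a miss
    w[sub] <- w[sub] * ifelse(hit, 1 + 1 / lambda, 1 - 1 / (2 * lambda))
  }
  # Normalise so that the weights sum to one
  w / sum(w)
}

weights <- toy_wrapper(x, y)
# Attributes holding more than a uniform share of the total weight
selected <- names(weights)[weights > 1 / length(weights)]
print(sort(weights, decreasing = TRUE))
```

In this sketch, attributes that repeatedly beat their shadows accumulate weight and are drawn into later subspaces more often, which is the feedback loop the description refers to.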
Usage
naiveWrapper(
x,
y,
iterations = 1000,
depth = 5,
ferns = 100,
size = 30,
lambda = 5,
threads = 0,
saveHistory = FALSE
)
Arguments
x |
Data frame containing attributes; must have unique names and contain only numeric, integer or (ordered) factor columns.
Factors must have fewer than 31 levels. No |
y |
A decision vector. Must be a factor of the same length as |
iterations |
Number of iterations, i.e., the number of sub-models built. |
depth |
The depth of the ferns; must be in the 1–16 range. Note that time and memory requirements scale with |
ferns |
Number of ferns to be built in each sub-model. This should be a small number, around 3-5 times |
size |
Number of attributes considered by each sub-model. |
lambda |
Lambda parameter driving the re-weighting step of the method. |
threads |
Number of parallel threads, copied to the underlying |
saveHistory |
Whether the weight history should be stored. |
Value
An object of class naiveWrapper, which is a list with the following components:
found |
Names of all selected attributes. |
weights |
Vector of weights indicating the confidence that a given feature is relevant. |
timeTaken |
Time of computation. |
weightHistory |
History of weights over all iterations, present if |
params |
Copies of algorithm parameters, |
References
Kursa MB (2017). Efficient all relevant feature selection with random ferns. In: Kryszkiewicz M., Appice A., Slezak D., Rybinski H., Skowron A., Ras Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham.
Examples
set.seed(77)
#Fetch Iris data
data(iris)
#Extend with random noise
noisyIris<-cbind(iris[,-5],apply(iris[,-5],2,sample))
names(noisyIris)[5:8]<-sprintf("Nonsense%d",1:4)
#Execute selection
naiveWrapper(noisyIris,iris$Species,iterations=50,ferns=20,size=8)
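The result of the call above can also be stored and inspected through the components listed under Value. The following is a sketch that assumes the rFerns package is installed (it is skipped otherwise); the variable name sel is arbitrary, and the exact shape of weightHistory is not documented here, so it is only examined with str().

```r
# Assumes rFerns is installed; the block does nothing otherwise
if (requireNamespace("rFerns", quietly = TRUE)) {
  set.seed(77)
  data(iris)
  # Same noisy data as in the example above
  noisyIris <- cbind(iris[, -5], apply(iris[, -5], 2, sample))
  names(noisyIris)[5:8] <- sprintf("Nonsense%d", 1:4)
  # Keep the result and ask for the weight history as well
  sel <- rFerns::naiveWrapper(noisyIris, iris$Species,
                              iterations = 50, ferns = 20, size = 8,
                              saveHistory = TRUE)
  print(sel$found)                           # names of selected attributes
  print(sort(sel$weights, decreasing = TRUE)) # final weights, largest first
  str(sel$weightHistory)                     # structure of the stored history
  print(sel$timeTaken)                       # computation time
}
```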