UfsCov {SFtools}    R Documentation
UfsCov algorithm for unsupervised feature selection
Description
Applies the UfsCov algorithm, which is based on the space-filling concept, using a sequential forward search (SFS).
Usage
UfsCov(data)
Arguments
data
Data of class:
Details
Because the algorithm is based on pairwise distances, a large number of
data points can take a long time and require more memory, depending on
the computing power of your machine.
See UfsCov_par for parallel computing, or UfsCov_ff for memory-efficient
storage of large data on disk with fast access (using the ff and ffbase
packages).
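To get a feel for the memory cost mentioned above, the sketch below estimates the size of the pairwise distances for n points when only the lower triangle is stored as 8-byte doubles, as stats::dist() does. Whether UfsCov stores distances in exactly this form is an assumption, and dist_size_gb is a hypothetical helper that is not part of SFtools.

```r
# Hypothetical helper (not part of SFtools): approximate size in GiB of
# the n * (n - 1) / 2 pairwise distances stored as 8-byte doubles.
# Assumption: distances are held the way stats::dist() holds them.
dist_size_gb <- function(n) {
  n_pairs <- n * (n - 1) / 2
  n_pairs * 8 / 1024^3
}

# Compare a few sample sizes; the cost grows quadratically with n.
sizes <- vapply(c(1e3, 1e4, 1e5), dist_size_gb, numeric(1))
print(round(sizes, 3))
```

When the estimate approaches the available RAM, the parallel or on-disk variants named above are likely the better choice.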
Value
A list of two elements:
- CovD: a vector containing the coverage measure at each step of the SFS.
- IdR: a vector containing the variables added during the selection procedure.
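The returned list might be read as sketched below. The sketch uses a made-up stand-in for the output of UfsCov() and assumes that the step with the lowest coverage measure marks the end of the useful feature subset (the which.min() call in the Examples points this way, but treat it as an assumption). The values and column names are invented for illustration.

```r
# Toy stand-in for the output of UfsCov(); values are made up.
res <- list(CovD = c(0.90, 0.55, 0.30, 0.42, 0.61),
            IdR  = c(3, 1, 4, 2, 5))
feature_names <- paste0("V", 1:5)  # hypothetical column names

# Step of the SFS with the lowest coverage measure
best_step <- which.min(res[[1]])

# Assumption: the features added up to that step form the selected subset
selected <- feature_names[res[[2]][seq_len(best_step)]]
print(selected)
```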
Note
The algorithm does not handle missing values or constant features. Please remove them before calling UfsCov.
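One possible way to prepare the data is sketched below in base R. clean_for_ufs is a hypothetical helper, not part of SFtools; it drops rows with missing values and then columns with a single distinct value. Dropping rows is only one option, and imputation may suit some datasets better.

```r
# Hypothetical helper (not part of SFtools): remove incomplete rows and
# constant columns so the data can be passed to UfsCov().
clean_for_ufs <- function(data) {
  data <- as.data.frame(data)
  # Keep only rows without missing values
  data <- data[stats::complete.cases(data), , drop = FALSE]
  # Flag columns that hold at most one distinct value
  is_constant <- vapply(data, function(x) length(unique(x)) <= 1, logical(1))
  data[, !is_constant, drop = FALSE]
}

# Toy data: column a has an NA, column b is constant
toy <- data.frame(a = c(1, 2, NA, 4),
                  b = c(5, 5, 5, 5),
                  c = c(0.1, 0.4, 0.2, 0.9))
cleaned <- clean_for_ufs(toy)
print(dim(cleaned))
print(colnames(cleaned))
```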
Author(s)
Mohamed Laib Mohamed.Laib@unil.ch
References
M. Laib and M. Kanevski (2017). Unsupervised Feature Selection Based on Space Filling Concept, arXiv:1706.08894.
Examples
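# Generate the Infinity example dataset (800 points) and run UfsCov
infinity <- Infinity(n = 800)
Results <- UfsCov(infinity)

# Names of the features in the order in which they were added
cou <- colnames(infinity)
nom <- cou[Results[[2]]]

# Plot the coverage measure at each step of the SFS
par(mfrow = c(1, 1), mar = c(5, 5, 2, 2))
names(Results[[1]]) <- cou[Results[[2]]]
plot(Results[[1]], pch = 16, cex = 1, col = "blue", axes = FALSE,
     xlab = "Added Features", ylab = "Coverage measure")
lines(Results[[1]], cex = 2, col = "blue")
grid(lwd = 1.5, col = "gray")
box()
axis(2)
axis(1, seq_along(nom), nom)

# Step at which the coverage measure is smallest
which.min(Results[[1]])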
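## Not run:
#### UfsCov on the Butterfly dataset ####
library(IDmining)

# Simulate the Butterfly dataset and drop its 9th column
N <- 1000
raw_dat <- Butterfly(N)
dat <- raw_dat[, -9]

Results <- UfsCov(dat)

# Names of the features in the order in which they were added
cou <- colnames(dat)
nom <- cou[Results[[2]]]

# Plot the coverage measure at each step of the SFS
par(mfrow = c(1, 1), mar = c(5, 5, 2, 2))
names(Results[[1]]) <- cou[Results[[2]]]
plot(Results[[1]], pch = 16, cex = 1, col = "blue", axes = FALSE,
     xlab = "Added Features", ylab = "Coverage measure")
lines(Results[[1]], cex = 2, col = "blue")
grid(lwd = 1.5, col = "gray")
box()
axis(2)
axis(1, seq_along(nom), nom)

# Step at which the coverage measure is smallest
which.min(Results[[1]])
## End(Not run)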