UfsCov_par {SFtools} | R Documentation |
UfsCov algorithm for unsupervised feature selection
Description
Applies the UfsCov algorithm, based on the space-filling concept, using a sequential forward search (SFS). This function supports parallel computing.
Usage
UfsCov_par(data, ncores=2)
Arguments
data |
Data of class: |
ncores |
Number of cores to use (by default: |
Details
Because the algorithm is based on pairwise distances, a large number of
data points requires more memory; how many points can be handled depends
on the computing power of your machine. See UfsCov_ff for memory-efficient
storage of large data on disk with fast access (using the ff and ffbase
packages).
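To get a feel for how memory grows with the number of points, the sketch below estimates the size of a pairwise distance structure. It assumes the distances are held as double-precision values in a lower triangle (as base R's `dist` does); the helper `dist_memory_mb` is hypothetical and not part of SFtools, and the actual memory footprint of UfsCov_par may differ.

```r
# Hypothetical helper (not part of SFtools): approximate size in MB of the
# n * (n - 1) / 2 pairwise distances, assuming 8 bytes per distance.
dist_memory_mb <- function(n) {
  n * (n - 1) / 2 * 8 / 1024^2
}

# Memory grows quadratically with the number of points
sizes <- dist_memory_mb(c(800, 5000, 50000))
print(round(sizes, 1))
```

If the estimate approaches the available RAM, the disk-backed UfsCov_ff variant is probably the safer choice.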
Value
A list of two elements:
- CovD: a vector containing the coverage measure of each step of the SFS.
- IdR: a vector containing the added variables during the selection procedure.
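As an illustration of how the two elements fit together, the sketch below works on a mock list with made-up values, so it runs without SFtools. It assumes, as the `which.min` call in the Examples suggests, that the step with the smallest coverage measure is the one of interest; the feature names are hypothetical.

```r
# Mock result with the two-element shape described above (values are made up)
res <- list(
  CovD = c(0.9, 0.5, 0.3, 0.45, 0.7),  # coverage measure at each SFS step
  IdR  = c(3, 1, 4, 2, 5)              # column indices in the order they were added
)
feature_names <- paste0("x", 1:5)      # hypothetical column names

# Step at which the coverage measure is smallest
best_step <- which.min(res[[1]])

# Features added up to and including that step
selected <- feature_names[res[[2]][seq_len(best_step)]]
print(selected)
```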
Note
The algorithm does not handle missing values or constant features; please
remove them beforehand. This function is not recommended for small data
sets, where it takes more time than the standard UfsCov function.
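One possible way to prepare the data is sketched below using base R only. The helper `clean_for_ufscov` is hypothetical and not part of SFtools; it drops rows that contain missing values and then drops columns that hold a single distinct value.

```r
# Hypothetical helper (not part of SFtools): remove incomplete rows,
# then remove constant columns.
clean_for_ufscov <- function(data) {
  data <- as.data.frame(data)
  data <- data[stats::complete.cases(data), , drop = FALSE]
  is_constant <- vapply(data, function(x) length(unique(x)) <= 1, logical(1))
  data[, !is_constant, drop = FALSE]
}

# Toy data: column a has a missing value, column b is constant
toy <- data.frame(
  a = c(1, 2, NA, 4),
  b = c(5, 5, 5, 5),
  c = c(0.1, 0.4, 0.2, 0.9)
)
cleaned <- clean_for_ufscov(toy)
print(dim(cleaned))

# The cleaned data could then be passed on (requires SFtools):
# Results <- UfsCov_par(cleaned, ncores = 2)
```

Dropping incomplete rows is only one option; imputing the missing values first would keep more data points.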
Author(s)
Mohamed Laib Mohamed.Laib@unil.ch
References
M. Laib and M. Kanevski (2017). Unsupervised Feature Selection Based on Space Filling Concept, arXiv:1706.08894.
See Also
Examples
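library(SFtools)

# Generate N points with Infinity()
N <- 800
dat <- Infinity(N)

# Run the parallel forward search on 2 cores
Results <- UfsCov_par(dat, ncores = 2)

# Feature names in the order they were added
cou <- colnames(dat)
nom <- cou[Results[[2]]]

# Plot the coverage measure at each step of the SFS
par(mfrow = c(1, 1), mar = c(5, 5, 2, 2))
names(Results[[1]]) <- nom
plot(Results[[1]], pch = 16, cex = 1, col = "blue", axes = FALSE,
     xlab = "Added Features", ylab = "Coverage measure")
lines(Results[[1]], cex = 2, col = "blue")
grid(lwd = 1.5, col = "gray")
box()
axis(2)
axis(1, 1:length(nom), nom)

# Step with the smallest coverage measure
which.min(Results[[1]])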
## Not run:
N <- 5000
dat <- Infinity(N)
## Timing comparison: sequential vs. parallel
system.time(Uf <- UfsCov(dat))
system.time(Uf.p <- UfsCov_par(dat, ncores = 4))
## End(Not run)