rBMF-package {rBMF} | R Documentation |
Boolean Matrix Factorization
Description
Provides four Boolean matrix factorization (BMF) methods. BMF has many applications, such as data mining and categorical data analysis. BMF is also known as Boolean matrix decomposition (BMD) and has been shown to be an NP-hard problem (NP stands for non-deterministic polynomial time). Currently implemented methods are 'Asso', Miettinen, Pauli and others (2008) <doi:10.1109/TKDE.2008.53>; 'GreConD', R. Belohlavek, V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; 'GreConDPlus', R. Belohlavek, V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; and 'topFiberM', A. Desouki, M. Roeder, A. Ngonga (2019) <arXiv:1903.10326>.
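As a rough illustration of what a rank-r Boolean factorization computes, the base-R sketch below builds a small binary matrix from two factor matrices using the Boolean product (an ordinary matrix product thresholded at zero) and counts the cells where the product disagrees with the data. The helper name `bool_prod` and the toy matrices `A`, `B` and `X_noisy` are assumptions made for this example; they are not part of the rBMF API.

```r
# Boolean matrix product: entry (i, j) is TRUE when some factor k has
# A[i, k] and B[k, j] both TRUE (so 1 + 1 counts as 1, not 2).
# `bool_prod` is a hypothetical helper for illustration, not an rBMF function.
bool_prod <- function(A, B) {
  ((A * 1) %*% (B * 1)) > 0
}

# Toy factors (assumed values): 4 rows, 2 factors, 3 columns.
A <- matrix(c(1, 0,
              1, 1,
              0, 1,
              0, 0), nrow = 4, byrow = TRUE) == 1
B <- matrix(c(1, 1, 0,
              0, 1, 1), nrow = 2, byrow = TRUE) == 1

X <- bool_prod(A, B)      # a matrix that two factors reproduce exactly
X_noisy <- X
X_noisy[4, 1] <- TRUE     # one extra entry that these factors cannot explain

# Reconstruction error: number of cells where the product and the data differ.
err_exact <- sum(xor(X, bool_prod(A, B)))
err_noisy <- sum(xor(X_noisy, bool_prod(A, B)))
```

Under these assumptions, a factorization method searches for `A` and `B` with a given number of factors that keep this error small.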
Details
The DESCRIPTION file:
Package: | rBMF |
Type: | Package |
Title: | Boolean Matrix Factorization |
Version: | 1.1 |
Date: | 2021-1-13 |
Author: | Abdelmoneim Amer Desouki |
Maintainer: | Abdelmoneim Amer Desouki <desouki@hhu.de> |
Depends: | R (>= 3.2.0), Matrix, methods, Rcpp |
Description: | Provides four boolean matrix factorization (BMF) methods. BMF has many applications like data mining and categorical data analysis. BMF is also known as boolean matrix decomposition (BMD) and was found to be an NP-hard (non-deterministic polynomial-time) problem. Currently implemented methods are 'Asso' Miettinen, Pauli and others (2008) <doi:10.1109/TKDE.2008.53>, 'GreConD' R. Belohlavek, V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002> , 'GreConDPlus' R. Belohlavek, V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002> , 'topFiberM' A. Desouki, M. Roeder, A. Ngonga (2019) <arXiv:1903.10326>. |
License: | GPL-3 |
Index of help topics:
Asso_approximate | Asso: Boolean Matrix Factorization |
Chess | Chess dataset |
DBLP | DBLP dataset |
GreConD | GreConD Boolean matrix factorization |
GreConDPlus | GreConDPlus Boolean Matrix Factorization |
rBMF-package | Boolean Matrix Factorization |
topFiberM | topFiberM |
Author(s)
Abdelmoneim Amer Desouki
References
topFiberM - Desouki, A. A., Röder, M., & Ngomo, A. C. N. (2019). topFiberM: Scalable and Efficient Boolean Matrix Factorization. arXiv preprint arXiv:1903.10326.
Asso - Miettinen, P., Mielikäinen, T., Gionis, A., Das, G., & Mannila, H. (2008). The discrete basis problem. IEEE Transactions on Knowledge and Data Engineering, 20(10), 1348-1362.
GreConD, GreConDPlus - Belohlavek, R., & Vychodil, V. (2010). Discovery of optimal factors in binary data via a novel method of matrix decomposition. Journal of Computer and System Sciences, 76(1), 3-20.
See Also
topFiberM
Asso_approximate
GreConD
GreConDPlus
Examples
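library(rBMF)  # also attaches Matrix, which provides the sparse classes
data(DBLP)
X <- DBLP
r <- 7  # number of factors
Xb <- X == 1  # convert to boolean
tempX <- as(X, 'TsparseMatrix')  # triplet form exposes row/column indices
stats <- NULL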
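for (tP in c(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1)) {
  Res <- topFiberM(Xb, r = r, tP = tP, SR = 100, verbose = 1)
  X_ <- Res$A %*% Res$B  # reconstructed matrix
  X_ <- as(X_, 'TsparseMatrix')
  # Calculate metrics
  li <- tempX@i[tempX@x == 1] + 1  # 1-based row indices of the known facts
  lj <- tempX@j[tempX@x == 1] + 1  # 1-based column indices of the known facts
  tp <- sum(X_[cbind(li, lj)] > 0)  # true positives
  fn <- sum(X) - tp                 # false negatives
  fp <- sum(X_@x > 0) - tp          # false positives
  cv <- 1 - (fp + fn) / (tp + fn)   # coverage
  stats <- rbind(stats, cbind(tP, tp, fn, fp, cv, P = tp / (tp + fp), R = tp / (tp + fn)))
}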
# Plot recall (red), precision (blue) and their harmonic mean (green) against tP
plot(stats[, 'tP'], stats[, 'R'], type = 'b', col = 'red', lwd = 2,
     main = sprintf('topFiberM, dataset: %s,\n#Known facts:%d', 'DBLP', sum(X)),
     ylab = "", xlab = 'tP', xlim = c(0, 1), ylim = c(0, 1))
HM <- apply(stats, 1, function(x) { 2 / (1 / x['P'] + 1 / x['R']) })
points(stats[, 'tP'], stats[, 'P'], col = 'blue', lwd = 2, type = 'b')
points(stats[, 'tP'], HM, col = 'green', lwd = 2, type = 'b')
grid(nx = 10, lty = "dotted", lwd = 2)
legend(legend = c('Recall', 'Precision', 'Harmonic mean'), col = c('red', 'blue', 'green'),
       x = 0.6, y = 0.2, pch = 1, cex = 0.75, lwd = 2)
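The metrics computed in the loop above can be checked by hand on a tiny case. The sketch below uses dense base-R logical matrices rather than the sparse Matrix classes, and the toy matrices `X` and `X_hat` are made-up values for illustration only; the formulas mirror those used in the example (tp, fn, fp, cv, P, R and the harmonic mean).

```r
# Toy observed matrix and a hypothetical reconstruction (assumed values).
X <- matrix(c(1, 1, 0,
              1, 0, 0,
              0, 1, 1), nrow = 3, byrow = TRUE) == 1
X_hat <- matrix(c(1, 1, 0,
                  1, 1, 0,
                  0, 0, 1), nrow = 3, byrow = TRUE) == 1

tp <- sum(X & X_hat)     # ones that are reconstructed
fn <- sum(X & !X_hat)    # ones that are missed
fp <- sum(!X & X_hat)    # zeros wrongly reconstructed as ones

cv <- 1 - (fp + fn) / (tp + fn)  # coverage, same formula as in the example
P  <- tp / (tp + fp)             # precision
R  <- tp / (tp + fn)             # recall
HM <- 2 / (1 / P + 1 / R)        # harmonic mean of precision and recall
```

Note that `cv` penalizes false positives as well as false negatives, so it can be lower than recall, and it becomes negative when the errors outnumber the known ones.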