rBMF-package {rBMF}    R Documentation

Boolean Matrix Factorization

Description

Provides four Boolean matrix factorization (BMF) methods. BMF has many applications, such as data mining and categorical data analysis. BMF is also known as Boolean matrix decomposition (BMD) and is an NP-hard (non-deterministic polynomial-time hard) problem. The currently implemented methods are 'Asso', P. Miettinen and others (2008) <doi:10.1109/TKDE.2008.53>; 'GreConD', R. Belohlavek and V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; 'GreConDPlus', R. Belohlavek and V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; and 'topFiberM', A. Desouki, M. Roeder, A. Ngonga (2019) <arXiv:1903.10326>.
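
For orientation, the following is a minimal base-R sketch of the Boolean matrix product that a BMF tries to match: entry (i, j) of the product is TRUE when at least one factor k has both A[i, k] and B[k, j] set. The toy matrices and the helper bool_prod are illustrative assumptions and are not part of rBMF.

```r
# Toy Boolean factors (values invented for illustration): A is 4 x 2, B is 2 x 3.
A <- matrix(c(TRUE,  FALSE,
              TRUE,  TRUE,
              FALSE, TRUE,
              FALSE, FALSE), nrow = 4, byrow = TRUE)
B <- matrix(c(TRUE,  TRUE,  FALSE,
              FALSE, TRUE,  TRUE), nrow = 2, byrow = TRUE)

# Hypothetical helper: the numeric product counts the factors k that cover
# cell (i, j); thresholding at zero turns that count into a logical OR.
bool_prod <- function(A, B) ((1 * A) %*% (1 * B)) > 0

X_hat <- bool_prod(A, B)

# Reconstruction error against a target X: the number of cells that differ.
X <- X_hat
X[4, 1] <- TRUE          # a cell these two factors cannot cover
err <- sum(xor(X, X_hat))
```

Here the error counts both uncovered ones (false negatives) and wrongly covered zeros (false positives), which are the two quantities the example at the end of this page tracks separately.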

Details

The DESCRIPTION file:

Package: rBMF
Type: Package
Title: Boolean Matrix Factorization
Version: 1.1
Date: 2021-1-13
Author: Abdelmoneim Amer Desouki
Maintainer: Abdelmoneim Amer Desouki <desouki@hhu.de>
Depends: R (>= 3.2.0), Matrix, methods, Rcpp
Description: Provides four Boolean matrix factorization (BMF) methods. BMF has many applications, such as data mining and categorical data analysis. BMF is also known as Boolean matrix decomposition (BMD) and is an NP-hard (non-deterministic polynomial-time hard) problem. The currently implemented methods are 'Asso', P. Miettinen and others (2008) <doi:10.1109/TKDE.2008.53>; 'GreConD', R. Belohlavek and V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; 'GreConDPlus', R. Belohlavek and V. Vychodil (2010) <doi:10.1016/j.jcss.2009.05.002>; and 'topFiberM', A. Desouki, M. Roeder, A. Ngonga (2019) <arXiv:1903.10326>.
License: GPL-3

Index of help topics:

Asso_approximate        Asso: Boolean Matrix Factorization
Chess                   Chess dataset
DBLP                    DBLP dataset
GreConD                 GreConD Boolean matrix factorization
GreConDPlus             GreConDPlus Boolean Matrix Factorization
rBMF-package            Boolean Matrix Factorization
topFiberM               topFiberM

Author(s)

Abdelmoneim Amer Desouki

References

topFiberM: Desouki, A. A., Röder, M., & Ngomo, A. C. N. (2019). topFiberM: Scalable and Efficient Boolean Matrix Factorization. arXiv preprint arXiv:1903.10326.

Asso: Miettinen, P., Mielikäinen, T., Gionis, A., Das, G., & Mannila, H. (2008). The discrete basis problem. IEEE Transactions on Knowledge and Data Engineering, 20(10), 1348-1362.

GreConD, GreConDPlus: Belohlavek, R., & Vychodil, V. (2010). Discovery of optimal factors in binary data via a novel method of matrix decomposition. Journal of Computer and System Sciences, 76(1), 3-20.

See Also

topFiberM, Asso_approximate, GreConD, GreConDPlus

Examples


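data(DBLP)
X <- DBLP
r <- 7                             # rank passed to topFiberM
Xb <- X == 1                       # convert to Boolean
tempX <- as(X, 'TsparseMatrix')    # triplet form: 0-based indices in @i and @j

# 1-based row/column indices of the known facts (cells equal to 1)
li <- tempX@i[tempX@x == 1] + 1
lj <- tempX@j[tempX@x == 1] + 1

stats <- NULL
for (tP in c(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1)) {
  Res <- topFiberM(Xb, r = r, tP = tP, SR = 100, verbose = 1)

  # Reconstruct the matrix from the two factors
  X_ <- as(Res$A %*% Res$B, 'TsparseMatrix')

  # Calculate metrics
  tp <- sum(X_[cbind(li, lj)] > 0)   # true positives
  fn <- sum(X) - tp                  # false negatives
  fp <- sum(X_@x > 0) - tp           # false positives
  cv <- 1 - (fp + fn) / (tp + fn)    # 1 - (errors / number of known facts)
  stats <- rbind(stats, cbind(tP, tp, fn, fp, cv,
                              P = tp / (tp + fp), R = tp / (tp + fn)))
}

# Plot recall, precision and their harmonic mean against tP
plot(stats[, 'tP'], stats[, 'R'], type = 'b', col = 'red', lwd = 2,
     main = sprintf('topFiberM, dataset: %s,\n#Known facts: %d', 'DBLP', sum(X)),
     ylab = "", xlab = 'tP', xlim = c(0, 1), ylim = c(0, 1))
HM <- 2 / (1 / stats[, 'P'] + 1 / stats[, 'R'])
points(stats[, 'tP'], stats[, 'P'], col = 'blue', lwd = 2, type = 'b')
points(stats[, 'tP'], HM, col = 'green', lwd = 2, type = 'b')
grid(nx = 10, lty = "dotted", lwd = 2)
legend(legend = c('Recall', 'Precision', 'Harmonic mean'),
       col = c('red', 'blue', 'green'),
       x = 0.6, y = 0.2, pch = 1, cex = 0.75, lwd = 2)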

[Package rBMF version 1.1 Index]