R: topFiberM

topFiberM {rBMF}

R Documentation

topFiberM

Description

Implements the topFiberM Boolean matrix factorization algorithm. topFiberM greedily chooses fibers (rows or columns) to represent the entire matrix. Each fiber is extended to a rectangle according to a threshold on precision. The search for these "top fibers" can continue beyond the required rank, up to a limit set by the optional parameter `SR`.
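To make the terminology concrete, the following base-R sketch builds a Boolean product of two small factor matrices and counts true positives, false positives and false negatives against a target matrix. It does not call rBMF; the toy matrices and the names `X_hat`, `tp`, `fp` and `fn` are illustrative assumptions, not output of topFiberM.

```r
# Toy Boolean factors (illustrative only): A is 4 x 2, B is 2 x 3.
A <- matrix(c(TRUE, TRUE, FALSE, FALSE,
              FALSE, FALSE, TRUE, TRUE), nrow = 4)
B <- matrix(c(TRUE, FALSE,
              TRUE, FALSE,
              FALSE, TRUE), nrow = 2)

# Boolean product: a cell is TRUE if at least one factor covers it.
X_hat <- ((A + 0) %*% (B + 0)) > 0

# Target matrix: the product with one uncovered one and one wrongly covered zero.
X <- X_hat
X[1, 3] <- TRUE   # a one that the factors do not cover (false negative)
X[3, 3] <- FALSE  # a zero that the factors cover (false positive)

tp <- sum(X & X_hat)    # ones in X that are covered
fp <- sum(!X & X_hat)   # zeros in X that are covered
fn <- sum(X & !X_hat)   # ones in X left uncovered
print(c(tp = tp, fp = fp, fn = fn))
```

Each column of `A` paired with the matching row of `B` describes one rectangle of ones, which is the shape a fiber is extended to.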

Usage

topFiberM(X, r = 2, tP = 0.5, verbose = 2, SR = NULL)

Arguments

`X`	the input Boolean sparse matrix.
`r`	rank (number of factors) required. Default 2.
`tP`	threshold on precision, used when extending a fiber to a rectangle. Default 0.5.
`verbose`	integer value controlling how many messages are shown; with 0, minimal messages are shown. Default 2.
`SR`	search limit, which defines the number of iterations. The minimum value is the rank `r` and the maximum value is the smaller of the number of rows and the number of columns. Default NULL.
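As a rough illustration of how a precision threshold such as `tP` can turn a fiber into a rectangle, the sketch below extends a single column fiber in base R. The helper `extend_column_fiber` is hypothetical and is not part of rBMF. Whether the package compares with `>=` or `>`, and how it handles cells that are already covered, are assumptions made here for the sake of the example.

```r
# Hypothetical helper, not part of rBMF: extend one column fiber to a rectangle.
extend_column_fiber <- function(X, j, tP = 0.5) {
  rows <- which(X[, j])  # rows where the chosen column has a one
  if (length(rows) == 0) {
    return(list(rows = integer(0), cols = integer(0)))
  }
  # Precision of each column restricted to those rows:
  # the share of cells in the candidate rectangle column that are ones.
  prec <- colSums(X[rows, , drop = FALSE]) / length(rows)
  cols <- which(prec >= tP)  # assumption: inclusive comparison
  list(rows = rows, cols = cols)
}

# Small logical matrix, filled column by column.
X <- matrix(c(1, 1, 1, 0,
              1, 1, 0, 0,
              1, 0, 0, 1), nrow = 4) == 1

rect_strict <- extend_column_fiber(X, j = 1, tP = 0.6)
rect_loose  <- extend_column_fiber(X, j = 1, tP = 0.3)
print(rect_strict$cols)
print(rect_loose$cols)
```

A higher threshold keeps the rectangle narrow and limits false positives, while a lower one covers more ones at the cost of covering more zeros.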

Value

List of the following four components:

`A`	Factor matrix A
`B`	Factor matrix B
`X1`	remaining uncovered ones (false negatives)
`tf`	data frame logging the steps, with a description of each factor: index, whether it is based on a column (2) or a row (1), nnz, TP, FP

Author(s)

Abdelmoneim Amer Desouki

References

Desouki, A. A., Roeder, M., & Ngomo, A. C. N. (2019). topFiberM: Scalable and Efficient Boolean Matrix Factorization. arXiv preprint arXiv:1903.10326.

Examples


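library(rBMF)
library(Matrix)

data(DBLP)
r <- 7
tP <- 0.6
X <- DBLP
Xb <- X == 1  # convert to Boolean

Res <- topFiberM(Xb, r = r, tP = tP, SR = 100, verbose = 1)
X_ <- Res$A %*% Res$B  # product of the two factor matrices
X_ <- as(X_, "TsparseMatrix")

# Calculate metrics
tempX <- as(X, "TsparseMatrix")
li <- tempX@i[tempX@x == 1] + 1   # 1-based row indices of the ones in X
lj <- tempX@j[tempX@x == 1] + 1   # 1-based column indices of the ones in X
tp <- sum(X_[cbind(li, lj)] > 0)  # ones in X that are covered
fn <- sum(X) - tp                 # ones in X left uncovered
fp <- sum(X_@x > 0) - tp          # covered cells that are zero in X
cv <- 1 - (fp + fn) / (tp + fn)   # 1 - error / number of ones in X

print(sprintf("tp:%d, fp:%d, fn:%d, Error:%d, covered=%.3f", tp, fp, fn, fn + fp, cv))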

[Package rBMF version 1.1 Index]