bibit2 {BiBitR}  R Documentation 
Same function as bibit, with an additional noise parameter that allows 0's in the discovered biclusters (see Details for more information).
bibit2(matrix = NULL, minr = 2, minc = 2, noise = 0, arff_row_col = NULL, output_path = NULL, extend_columns = "none", extend_mincol = 1, extend_limitcol = 1, extend_noise = noise, extend_contained = FALSE)
matrix 
The binary input matrix. 
minr 
The minimum number of rows of the Biclusters. 
minc 
The minimum number of columns of the Biclusters. 
noise 
Noise parameter which determines the number of zeros allowed in the bicluster (i.e. in the extra rows added to the starting row pair).

arff_row_col 
If you want to circumvent the internal R function to convert the matrix to 
output_path 
If the original txt output of the Java code is desired as output, provide the output path here (without extension). In this case the 
extend_columns 
Column Extension Parameter 
extend_mincol 
Column Extension Parameter 
extend_limitcol 
Column Extension Parameter 
extend_noise 
Column Extension Parameter 
extend_contained 
Column Extension Parameter 
A Biclust S4 Class object.
bibit2 follows the same steps as described in the Details section of bibit.
Following the general steps of the BiBit algorithm, the noise allowance is built into the original algorithm as follows:
1. Binary data is encoded in bit words.
2. Take a pair of rows as your starting point.
3. Find the maximal overlap of 1's between these two rows and save this as a pattern/motif. You now have a bicluster of 2 rows and N columns, in which N is the number of 1's in the motif.
4. Check whether each remaining row matches this motif, allowing a specific number of 0's in the match as defined by the noise parameter. Rows that match completely or fall within the allowed noise range are added to the bicluster.
5. Go back to Step 2 and repeat for all possible row pairs.
Note: Biclusters are only saved if they satisfy the minr and minc parameter settings and if the bicluster is not already contained completely within another bicluster.
The result is a set of biclusters that do not consist solely of 1's: in each bicluster the 2 rows of the starting pair are all 1's, while the other rows may contain 0's (= noise).
Note: Because of the extra checks involved in the noise allowance, using noise might slightly increase the computation time.
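Steps 2 to 4 can be sketched in plain R for a single starting row pair. This is only an illustration, not the Java implementation that bibit2 calls: the helper noisy_bicluster_for_pair and its max_zeros argument are hypothetical names, the bit-word encoding of Step 1 is skipped, and the noise is assumed to be an absolute number of 0's allowed per added row.

```r
# Illustrative sketch only; not the BiBitR/Java implementation.
# Hypothetical helper: builds the noisy bicluster for one starting row pair.
# Assumption: max_zeros is an absolute number of 0's allowed per added row.
noisy_bicluster_for_pair <- function(mat, i, j, max_zeros = 0) {
  # Step 3: the motif is the set of columns where both starting rows are 1.
  motif_cols <- which(mat[i, ] == 1 & mat[j, ] == 1)
  if (length(motif_cols) == 0) {
    return(list(rows = integer(0), cols = integer(0)))
  }
  # Step 4: count the 0's of every row inside the motif columns.
  zeros_in_motif <- rowSums(mat[, motif_cols, drop = FALSE] == 0)
  # Rows with at most max_zeros 0's are kept; the starting pair has none.
  rows <- which(zeros_in_motif <= max_zeros)
  list(rows = rows, cols = motif_cols)
}

toy <- rbind(
  c(1, 1, 1, 1, 0),
  c(1, 1, 1, 1, 1),
  c(1, 1, 0, 1, 0),
  c(0, 0, 1, 1, 1),
  c(1, 1, 1, 1, 0)
)

# Starting pair: rows 1 and 2, whose motif covers columns 1 to 4.
exact <- noisy_bicluster_for_pair(toy, 1, 2, max_zeros = 0)
noisy <- noisy_bicluster_for_pair(toy, 1, 2, max_zeros = 1)
```

In this toy matrix, row 3 has a single 0 inside the motif, so it is only picked up when one 0 per row is allowed. Step 5 would repeat the call for every row pair (for instance via combn(nrow(toy), 2)) and then apply the minr, minc and containment filters.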
An optional procedure which can be applied after applying the BiBit algorithm (with noise) is called Column Extension.
The procedure adds extra columns to a BiBit bicluster, taking into account the allowed extend_noise level in each row.
The primary goal is, after applying BiBit with noise, to also try to add some noise to the 2 initial 'perfect' rows.
Other parameters such as extend_mincol and extend_limitcol can further restrict which extensions should be discovered.
This procedure can be done either naively (fast) or recursively (slower but more thorough), as selected with the extend_columns parameter.
"naive"
Subsetting on the bicluster rows, the column candidates are ordered by the number of 1's in each column (most first). Afterwards, in this order, each column is checked sequentially and added when the resulting bicluster is still within the row noise levels.
This has 2 major consequences:
If 2 columns are identical, the first in the dataset is added, while the second isn't (depending on the noise level allowed per row).
If 2 non-identical columns are viable to be added (correct row noise), the column with the most 1's is added. Afterwards the second column might not be viable anymore.
Note that using this method will always result in a maximum of 1 extended bicluster per original bicluster.
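The greedy behaviour of the naive method can be sketched as follows. This is an illustration under stated assumptions, not the package's internal code: extend_naive and max_zeros are hypothetical names, the noise is taken as an absolute number of 0's allowed per row, and extend_mincol and extend_limitcol are not modelled.

```r
# Illustrative sketch only; not BiBitR's internal code.
# Hypothetical helper: greedy column extension of one bicluster.
# Assumption: max_zeros is an absolute number of 0's allowed per row.
extend_naive <- function(mat, rows, cols, max_zeros = 1) {
  sub <- mat[rows, , drop = FALSE]
  candidates <- setdiff(seq_len(ncol(mat)), cols)
  # Order candidates by their number of 1's within the bicluster rows
  # (most first); ties keep their original dataset order.
  ones <- colSums(sub[, candidates, drop = FALSE] == 1)
  candidates <- candidates[order(-ones)]
  for (cand in candidates) {
    trial <- c(cols, cand)
    zeros_per_row <- rowSums(sub[, trial, drop = FALSE] == 0)
    # Keep the column only if every row stays within the noise allowance.
    if (all(zeros_per_row <= max_zeros)) cols <- trial
  }
  sort(cols)
}

toy <- rbind(
  c(1, 1, 1, 0, 1),
  c(1, 1, 1, 1, 0),
  c(1, 1, 0, 1, 0)
)

# Start from the perfect bicluster on rows 1 to 3 and columns 1 to 2.
extended <- extend_naive(toy, rows = 1:3, cols = 1:2, max_zeros = 1)
unchanged <- extend_naive(toy, rows = 1:3, cols = 1:2, max_zeros = 0)
```

Here columns 3 and 4 are accepted in turn, after which column 5 is rejected because row 3 would then contain two 0's. With max_zeros = 0 no candidate column qualifies and the bicluster is returned unchanged.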
"recursive"
Conditioning the group of candidates on the allowed row noise level, each possible/allowed combination of adding columns to the bicluster is checked. Only the resulting biclusters with the highest number of extra columns are saved. This can result in multiple extensions for 1 bicluster if several combinations reach the same maximum number of added columns.
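The outcome of the recursive method (all extensions with the maximum number of added columns) can be mimicked on small examples with a brute-force enumeration. The sketch below swaps in that simpler exhaustive search rather than the package's recursive procedure; extend_exhaustive and max_zeros are hypothetical names, the noise is assumed to be an absolute number of 0's allowed per row, and the search is exponential in the number of candidate columns.

```r
# Illustrative sketch only; brute force, not BiBitR's recursive procedure.
# Hypothetical helper: returns every extension with the most added columns.
# Assumption: max_zeros is an absolute number of 0's allowed per row.
extend_exhaustive <- function(mat, rows, cols, max_zeros = 1) {
  sub <- mat[rows, , drop = FALSE]
  candidates <- setdiff(seq_len(ncol(mat)), cols)
  within_noise <- function(cs) {
    all(rowSums(sub[, cs, drop = FALSE] == 0) <= max_zeros)
  }
  # Try the largest candidate sets first; stop at the first size that works.
  for (k in rev(seq_along(candidates))) {
    sets <- combn(length(candidates), k, simplify = FALSE)
    sets <- lapply(sets, function(idx) sort(c(cols, candidates[idx])))
    ok <- Filter(within_noise, sets)
    if (length(ok) > 0) return(ok)
  }
  list()  # no column can be added within the noise allowance
}

toy <- rbind(
  c(1, 1, 1, 0),
  c(1, 1, 0, 0),
  c(1, 1, 1, 1)
)

# Start from the perfect bicluster on rows 1 to 3 and columns 1 to 2.
extensions <- extend_exhaustive(toy, rows = 1:3, cols = 1:2, max_zeros = 1)
```

In this toy matrix columns 3 and 4 cannot both be added, because row 2 would then contain two 0's, but each can be added on its own. The result therefore holds two extensions of equal size: columns 1, 2, 3 and columns 1, 2, 4.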
Note: These procedures are followed by a fast check of whether the extensions resulted in any duplicate biclusters. If so, the duplicates are deleted from the final result.
Ewoud De Troyer
Domingo S. Rodriguez-Baena, Antonia J. Perez-Pulido and Jesus S. Aguilar-Ruiz (2011), "A biclustering algorithm for extracting bit-patterns from binary datasets", Bioinformatics
## Not run:
data <- matrix(sample(c(0,1),100*100,replace=TRUE,prob=c(0.9,0.1)),nrow=100,ncol=100)
data[1:10,1:10] <- 1 # BC1
data[11:20,11:20] <- 1 # BC2
data[21:30,21:30] <- 1 # BC3
data <- data[sample(1:nrow(data),nrow(data)),sample(1:ncol(data),ncol(data))]
result1 <- bibit2(data,minr=5,minc=5,noise=0.2)
result1
MaxBC(result1,top=1)
result2 <- bibit2(data,minr=5,minc=5,noise=3)
result2
MaxBC(result2,top=2)
## End(Not run)