bibit3 {BiBitR}  R Documentation 
Same function as bibit2, but it only aims to discover biclusters containing the (sub) patterns of the provided patterns or of their combinations.
bibit3(matrix = NULL, minr = 1, minc = 2, noise = 0, pattern_matrix = NULL, subpattern = TRUE, pattern_combinations = FALSE, arff_row_col = NULL, extend_columns = "none", extend_mincol = 1, extend_limitcol = 1, extend_noise = noise, extend_contained = FALSE)
matrix 
The binary input matrix. 
minr 
The minimum number of rows of the Biclusters. (Note that in contrast to 
minc 
The minimum number of columns of the Biclusters. 
noise 
Noise parameter which determines the number of zeros allowed in the bicluster (i.e. in the extra rows added to the starting row pair).

pattern_matrix 
Matrix (Number of Patterns x Number of Data Columns) containing the patterns of interest. 
subpattern 
Logical value indicating whether sub patterns are of interest as well (default=TRUE). 
pattern_combinations 
Logical value indicating whether the pairwise combinations of patterns (the intersecting 1's) should also be used as starting points (default=FALSE). 
arff_row_col 
Same argument as in 
extend_columns 
Column Extension Parameter 
extend_mincol 
Column Extension Parameter 
extend_limitcol 
Column Extension Parameter 
extend_noise 
Column Extension Parameter 
extend_contained 
Column Extension Parameter 
The bibit3 function takes one or multiple patterns and only searches for biclusters exhibiting those patterns.
Multiple patterns can be given in matrix format, pattern_matrix, and their pairwise combinations can automatically be added to this matrix by setting pattern_combinations=TRUE.
All discovered biclusters are still subject to the provided noise level.
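The idea of a pairwise combination as "the intersecting 1's" of two patterns can be illustrated with a minimal base R sketch. The helper add_pairwise_combinations below is hypothetical and not part of BiBitR; it only shows what such an expanded pattern matrix could look like, and it keeps empty (all-zero) intersections, which the package itself may or may not do.

```r
# Hypothetical helper (NOT a BiBitR function): append the pairwise
# intersections of the rows of a binary pattern matrix.
add_pairwise_combinations <- function(pattern_matrix) {
  n <- nrow(pattern_matrix)
  if (n < 2) return(pattern_matrix)
  pairs <- utils::combn(n, 2)  # one column per pair of pattern rows
  # The element-wise product of two binary rows keeps only the shared 1's
  combos <- t(apply(pairs, 2, function(p) {
    pattern_matrix[p[1], ] * pattern_matrix[p[2], ]
  }))
  rbind(pattern_matrix, combos)
}

# Three toy patterns over 8 columns
patterns <- matrix(0, nrow = 3, ncol = 8)
patterns[1, 1:4] <- 1
patterns[2, 3:6] <- 1
patterns[3, 5:8] <- 1

# Rows 4-6 hold the intersections of pattern pairs (1,2), (1,3) and (2,3)
expanded <- add_pairwise_combinations(patterns)
```

In this toy case patterns 1 and 3 share no columns, so their intersection is an all-zero row that could not serve as a useful starting point.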
Three types of Biclusters can be discovered:
Bicluster which overlaps completely (within allowed noise levels) with the provided pattern. The column size of this bicluster is always equal to the number of 1's in the pattern.
Biclusters which overlap with a part of the provided pattern within allowed noise levels. These are only returned if subpattern=TRUE (default). Setting this option to FALSE decreases computation time.
Using the resulting biclusters from the full and sub patterns, the algorithm attempts to add other columns to the biclusters while keeping the noise as low as possible (the number of rows in the BC stays constant).
This can be done with extend_columns equal to either "naive" or "recursive". More info on the difference can be found in the Details section of bibit2.
Naturally, the artificially added pattern rows are not taken into account for the noise levels, as they are 0 in every other column.
The question which is attempted to be answered here is 'Do the rows, which overlap partly or fully with the given pattern, have other similarities outside the given pattern?'
How?
The BiBit algorithm is applied to a data matrix that contains 2 identical artificial rows at the top which contain the given pattern.
The default algorithm is then slightly altered to only start from this artificial row pair (=Full Pattern) or from 1 artificial row and 1 other row (=Sub Pattern).
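The construction described above can be sketched in base R. This is only an illustration under simplifying assumptions (zero noise, a single pattern, toy data); it is not the BiBit implementation, and all object names are made up for the example.

```r
# Toy binary data: 6 rows, 8 columns
data <- matrix(0, nrow = 6, ncol = 8)
data[1:3, 1:4] <- 1   # rows 1-3 contain the full pattern
data[4, 1:2] <- 1     # row 4 contains only part of the pattern
data[1:2, 7] <- 1     # some 1's outside the pattern

pattern <- c(1, 1, 1, 1, 0, 0, 0, 0)

# Two identical artificial pattern rows are placed on top of the data
augmented <- rbind(pattern, pattern, data)

# Full pattern: columns shared by the artificial row pair
pattern_cols <- which(augmented[1, ] == 1 & augmented[2, ] == 1)

# Data rows that match the full pattern exactly (zero noise assumed)
full_rows <- which(
  rowSums(data[, pattern_cols, drop = FALSE]) == length(pattern_cols)
)

# Sub pattern: columns shared by 1 artificial row and 1 other row (data row 4)
sub_cols <- which(augmented[1, ] == 1 & data[4, ] == 1)
```

Here the artificial row pair yields the 4 pattern columns and data rows 1 to 3, while pairing an artificial row with data row 4 yields the sub pattern in columns 1 and 2.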
Note 1  Large Data:
The arff_row_col argument can still be provided in case of large data matrices, but the .arff file should already contain the pattern of interest in the first two rows. Consequently, no more than 1 pattern at a time can be investigated with a single call of bibit3.
Note 2  Viewing Results:
print and summary methods have been implemented for the output object of bibit3. They give an overview of the number of discovered biclusters and their dimensions.
Additionally, the bibit3_patternBC function can extract a Bicluster and add the artificial pattern rows to investigate the results.
An S3 list object, "bibit3", in which each element (apart from the last one) corresponds to a provided pattern or combination thereof.
Each element is a list containing:
Number: Number of initially found BC's by applying BiBit with the provided pattern.
Number_Extended: Number of additionally discovered BC's by extending the columns.
FullPattern: Biclust S4 Class Object containing the Bicluster with the Full Pattern.
SubPattern: Biclust S4 Class Object containing the Biclusters showing parts of the pattern.
Extended: Biclust S4 Class Object containing the additional Biclusters after extending the biclusters (column wise) of the full and sub patterns.
info: Contains the Time_Min element, which includes the elapsed time of parts and the full analysis.
The last element in the list is a matrix containing all the investigated patterns.
Ewoud De Troyer
Domingo S. Rodriguez-Baena, Antonia J. Perez-Pulido and Jesus S. Aguilar-Ruiz (2011), "A biclustering algorithm for extracting bit-patterns from binary datasets", Bioinformatics
## Not run: 
set.seed(1)
data <- matrix(sample(c(0,1),100*100,replace=TRUE,prob=c(0.9,0.1)),nrow=100,ncol=100)
data[1:10,1:10] <- 1 # BC1
data[11:20,11:20] <- 1 # BC2
data[21:30,21:30] <- 1 # BC3
colsel <- sample(1:ncol(data),ncol(data))
data <- data[sample(1:nrow(data),nrow(data)),colsel]

pattern_matrix <- matrix(0,nrow=3,ncol=100)
pattern_matrix[1,1:7] <- 1
pattern_matrix[2,11:15] <- 1
pattern_matrix[3,13:20] <- 1
pattern_matrix <- pattern_matrix[,colsel]

# extend_columns takes "none", "naive" or "recursive" (see Usage and Details)
out <- bibit3(matrix=data,minr=2,minc=2,noise=0.1,pattern_matrix=pattern_matrix,
              subpattern=TRUE,extend_columns="naive",pattern_combinations=TRUE)
out # OR print(out) OR summary(out)

bibit3_patternBC(result=out,matrix=data,pattern=c(1),type=c("full","sub","ext"),BC=c(1,2))
## End(Not run)