bibit3 {BiBitR} | R Documentation |
The BiBit Algorithm with Noise Allowance guided by Provided Patterns.
Description
Same function as bibit2, but only aims to discover biclusters containing the (sub) pattern of provided patterns or their combinations.
Usage
bibit3(matrix = NULL, minr = 1, minc = 2, noise = 0,
pattern_matrix = NULL, subpattern = TRUE, pattern_combinations = FALSE,
arff_row_col = NULL, extend_columns = "none", extend_mincol = 1,
extend_limitcol = 1, extend_noise = noise, extend_contained = FALSE)
Arguments
matrix |
The binary input matrix. |
minr |
The minimum number of rows of the Biclusters. (Note that in contrast to |
minc |
The minimum number of columns of the Biclusters. |
noise |
Noise parameter which determines the number of zeros allowed in the bicluster (i.e. in the extra rows added to the starting row pair).
|
pattern_matrix |
Matrix (Number of Patterns x Number of Data Columns) containing the patterns of interest. |
subpattern |
Boolean value if sub patterns are of interest as well (default=TRUE). |
pattern_combinations |
Boolean value indicating whether the pairwise combinations of patterns (the intersecting 1's) should also be used as starting points (default=FALSE). |
arff_row_col |
Same argument as in |
extend_columns |
Column Extension Parameter |
extend_mincol |
Column Extension Parameter |
extend_limitcol |
Column Extension Parameter |
extend_noise |
Column Extension Parameter |
extend_contained |
Column Extension Parameter |
Details
The goal of the bibit3 function is to provide one or multiple patterns in order to only find those biclusters exhibiting those patterns.
Multiple patterns can be given in matrix format, pattern_matrix, and their pairwise combinations can automatically be added to this matrix by setting pattern_combinations=TRUE.
All discovered biclusters are still subject to the provided noise level.
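The idea of pairwise pattern combinations (the intersecting 1's) can be sketched in base R. This is only an illustration under assumptions: the toy pattern_matrix and the row names are hypothetical, and how bibit3 itself orders, names, or filters the combinations (for example, combinations without any 1's) is not specified here.

```r
# Hypothetical toy patterns: 3 patterns over 6 data columns
pattern_matrix <- rbind(
  c(1, 1, 1, 0, 0, 0),
  c(0, 1, 1, 1, 0, 0),
  c(0, 0, 0, 1, 1, 1)
)

# All unordered pairs of pattern indices (one pair per column)
pairs <- combn(nrow(pattern_matrix), 2)

# Intersecting 1's of each pair: element-wise product of the two binary rows
combos <- t(apply(pairs, 2, function(p) {
  pattern_matrix[p[1], ] * pattern_matrix[p[2], ]
}))

# Illustrative row names such as "1_2" (not the naming used by bibit3)
rownames(combos) <- apply(pairs, 2, paste, collapse = "_")

combos
```

In this toy case, patterns 1 and 3 share no columns, so their combination contains no 1's and could not serve as a meaningful starting pattern.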
Three types of Biclusters can be discovered:
- Full Pattern:
Bicluster which overlaps completely (within allowed noise levels) with the provided pattern. The column size of this bicluster is always equal to the number of 1's in the pattern.
- Sub Pattern:
Biclusters which overlap with a part of the provided pattern within allowed noise levels. Will only be given if subpattern=TRUE (default). Setting this option to FALSE decreases computation time.
- Extended:
Starting from the resulting biclusters of the full and sub patterns, the algorithm attempts to add other columns to the biclusters while keeping the noise as low as possible (the number of rows in the BC stays constant). This can be done with extend_columns equal to "naive" or "recursive". More info on the difference can be found in the Details section of bibit2.
Naturally, the artificially added pattern rows are not taken into account for the noise levels, as they are 0 in every other column.
The question this function attempts to answer is: 'Do the rows, which overlap partly or fully with the given pattern, have other similarities outside the given pattern?'
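The distinction between rows that match the full pattern and rows that match only part of it can be illustrated at the row level. The sketch below is not the biclustering itself: it assumes zero noise, ignores minr and minc, and uses a hypothetical toy matrix and pattern.

```r
# Hypothetical toy data: 4 rows over 6 columns
data <- rbind(
  r1 = c(1, 1, 1, 0, 1, 0),
  r2 = c(1, 1, 0, 0, 1, 0),
  r3 = c(0, 0, 0, 1, 0, 1),
  r4 = c(1, 1, 1, 1, 0, 0)
)
pattern <- c(1, 1, 1, 0, 0, 0)
pattern_cols <- which(pattern == 1)

# Number of pattern columns in which each row has a 1
overlap <- rowSums(data[, pattern_cols, drop = FALSE])

# Rows covering every pattern column (full overlap, zero noise assumed)
full_rows <- names(overlap)[overlap == length(pattern_cols)]

# Rows covering some, but not all, pattern columns (partial overlap)
sub_rows <- names(overlap)[overlap > 0 & overlap < length(pattern_cols)]

full_rows
sub_rows
```

Here r1 and r4 cover the whole pattern, r2 covers only part of it, and r3 does not overlap at all. Whether the full-overlap rows also share 1's outside the pattern columns is what the column extension step would investigate.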
How?
The BiBit algorithm is applied to a data matrix that contains 2 identical artificial rows at the top which contain the given pattern.
The default algorithm is then slightly altered to only start from this artificial row pair (=Full Pattern) or from 1 artificial row and 1 other row (=Sub Pattern).
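The construction described above can be sketched as follows. This is a minimal illustration, not the internal code of bibit3: the toy data, the pattern, and the row names are hypothetical. It relies only on the fact that BiBit derives its starting pattern from the logical AND of a row pair.

```r
set.seed(1)

# Hypothetical binary data: 5 rows over 6 columns
data <- matrix(rbinom(5 * 6, 1, 0.3), nrow = 5, ncol = 6)
pattern <- c(1, 1, 0, 0, 1, 0)

# Place 2 identical artificial pattern rows at the top of the data matrix
augmented <- rbind(pattern, pattern, data)
rownames(augmented) <- c("pattern_a", "pattern_b",
                         paste0("row", seq_len(nrow(data))))

# Full Pattern start: the AND of the two artificial rows is the pattern itself
full_start <- augmented[1, ] & augmented[2, ]

# Sub Pattern start: the AND of 1 artificial row and 1 data row
# can only contain columns that belong to the pattern
sub_start <- augmented[1, ] & augmented[3, ]

as.integer(full_start)
as.integer(sub_start)
```

Because every starting pair includes at least one artificial row, any starting pattern is restricted to the columns of the provided pattern.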
Note 1 - Large Data:
The arff_row_col argument can still be provided in case of large data matrices, but the .arff file should already contain the pattern of interest in the first two rows. Consequently, no more than 1 pattern at a time can be investigated with a single call of bibit3.
Note 2 - Viewing Results:
A print and a summary method have been implemented for the output object of bibit3. They give an overview of the number of discovered biclusters and their dimensions.
Additionally, the bibit3_patternBC function can extract a bicluster and add the artificial pattern rows to investigate the results.
Value
An S3 list object, "bibit3", in which each element (apart from the last one) corresponds with a provided pattern or combination thereof.
Each element is a list containing:
- Number: Number of initially found BC's by applying BiBit with the provided pattern.
- Number_Extended: Number of additionally discovered BC's by extending the columns.
- FullPattern: Biclust S4 Class Object containing the Bicluster with the Full Pattern.
- SubPattern: Biclust S4 Class Object containing the Biclusters showing parts of the pattern.
- Extended: Biclust S4 Class Object containing the additional Biclusters after extending the biclusters (column wise) of the full and sub patterns.
- info: Contains the Time_Min element, which includes the elapsed time of parts of the analysis and of the full analysis.
The last element in the list is a matrix containing all the investigated patterns.
Author(s)
Ewoud De Troyer
References
Domingo S. Rodriguez-Baena, Antonia J. Perez-Pulido and Jesus S. Aguilar-Ruiz (2011), "A biclustering algorithm for extracting bit-patterns from binary datasets", Bioinformatics
Examples
## Not run:
set.seed(1)
data <- matrix(sample(c(0,1),100*100,replace=TRUE,prob=c(0.9,0.1)),nrow=100,ncol=100)
data[1:10,1:10] <- 1 # BC1
data[11:20,11:20] <- 1 # BC2
data[21:30,21:30] <- 1 # BC3
colsel <- sample(1:ncol(data),ncol(data))
data <- data[sample(1:nrow(data),nrow(data)),colsel]
pattern_matrix <- matrix(0,nrow=3,ncol=100)
pattern_matrix[1,1:7] <- 1
pattern_matrix[2,11:15] <- 1
pattern_matrix[3,13:20] <- 1
pattern_matrix <- pattern_matrix[,colsel]
out <- bibit3(matrix=data,minr=2,minc=2,noise=0.1,pattern_matrix=pattern_matrix,
subpattern=TRUE,extend_columns="naive",pattern_combinations=TRUE)
out # OR print(out) OR summary(out)
bibit3_patternBC(result=out,matrix=data,pattern=c(1),type=c("full","sub","ext"),BC=c(1,2))
## End(Not run)