bibit_columnextension {BiBitR} | R Documentation |
Column Extension Procedure
Description
Function which accepts a result from bibit, bibit2 or bibit3 and (re-)applies the column extension procedure. If the result already contains extended biclusters, these will be deleted.
Usage
bibit_columnextension(result, matrix, arff_row_col = NULL, BC = NULL,
extend_columns = "naive", extend_mincol = 1, extend_limitcol = 1,
extend_noise = 1, extend_contained = FALSE)
Arguments
result |
Result from bibit, bibit2 or bibit3. |
matrix |
The binary input matrix. |
arff_row_col |
The same file directories (with the same limitations) as given in |
BC |
A numeric/integer vector of BC's which should be extended. Different behaviour for the 3 types of input results:
|
extend_columns |
Column Extension Parameter |
extend_mincol |
Column Extension Parameter |
extend_limitcol |
Column Extension Parameter |
extend_noise |
Column Extension Parameter |
extend_contained |
Column Extension Parameter |
Value
A Biclust S4 Class object or bibit3 S3 list Class object
Details - Column Extension
Column Extension is an optional procedure which can be applied after the BiBit algorithm (with noise).
The procedure adds extra columns to a BiBit bicluster, taking into account the allowed extend_noise level in each row.
The primary goal, after applying BiBit with noise, is to also try to add some noise to the 2 initial 'perfect' rows.
Other parameters such as extend_mincol and extend_limitcol can further restrict which extensions are discovered.
The procedure can be carried out either naively (fast) or recursively (slower but more thorough), as selected with the extend_columns parameter.
"naive"
Subsetting on the bicluster rows, the column candidates are ordered by the number of 1's in each column, most first. Then, in this order, each column is checked sequentially and added if the resulting BC is still within the row noise levels.
This has 2 major consequences:
If 2 columns are identical, the first in the dataset is added, while the second isn't (depending on the noise level allowed per row).
If 2 non-identical columns are viable to be added (correct row noise), the column with the most 1's is added. Afterwards the second column might not be viable anymore.
Note that using this method will always result in a maximum of 1 extended bicluster per original bicluster.
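The greedy idea behind "naive" can be sketched in base R. This is only an illustration and not the BiBitR implementation: naive_extend is a hypothetical helper, and it assumes that the noise level is the maximum number of 0's allowed per bicluster row, counted over all columns of the extended bicluster. extend_mincol and extend_limitcol are not modelled.

```r
# Hypothetical helper (not part of BiBitR): greedy, "naive"-style extension.
# Assumption: `noise` = maximum number of 0s allowed in each bicluster row.
naive_extend <- function(mat, rows, cols, noise = 1) {
  sub <- mat[rows, , drop = FALSE]
  candidates <- setdiff(seq_len(ncol(mat)), cols)
  # Order candidates by their number of 1s within the bicluster rows (most first).
  # order() keeps ties in their original order, so of two identical columns
  # the one that comes first in the dataset is tried first.
  ones <- colSums(sub[, candidates, drop = FALSE])
  candidates <- candidates[order(-ones)]
  # Current number of 0s per row in the bicluster.
  zeros <- rowSums(sub[, cols, drop = FALSE] == 0)
  for (j in candidates) {
    new_zeros <- zeros + (sub[, j] == 0)
    if (all(new_zeros <= noise)) {  # still within the row noise level?
      cols <- c(cols, j)
      zeros <- new_zeros
    }
  }
  cols
}

# Columns 3 and 5 are identical; with noise = 1 only the first one fits.
mat <- cbind(c(1, 1, 1), c(1, 1, 1), c(1, 1, 0), c(1, 0, 1), c(1, 1, 0))
naive_extend(mat, rows = 1:3, cols = 1:2, noise = 1)
```

In this toy matrix, column 3 is added first, column 4 still fits, and the identical column 5 is then rejected because row 3 would contain two 0's.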
"recursive"
Conditioning the group of candidates on the allowed row noise level, each possible/allowed combination of added columns is checked. Only the resulting biclusters with the highest number of extra columns are saved. This can result in multiple extensions for 1 bicluster if several combinations reach the same maximum number of added columns.
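The exhaustive idea behind "recursive" can be sketched in the same way. Again this is an illustration under assumptions rather than the package's code: recursive_extend is a hypothetical helper, the noise level is assumed to be the maximum number of 0's per bicluster row, and extend_mincol and extend_limitcol are ignored.

```r
# Hypothetical helper (not part of BiBitR): exhaustive, "recursive"-style extension.
# Assumption: `noise` = maximum number of 0s allowed in each bicluster row.
recursive_extend <- function(mat, rows, cols, noise = 1) {
  sub <- mat[rows, , drop = FALSE]
  candidates <- setdiff(seq_len(ncol(mat)), cols)
  best <- list()    # column sets with the most added columns found so far
  best_size <- 0L

  search <- function(added, zeros, remaining) {
    n <- length(added)
    if (n > 0L && n >= best_size) {
      if (n > best_size) {          # strictly better: discard smaller results
        best <<- list()
        best_size <<- n
      }
      best[[length(best) + 1L]] <<- added
    }
    # Try every later candidate, so each combination is visited exactly once.
    for (k in seq_along(remaining)) {
      new_zeros <- zeros + (sub[, remaining[k]] == 0)
      if (all(new_zeros <= noise)) {
        search(c(added, remaining[k]), new_zeros, remaining[-seq_len(k)])
      }
    }
  }

  search(integer(0), rowSums(sub[, cols, drop = FALSE] == 0), candidates)
  lapply(best, function(a) c(cols, a))
}

mat <- cbind(c(1, 1, 1), c(1, 1, 1), c(1, 1, 0), c(1, 0, 1), c(1, 1, 0))
recursive_extend(mat, rows = 1:3, cols = 1:2, noise = 1)
```

For this toy matrix two extensions tie with 2 added columns each (columns 3 and 4, or columns 4 and 5), so both are returned, whereas a greedy pass would only find the first.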
Note: These procedures are followed by a fast check of whether the extensions resulted in any duplicate biclusters. If so, the duplicates are deleted from the final result.
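A duplicate check of this kind could look as follows. How BiBitR performs the check internally is not described here, so this is only a sketch: remove_duplicate_bc is a hypothetical helper, and biclusters are assumed to be stored as lists with rows and cols index vectors.

```r
# Hypothetical helper (not part of BiBitR): drop biclusters that have exactly
# the same rows and columns as an earlier one.
# Assumption: each bicluster is a list with integer vectors `rows` and `cols`.
remove_duplicate_bc <- function(bcs) {
  key <- vapply(bcs, function(bc) {
    paste(paste(sort(bc$rows), collapse = ","),
          paste(sort(bc$cols), collapse = ","), sep = "|")
  }, character(1))
  bcs[!duplicated(key)]  # keep the first occurrence of each bicluster
}

bcs <- list(
  list(rows = 1:3, cols = c(1, 2, 4)),
  list(rows = 3:1, cols = c(4, 2, 1)),  # same bicluster, different order
  list(rows = 1:3, cols = c(1, 2, 5))
)
length(remove_duplicate_bc(bcs))
```

Sorting the indices before building the key means two biclusters are treated as duplicates regardless of the order in which their rows and columns are listed.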
Author(s)
Ewoud De Troyer
Examples
## Not run:
set.seed(1)
data <- matrix(sample(c(0,1),100*100,replace=TRUE,prob=c(0.9,0.1)),nrow=100,ncol=100)
data[1:10,1:10] <- 1 # BC1
data[11:20,11:20] <- 1 # BC2
data[21:30,21:30] <- 1 # BC3
data <- data[sample(1:nrow(data),nrow(data)),sample(1:ncol(data),ncol(data))]
result <- bibit2(data,minr=5,minc=5,noise=0.1,extend_columns = "recursive",
extend_mincol=1,extend_limitcol=1)
result
# Re-apply the column extension to biclusters 1 and 10 of the bibit2 result
result2 <- bibit_columnextension(result=result,matrix=data,arff_row_col=NULL,BC=c(1,10),
                                 extend_columns="recursive",extend_mincol=1,
                                 extend_limitcol=1,extend_noise=2,extend_contained=FALSE)
result2
## End(Not run)