R: Clustering of sequences based on regular expression

cluster_reg_exp {biogram}

R Documentation

Clustering of sequences based on regular expression

Description

Clusters sequences hierarchically using regular expressions. At each step, the number of degrees of freedom of all regular expressions needed to describe the data is minimized.

Usage

cluster_reg_exp(ngrams)

Arguments

ngrams

list of elements

Details

A regular expression is a list whose length equals the length of the input sequences. Each element of the list represents one position in the sequence and contains the amino acids that are likely to occur at this position.
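The representation described above can be illustrated with a minimal sketch. It is not the code used by cluster_reg_exp: the helpers build_reg_exp and matches_reg_exp are hypothetical names introduced only for this illustration, and the input is assumed to be a character matrix with one sequence per row and one position per column.

```r
# Hypothetical helper (not part of biogram): build a position-wise
# expression from equal-length sequences by collecting the residues
# observed at each position.
build_reg_exp <- function(seqs) {
  lapply(seq_len(ncol(seqs)), function(i) sort(unique(seqs[, i])))
}

# Hypothetical helper (not part of biogram): a sequence matches when
# every residue is among those allowed at its position.
matches_reg_exp <- function(seq, reg_exp) {
  length(seq) == length(reg_exp) &&
    all(mapply(function(residue, allowed) residue %in% allowed, seq, reg_exp))
}

# Three toy sequences of length 3 (assumed input layout: rows are sequences)
seqs <- matrix(c("A", "L", "K",
                 "A", "V", "K",
                 "G", "L", "K"),
               nrow = 3, byrow = TRUE)

reg_exp <- build_reg_exp(seqs)
# reg_exp is a list of length 3: c("A", "G"), c("L", "V"), "K"

matches_reg_exp(c("G", "V", "K"), reg_exp)  # TRUE
matches_reg_exp(c("A", "L", "R"), reg_exp)  # FALSE
```

In this sketch an expression with fewer residues per position is more specific, so it matches fewer sequences.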

Value

List of four

"regExps"          regular expression in best clustering
"seqClustering"    clustering of sequences in best clustering
"allRegExps"       all regular expressions
"allIndices"       all clusterings

Examples

data(human_cleave)
# cluster_reg_exp is computationally expensive, so only a small subset is used

results <- cluster_reg_exp(human_cleave[1L:10, 1L:4])

[Package biogram version 1.6.3 Index]