partition {DepLogo} | R Documentation |
Partitions data by most inter-dependent positions
Description
Partitions data by the nucleotides at the most inter-dependent
positions, as measured by pairwise mutual information. Partitioning is
performed recursively on the resulting subsets until i) the number of
sequences in a partition is less than minElements, ii) the average
pairwise dependency between the current position and the
numBestForSorting other positions with the largest mutual information
values drops below threshold, or iii) maxNum recursive splits have
already been performed. If splitting results in partitions smaller than
minElements, these are added to the smallest partition with more than
minElements sequences.
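The selection of a partitioning position can be illustrated with a minimal base R sketch. This is not the DepLogo implementation: the helper mutual_information(), the toy sequences, the use of log base 2, and the omission of sequence weights are all assumptions made for illustration only. It computes pairwise mutual information between positions, scores each position by the average of its num_best largest values (a stand-in for numBestForSorting), and performs a single split at the best-scoring position.

```r
# Illustrative sketch only; not the code used inside DepLogo.
# mutual_information() is a hypothetical helper: plug-in estimate of the
# mutual information (in bits) between two vectors of symbols.
mutual_information <- function(x, y) {
  joint <- table(x, y) / length(x)   # joint relative frequencies
  px <- rowSums(joint)               # marginal of x
  py <- colSums(joint)               # marginal of y
  expected <- outer(px, py)          # product of the marginals
  nz <- joint > 0                    # skip empty cells (0 * log 0 = 0)
  sum(joint[nz] * log2(joint[nz] / expected[nz]))
}

# Toy data (assumed): six aligned sequences of length four
seqs <- c("ACGT", "ACGA", "TGGT", "TGGA", "ACCT", "TGCA")
mat <- do.call(rbind, strsplit(seqs, ""))
n_pos <- ncol(mat)

# Pairwise mutual information between all positions
mi <- matrix(0, n_pos, n_pos)
for (i in seq_len(n_pos)) {
  for (j in seq_len(n_pos)) {
    if (i != j) mi[i, j] <- mutual_information(mat[, i], mat[, j])
  }
}

# Score each position by the mean of its num_best largest MI values
num_best <- 1
score <- apply(mi, 1, function(row) {
  mean(sort(row, decreasing = TRUE)[seq_len(num_best)])
})

# Split once at the best-scoring position, by the symbol found there
best <- which.max(score)
parts <- split(seqs, mat[, best])
```

In a recursive scheme like the one described above, the same scoring would then be repeated within each element of parts, stopping once a subset is too small, the best score falls below the threshold, or the maximum number of splits has been reached.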
Usage
partition(
data,
minElements = 10,
threshold = 0.1,
numBestForSorting = 3,
maxNum = 6,
sortByWeights = NULL,
partition.by = NULL
)
## S3 method for class 'DLData'
partition(
data,
minElements = 10,
threshold = 0.1,
numBestForSorting = 3,
maxNum = 6,
sortByWeights = NULL,
partition.by = NULL
)
Arguments
data | the data as a DLData object
minElements | the minimum number of elements to perform a further split
threshold | the threshold on the average mutual information value
numBestForSorting | the number of dependencies to other positions considered
maxNum | the maximum number of recursive splits
sortByWeights | if
partition.by | specify fixed positions to partition by
Value
the partitions as a list of DLData objects
Author(s)
Jan Grau <grau@informatik.uni-halle.de>
Examples
# create DLData object
seqs <- read.table(system.file("extdata", "cjun.txt", package = "DepLogo"),
stringsAsFactors = FALSE)
data <- DLData(sequences = seqs[, 1], weights = log1p(seqs[,2]) )
# partition data using default parameters
partitions <- partition(data)
# partition data using a threshold of 0.3 on the mutual
# information value to the most dependent position,
# sorting the resulting partitions by weight
partitions2 <- partition(data = data, threshold = 0.3, numBestForSorting = 1, sortByWeights = TRUE)