| partition {DepLogo} | R Documentation |
Partitions data by most inter-dependent positions
Description
Partitions data by the nucleotides at the most inter-dependent
positions as measured by pairwise mutual information. Partitioning is
performed recursively on the resulting subsets until i) the number of
sequences in a partition is less than minElements, ii) the average
pairwise dependency between the current position and numBestForSorting
other positions with the largest mutual information value drops below
threshold, or iii) maxNum recursive splits have already been
performed. If splitting yields partitions with fewer than
minElements sequences, these are added to the smallest partition with
more than minElements sequences.
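The selection criterion described above can be illustrated with a minimal base R sketch. It is not the DepLogo implementation: the helpers mutual_information and position_scores are hypothetical names introduced here, the toy sequences are made up, and mutual information is assumed to be computed in bits from plain relative frequencies (no weights, no pseudocounts).

```r
# Hypothetical helper (not part of DepLogo): mutual information in bits
# between the symbols observed at two positions.
mutual_information <- function(x, y) {
  joint <- unclass(table(x, y)) / length(x)   # joint relative frequencies
  expected <- outer(rowSums(joint), colSums(joint))  # product of marginals
  nz <- joint > 0                             # skip empty cells (0 * log 0 = 0)
  sum(joint[nz] * log2(joint[nz] / expected[nz]))
}

# Hypothetical helper: for every position, average the numBest largest
# mutual information values to the other positions.
position_scores <- function(seqs, numBest = 3) {
  chars <- do.call(rbind, strsplit(seqs, ""))  # one row per sequence
  n <- ncol(chars)
  mi <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      mi[i, j] <- mi[j, i] <- mutual_information(chars[, i], chars[, j])
    }
  }
  sapply(seq_len(n), function(i) {
    mean(head(sort(mi[i, -i], decreasing = TRUE), numBest))
  })
}

# Toy data (assumption): positions 1 and 2 always agree, position 3 varies
# almost independently of them.
toy <- c("AAT", "AAG", "CCT", "CCG", "AAT", "CCG")
scores <- position_scores(toy, numBest = 1)

# Position chosen for the split, and the resulting subsets by nucleotide.
best <- which.max(scores)
subsets <- split(toy, substr(toy, best, best))
```

In this sketch a split would only be made if max(scores) reached the threshold and every stopping rule listed above allowed it; the recursion and the merging of undersized subsets are left out for brevity.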
Usage
partition(
data,
minElements = 10,
threshold = 0.1,
numBestForSorting = 3,
maxNum = 6,
sortByWeights = NULL,
partition.by = NULL
)
## S3 method for class 'DLData'
partition(
data,
minElements = 10,
threshold = 0.1,
numBestForSorting = 3,
maxNum = 6,
sortByWeights = NULL,
partition.by = NULL
)
Arguments
data |
the data as a DLData object |
minElements |
the minimum number of elements to perform a further split |
threshold |
the threshold on the average mutual information value |
numBestForSorting |
the number of dependencies to other positions considered |
maxNum |
the maximum number of recursive splits |
sortByWeights |
if |
partition.by |
specify fixed positions to partition by |
Value
the partitions as a list of DLData objects
Author(s)
Jan Grau <grau@informatik.uni-halle.de>
Examples
# create DLData object
seqs <- read.table(system.file("extdata", "cjun.txt", package = "DepLogo"),
stringsAsFactors = FALSE)
data <- DLData(sequences = seqs[, 1], weights = log1p(seqs[, 2]))
# partition data using default parameters
partitions <- partition(data)
# partition data using a threshold of 0.3 on the mutual
# information value to the most dependent position,
# sorting the resulting partitions by weight
partitions2 <- partition(data = data, threshold = 0.3, numBestForSorting = 1, sortByWeights = TRUE)