calculateScore {segmenTier} | R Documentation |
segmenTier's core dynamic programming routine in Rcpp
Description
segmenTier's core dynamic programming routine in Rcpp
Usage
calculateScore(seq, C, score, csim, M, Mn, multi = "max")
Arguments
seq |
the cluster sequence (where clusters at positions k:i are considered). Note that, unlike in the R wrapper, cluster numbers here are 0-based, where 0 is the nuisance cluster. |
C |
the list of clusters, including nuisance cluster '0', see
|
score |
the scoring function to be used, one of "ccor" or "icor";
a suitable similarity matrix must be supplied via option |
csim |
a matrix providing either the cluster-cluster similarities (scoring function "ccor") or the position-cluster similarities (scoring function "icor") |
M |
minimal sequence length; note that this is not a strict cut-off but is defined as an accumulating penalty that must be "overcome" by a good score |
Mn |
minimal sequence length for the nuisance cluster; Mn<M will allow shorter distances between segments |
multi |
if multiple |
Details
This is segmenTier's core dynamic programming routine. It constructs the total score matrix S(i,c), based on the passed scoring function ("icor" or "ccor"), and length penalty M. "Nuisance" cluster "0" can have a smaller penalty Mn to allow for shorter distances between "real" segments.
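The recursion described above can be illustrated with a small R sketch that fills S(i,c) and K(i,c) for a toy cluster sequence. This is not the package's Rcpp code. The function name toy_score and the argument name clseq are hypothetical, and the sketch assumes that the penalty is subtracted once per candidate segment, that the preceding segment must belong to a different cluster, that C is 0:(n-1) with csim rows and columns in that order, and that ties are resolved in favour of the smallest k.

```r
# Toy dynamic programming over a cluster sequence (illustrative only).
# clseq: 0-based cluster label per position; 0 is the nuisance cluster.
# csim:  cluster-cluster similarities; row/column j + 1 holds cluster j.
toy_score <- function(clseq, C, csim, M, Mn) {
  N <- length(clseq)
  S <- matrix(-Inf, nrow = N, ncol = length(C))  # total scores S(i,c)
  K <- matrix(1, nrow = N, ncol = length(C))     # best start k for each (i,c)
  for (i in seq_len(N)) {
    for (ci in seq_along(C)) {
      pen <- if (C[ci] == 0) Mn else M           # nuisance has its own penalty
      for (k in seq_len(i)) {
        # similarity of the clusters at positions k..i to cluster C[ci]
        s <- sum(csim[clseq[k:i] + 1, ci]) - pen
        # best score of a different cluster ending at k - 1 (0 at the start)
        prev <- if (k == 1) 0 else max(S[k - 1, -ci])
        if (prev + s > S[i, ci]) {
          S[i, ci] <- prev + s
          K[i, ci] <- k
        }
      }
    }
  }
  list(S = S, K = K)
}

# Assumed toy similarity: +1 for identical clusters, -2 otherwise.
csim <- matrix(-2, nrow = 3, ncol = 3)
diag(csim) <- 1
res <- toy_score(c(1, 1, 1, 2, 2, 2), C = 0:2, csim = csim, M = 1, Mn = 1)
res$K[6, 3]  # start of the best cluster-2 segment ending at position 6
```

With these toy values the best cluster-2 segment ending at position 6 starts at position 4, directly after the cluster-1 run.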
Scoring function "icor" calculates the sum of similarities of data at positions k:i to cluster centers c over all k and i. The similarities are calculated, e.g., as a (Pearson) correlation between the data at individual positions and the center of the tested cluster c.
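As a minimal sketch of the "icor" idea, the following R code builds a position-cluster similarity matrix from Pearson correlations and sums it over one candidate segment. The data, the cluster centers, and the names pos_sim and seg_score_icor are made up for illustration, and no length penalty is applied here.

```r
# Toy data: 6 positions x 4 measurements (rows are positions).
dat <- rbind(
  c(1, 2, 3, 4), c(2, 3, 4, 6), c(1, 2, 4, 5),   # rising profiles
  c(4, 3, 2, 1), c(6, 4, 3, 2), c(5, 4, 2, 1)    # falling profiles
)
# Hypothetical cluster centers: one rising, one falling.
centers <- rbind(c(1, 2, 3, 4), c(4, 3, 2, 1))

# Position-cluster similarity: Pearson correlation of each position's
# data with each cluster center (rows: positions, columns: clusters).
pos_sim <- cor(t(dat), t(centers))

# "icor"-style segment score: summed similarity of positions k..i
# to the cluster in column cl.
seg_score_icor <- function(pos_sim, k, i, cl) sum(pos_sim[k:i, cl])

rising  <- seg_score_icor(pos_sim, 1, 3, 1)  # positions 1..3 vs rising center
falling <- seg_score_icor(pos_sim, 1, 3, 2)  # positions 1..3 vs falling center
```

Here positions 1 to 3 score close to 3 against the rising center and close to -3 against the falling one, which is what lets the dynamic programming prefer the matching cluster.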
Scoring function "ccor" calculates the sum of similarities between the clusters at positions k:i to cluster c over all k and i.
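A comparable sketch for "ccor" looks up the similarity of each cluster in the segment to the tested cluster and sums the results. The similarity values below are invented, including the choice of 0 for the nuisance cluster, and the names clseq and seg_score_ccor are hypothetical. No length penalty is applied.

```r
# Hypothetical cluster-cluster similarities for clusters 0 (nuisance), 1, 2;
# row/column j + 1 holds the 0-based cluster j.
csim <- matrix(c(0.0,  0.0,  0.0,
                 0.0,  1.0, -0.8,
                 0.0, -0.8,  1.0), nrow = 3, byrow = TRUE)

clseq <- c(1, 1, 0, 2, 2)  # 0-based cluster label per position

# "ccor"-style segment score: summed similarity of the clusters at
# positions k..i to the tested (0-based) cluster cl.
seg_score_ccor <- function(csim, clseq, k, i, cl) {
  sum(csim[clseq[k:i] + 1, cl + 1])
}

short <- seg_score_ccor(csim, clseq, 1, 3, 1)  # 1 + 1 + 0
long  <- seg_score_ccor(csim, clseq, 1, 5, 1)  # 1 + 1 + 0 - 0.8 - 0.8
```

Extending the segment into the cluster-2 positions lowers the score for cluster 1, so the shorter segment is favoured.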
Scoring function "ccls" is a special case of "ccor" and is NOT handled here, but is reflected in the cluster similarity matrix csim. It is handled and automatically constructed in the R wrapper segmentClusters, and merely counts the number of clusters in sequence k:i, over all k and i, that are identical to the tested cluster c, and subtracts a penalty for the count of non-identical clusters.
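To make the "ccls" case concrete, the following sketch builds a similarity matrix with +1 for identical clusters and a negative value for non-identical ones, then scores one segment. It does not reproduce how segmentClusters constructs the matrix. The function ccls_matrix, the penalty of 2, and the uniform treatment of the nuisance cluster are assumptions for illustration.

```r
# Build a "ccls"-like similarity matrix for n clusters:
# +1 for identical clusters, -penalty for non-identical ones.
ccls_matrix <- function(n, penalty) {
  m <- matrix(-penalty, nrow = n, ncol = n)
  diag(m) <- 1
  m
}

csim  <- ccls_matrix(3, penalty = 2)
clseq <- c(1, 1, 2, 1)  # 0-based cluster label per position

# Score of the whole sequence against cluster 1:
# three matches (+1 each) and one mismatch (-2).
score <- sum(csim[clseq + 1, 1 + 1])
```

The score is 3 - 2 = 1, the count of identical clusters minus the penalty for the single non-identical one.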
Value
Returns the total score matrix S(i,c) and the matrix K(i,c) which stores the position k which delivered the maximal score at position i. This is used in the back-tracing phase.
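How S and K could be used in back-tracing is sketched below. It is a simplified illustration, not the package's own back-tracing code. The function toy_backtrace and the hand-made matrices are hypothetical, and the sketch assumes that adjacent segments must differ in cluster and that column j + 1 holds the 0-based cluster j.

```r
# Hand-made toy matrices for 6 positions and clusters 0, 1, 2.
S <- rbind(c(-3,  0, -3), c(-3,  1, -3), c(-2,  2, -2),
           c(-1,  0,  2), c(-1, -1,  3), c( 0,  0,  4))
K <- rbind(c(1, 1, 1), c(2, 1, 2), c(3, 1, 3),
           c(4, 1, 4), c(5, 5, 4), c(6, 6, 4))

toy_backtrace <- function(S, K) {
  i <- nrow(S)
  prev <- 0      # column of the segment to the right (none yet)
  segs <- NULL
  while (i >= 1) {
    sc <- S[i, ]
    if (prev > 0) sc[prev] <- -Inf   # adjacent segments differ in cluster
    col <- which.max(sc)             # best-scoring cluster ending at i
    k <- K[i, col]                   # where that segment starts
    segs <- rbind(c(cluster = col - 1, start = k, end = i), segs)
    prev <- col
    i <- k - 1                       # continue left of the segment
  }
  segs
}

segs <- toy_backtrace(S, K)
```

For these toy matrices the result has two rows: cluster 1 covering positions 1 to 3, and cluster 2 covering positions 4 to 6.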
References
Machne, Murray & Stadler (2017) <doi:10.1038/s41598-017-12401-8>