clusterTimeseries {segmenTier} | R Documentation |
Cluster a processed time-series with k-means.
Description
Performs kmeans clustering of a time-series object tset provided by
processTimeseries, and calculates cluster-cluster and cluster-position
similarity matrices as required for segmentClusters.
Usage
clusterTimeseries(tset, K = 16, iter.max = 1e+05, nstart = 100,
nui.thresh = -Inf, verb = 1)
Arguments
tset |
a "timeseries" object returned by processTimeseries |
K |
the number of clusters to be calculated, i.e., the argument centers in kmeans |
iter.max |
the maximum number of iterations allowed in kmeans |
nstart |
number of randomized initializations of kmeans |
nui.thresh |
threshold correlation of a data point to a cluster center; data points whose correlation is below this threshold are added to nuisance cluster 0 |
verb |
level of verbosity, 0: no output, 1: progress messages |
Details
This function performs one or more time-series clusterings using
kmeans, with the output of processTimeseries as input. It further
calculates cluster centers and the cluster-cluster and
cluster-position similarity matrices (Pearson correlation). These are
used by the main function of this package, segmentClusters, to split
the cluster association sequence into segments and to assign each
segment to the "winning" input cluster.
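The two similarity matrices described above can be illustrated with a short base R sketch. It clusters a small synthetic data set and computes Pearson correlations between the cluster centers and between each position and each center. The data and the names dat, ccc and pci are invented for this illustration; this is not the package's internal code.

```r
set.seed(1)

# Hypothetical data: three temporal patterns, 10 noisy copies of each
patterns <- rbind(c(1, 2, 3, 4, 5, 6),
                  c(6, 5, 4, 3, 2, 1),
                  c(1, 3, 6, 6, 3, 1))
dat <- patterns[rep(1:3, each = 10), ] +
  matrix(rnorm(30 * 6, sd = 0.1), nrow = 30)

# k-means clustering of the rows (positions)
fit <- kmeans(dat, centers = 3, iter.max = 100, nstart = 10)

# cluster-cluster similarity: Pearson correlation between cluster centers
ccc <- cor(t(fit$centers))

# cluster-position similarity: correlation of each row to each center
pci <- cor(t(dat), t(fit$centers))
```

Here ccc is a 3 x 3 matrix and pci has one row per position and one column per cluster.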
The argument K is an integer vector that sets the requested cluster
numbers (argument centers in kmeans). However, to avoid errors in
batch use, a smaller K is chosen if the data contain fewer than K
distinct values.
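The reason for reducing K can be seen with plain kmeans, which stops with an error when asked for more centers than there are distinct data points. The sketch below caps the requested number at the count of distinct rows. The cap min(K, n_distinct) is an assumption made for this example, and the package may choose a different reduced value.

```r
set.seed(1)

# Hypothetical data: only 4 distinct rows, each repeated 3 times
distinct <- rbind(c(0, 0), c(0, 5), c(5, 0), c(5, 5))
dat <- distinct[rep(1:4, times = 3), ]

K <- 16                           # requested number of clusters
n_distinct <- nrow(unique(dat))   # number of distinct rows in the data

# assumed rule for this sketch: never request more clusters than
# there are distinct rows
K_used <- min(K, n_distinct)

fit <- kmeans(dat, centers = K_used, nstart = 5)
```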
Nuisance Cluster:
values that were removed during time-series processing, such as
rows that contain only 0 or NA values, are assigned to the
"nuisance cluster" with cluster label "0". Additionally, a minimal
correlation to any cluster center can be specified with the argument
nui.thresh; positions whose correlation to every cluster center is
below this threshold are also assigned to the "nuisance" cluster.
Resulting "nuisance segments" will not be shown in the results.
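The thresholding idea can be sketched in base R as follows. The cluster centers, the data rows and the threshold value 0.6 are invented for illustration, and nui_thresh only mimics the role of the nui.thresh argument; the package's own implementation may differ in detail.

```r
# Hypothetical cluster centers: one rising, one falling pattern
centers <- rbind(c(1, 2, 3, 4, 5, 6),
                 c(6, 5, 4, 3, 2, 1))

# Hypothetical data rows (positions)
dat <- rbind(c(1.1, 2.0, 2.9, 4.2, 5.0, 6.1),   # close to center 1
             c(5.8, 5.1, 4.0, 3.1, 1.9, 1.2),   # close to center 2
             c(3, 1, 3, 1, 3, 1))               # zig-zag, fits neither

# Pearson correlation of each row to each cluster center
pci <- cor(t(dat), t(centers))

# label each row with its best-correlating cluster ...
labels <- apply(pci, 1, which.max)

# ... and move rows without any sufficiently high correlation
# to the nuisance cluster 0
nui_thresh <- 0.6
labels[apply(pci, 1, max) < nui_thresh] <- 0L
```

With the default nui.thresh = -Inf no row falls below the threshold, so no position is moved to the nuisance cluster by this step.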
Cluster Sorting and Coloring:
additionally, the cluster labels in the result object are sorted by
cluster-cluster similarity (see sortClusters) and cluster colors are
assigned (see colorClusters) for convenient data inspection with the
plot methods available for each data processing step (see examples).
Note that the function, in conjunction with processTimeseries, can
also be used as a stand-alone tool for time-series clustering. It
specifically implements the strategy of clustering the Discrete
Fourier Transform of periodic time-series, such as transcriptome data
from circadian or yeast respiratory oscillation systems. This strategy
was developed by Machne & Murray (2012)
<doi:10.1371/journal.pone.0037906> and further analyzed in Lehmann
et al. (2013) <doi:10.1186/1471-2105-14-133>.
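The general idea of clustering the Discrete Fourier Transform can be sketched with base R alone, using stats::fft and kmeans directly instead of processTimeseries and clusterTimeseries. The synthetic data, the selected DFT components (columns 2 to 4) and all variable names are assumptions of this example and do not reproduce the processing steps of the package or of the cited papers.

```r
set.seed(7)

# Hypothetical data: 20 noisy oscillations with period 12 over 24
# time points, in two groups with opposite phase
tp <- 0:23
phase <- rep(c(0, pi), each = 10)
dat <- t(sapply(phase, function(p)
  sin(2 * pi * tp / 12 + p) + rnorm(length(tp), sd = 0.1)))

# DFT of each time-series; rows: series, columns: DFT components
dft <- t(apply(dat, 1, fft))

# drop the DC component (column 1) and keep a few low-frequency
# components; the range 2:4 is an arbitrary choice for this sketch
comp <- dft[, 2:4]

# cluster on real and imaginary parts of the selected components
feat <- cbind(Re(comp), Im(comp))
fit <- kmeans(feat, centers = 2, nstart = 10)
```

Because the two groups differ only in phase, they separate in the real and imaginary parts of the component that corresponds to the oscillation period.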
Value
Returns a list of class "clustering" comprising a matrix of
clusterings, lists of cluster centers, and the cluster-cluster and
cluster-position similarity matrices (Pearson correlation) used by
segmentClusters. It also contains additional information, such as a
cluster sorting by similarity and cluster colors, that allows
tracking clusters in plots. A plot method exists that plots clusters
aligned to "timeseries" and "segment" plots.
References
Machne & Murray (2012) <doi:10.1371/journal.pone.0037906>, and Lehmann et al. (2013) <doi:10.1186/1471-2105-14-133>
Examples
data(primseg436)
## Discrete Fourier Transform of the time-series,
## see ?processTimeseries for details
tset <- processTimeseries(ts=tsd, na2zero=TRUE, use.fft=TRUE,
                          dft.range=1:7, dc.trafo="ash", use.snr=TRUE)
## ... and cluster the transformed time-series
cset <- clusterTimeseries(tset)
## plot methods for both returned objects allow aligned plots
par(mfcol=c(3,1))
plot(tset)
plot(cset)