| clusterTimeseries {segmenTier} | R Documentation |
Cluster a processed time-series with k-means.
Description
Performs kmeans clustering of a
time-series object tset provided by
processTimeseries, and calculates cluster-cluster
and cluster-position similarity matrices as required for
segmentClusters.
Usage
clusterTimeseries(tset, K = 16, iter.max = 1e+05, nstart = 100,
nui.thresh = -Inf, verb = 1)
Arguments
tset |
a "timeseries" object returned by processTimeseries
|
K |
the number of clusters to be calculated, i.e., the argument centers of kmeans; an integer vector requests several clusterings
|
iter.max |
the maximum number of iterations allowed in kmeans
|
nstart |
number of randomized initializations of kmeans
|
nui.thresh |
threshold correlation of a data point to a cluster center; data points below this threshold are assigned to the nuisance cluster 0 |
verb |
level of verbosity, 0: no output, 1: progress messages |
Details
This function performs one or more time-series clusterings
with kmeans, using the output of
processTimeseries as input. It further calculates
cluster centers, and cluster-cluster and cluster-position similarity
matrices (Pearson correlation). These are used by the main function
of this package, segmentClusters, which splits the cluster
association sequence into segments and assigns each segment to
the "winning" input cluster.
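The two similarity matrices described above can be illustrated with base R alone. The following is a minimal sketch on random toy data, not the package's internal code; the names dat, cc_sim and cp_sim are chosen here for illustration only.

```r
set.seed(1)
## toy data: 60 positions (rows) x 8 time points (columns)
dat <- matrix(rnorm(60 * 8), nrow = 60, ncol = 8)

## plain k-means clustering with several random starts
km <- kmeans(dat, centers = 4, iter.max = 100, nstart = 10)

## cluster-cluster similarity: Pearson correlation between
## cluster centers, a K x K matrix
cc_sim <- cor(t(km$centers))

## cluster-position similarity: Pearson correlation of each
## position to each cluster center, an N x K matrix
cp_sim <- cor(t(dat), t(km$centers))

dim(cc_sim)
dim(cp_sim)
```

Both calls transpose the input because cor correlates the columns of its arguments, while positions and cluster centers are stored as rows.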
The argument K is an integer vector that sets the requested
cluster numbers (argument centers in
kmeans). However, to avoid errors in batch
use, a smaller K is chosen if the data contains fewer than
K distinct values.
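The reason for this fallback is that kmeans stops with an error when more centers are requested than there are distinct data points. A minimal sketch of such a guard in base R follows; it is an assumption about how the reduction could be done, not the package's actual code, and the names proto, n_distinct and K_used are hypothetical.

```r
set.seed(1)
## toy data with only 3 distinct rows, each repeated 5 times
proto <- matrix(rnorm(3 * 6), nrow = 3)
dat <- proto[rep(1:3, each = 5), ]

K <- 16  # requested cluster number

## number of distinct data rows
n_distinct <- nrow(unique(dat))

## fall back to a smaller K if the data cannot support the request
K_used <- min(K, n_distinct)

km <- kmeans(dat, centers = K_used, nstart = 5)
K_used
```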
Nuisance Cluster:
Values that were removed during time-series processing, such as
rows that only contain 0 or NA values, will be assigned to
the "nuisance cluster" with cluster label "0". Additionally, a minimal
correlation to any cluster center can be specified with the argument
nui.thresh; positions without any correlation higher
than this threshold will also be assigned to the "nuisance" cluster.
Resulting "nuisance segments" will not be shown in the results.
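The nuisance assignment can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation: the threshold value 0.6 and the names keep, cl, best and nui_thresh are hypothetical, and all-zero rows stand in for values removed during processing.

```r
set.seed(1)
dat <- matrix(rnorm(40 * 8), nrow = 40)
dat[1:3, ] <- 0  # rows without signal

## cluster only the rows that carry signal
keep <- rowSums(abs(dat)) > 0
idx <- which(keep)
km <- kmeans(dat[idx, ], centers = 3, nstart = 10)

## start with every position in the nuisance cluster 0,
## then fill in the cluster labels of the clustered rows
cl <- rep(0L, nrow(dat))
cl[idx] <- km$cluster

## highest correlation of each clustered position to any center
cp_sim <- cor(t(dat[idx, ]), t(km$centers))
best <- apply(cp_sim, 1, max)

## positions below the threshold also go to the nuisance cluster
nui_thresh <- 0.6  # hypothetical value
cl[idx[best < nui_thresh]] <- 0L

table(cl)
```

With the default nui.thresh = -Inf no correlation can fall below the threshold, so only the removed rows end up in cluster 0.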
Cluster Sorting and Coloring:
Additionally, the cluster labels in the result object will be sorted by
cluster-cluster similarity (see sortClusters) and cluster
colors will be assigned (see colorClusters) for convenient data
inspection with the plot methods available for each data processing
step (see examples).
Note that the function, in conjunction with
processTimeseries, can also be used as a stand-alone
tool for time-series clustering, for example of transcriptome
data from circadian or yeast respiratory oscillation systems.
It specifically implements the strategy of clustering the Discrete
Fourier Transform of periodic time-series developed by
Machne & Murray (2012) <doi:10.1371/journal.pone.0037906>, and
further analyzed in Lehmann et al. (2013) <doi:10.1186/1471-2105-14-133>.
Value
Returns a list of class "clustering" comprising a matrix
of clusterings, lists of cluster centers, cluster-cluster and
cluster-position similarity matrices (Pearson correlation) used
by segmentClusters, and additional information
such as a cluster sorting by similarity and cluster colors that
allow tracking clusters in plots. A plot method exists that
plots clusters aligned to "timeseries" and "segment"
plots.
References
Machne & Murray (2012) <doi:10.1371/journal.pone.0037906>, and Lehmann et al. (2013) <doi:10.1186/1471-2105-14-133>
Examples
data(primseg436)
## Discrete Fourier Transform of the time-series,
## see ?processTimeseries for details
tset <- processTimeseries(ts=tsd, na2zero=TRUE, use.fft=TRUE,
                          dft.range=1:7, dc.trafo="ash", use.snr=TRUE)
## ... and cluster the transformed time-series
cset <- clusterTimeseries(tset)
## plot methods for both returned objects allow aligned plots
par(mfcol=c(3,1))
plot(tset)
plot(cset)