tsclusters-methods {dtwclust} | R Documentation |
Methods for TSClusters
Description
Methods associated with TSClusters and derived objects.
Usage
## S4 method for signature 'TSClusters'
initialize(.Object, ..., override.family = TRUE)
## S4 method for signature 'TSClusters'
show(object)
## S3 method for class 'TSClusters'
update(object, ..., evaluate = TRUE)
## S4 method for signature 'TSClusters'
update(object, ..., evaluate = TRUE)
## S3 method for class 'TSClusters'
predict(object, newdata = NULL, ...)
## S4 method for signature 'TSClusters'
predict(object, newdata = NULL, ...)
## S3 method for class 'TSClusters'
plot(
  x,
  y,
  ...,
  clus = seq_len(x@k),
  labs.arg = NULL,
  series = NULL,
  time = NULL,
  plot = TRUE,
  type = NULL,
  labels = NULL
)
## S4 method for signature 'TSClusters,missing'
plot(
  x,
  y,
  ...,
  clus = seq_len(x@k),
  labs.arg = NULL,
  series = NULL,
  time = NULL,
  plot = TRUE,
  type = NULL,
  labels = NULL
)
Arguments
.Object |
A |
... |
For |
override.family |
Logical. Attempt to substitute the default family with one that conforms to the provided elements? See Initialize section. |
object , x |
An object that inherits from TSClusters as returned by tsclust(). |
evaluate |
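Logical. Defaults to TRUE, in which case the updated call is evaluated again. Otherwise, the unevaluated call is returned. |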
newdata |
New data to be assigned to a cluster. It can take any of the supported formats of tsclust(). |
y |
Ignored. |
clus |
A numeric vector indicating which clusters to plot. |
labs.arg |
A list with arguments to change the title and/or axis labels. See the examples and ggplot2::labs(). |
series |
Optionally, the data in the same format as it was provided to tsclust(). |
time |
Optional values for the time axis. If series have different lengths, provide the time values of the longest series. |
plot |
Logical flag. You can set this to FALSE if you want to obtain the gg object without plotting anything. |
type |
What to plot. |
labels |
Whether to include labels in the plot (not for dendrogram plots). See details and note that this is subject to randomness. |
Details
The update method takes the original function call, replaces any provided argument and optionally evaluates the call again. Use evaluate = FALSE if you want to get the unevaluated call. If no arguments are provided, the object is updated to a new version if necessary (this is due to changes in the internal functions of the package, here for backward compatibility).
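The two modes of update can be sketched as follows. This is only an illustration: it assumes the uciCT data bundled with the package and an object created with tsclust(), and the clustering settings (k, distance, centroid, seed) are arbitrary choices made for the sketch.

```r
library(dtwclust)
data(uciCT)

# Illustrative clustering; these settings are assumptions for the sketch
pc <- tsclust(CharTraj[1L:20L], type = "partitional", k = 4L,
              distance = "dtw_basic", centroid = "pam", seed = 123L)

# Replace k in the stored call, but do not run it
new_call <- update(pc, k = 5L, evaluate = FALSE)

# Replace k and evaluate the call again, yielding a new object
pc5 <- update(pc, k = 5L)
```

Inspecting new_call before evaluating it can be useful when the clustering is expensive and you only want to check which arguments would change.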
Value
The plot method returns a gg object (or NULL for dendrogram plots) invisibly.
Initialize
The initialize method is used when calling methods::new(). The family slot can be substituted with an appropriate one if certain elements are provided by the user. The initialize methods of derived classes also inherit the family and can use it to calculate other slots. In order to get a fully functional object, at least the following slots should be provided:
- type: "partitional", "hierarchical", "fuzzy" or "tadpole".
- datalist: The data in one of the supported formats.
- centroids: The time series centroids in one of the supported formats.
- cluster: The cluster indices for each series in the datalist.
- control*: A tsclust-controls object with the desired parameters.
- distance*: A string indicating the distance that should be used.
- centroid*: A string indicating the centroid to use (only necessary for partitional clustering).
*Necessary when overriding the default family for the calculation of other slots, CVIs or prediction. Maybe not always needed, e.g. for plotting.
Prediction
The predict generic can take the usual newdata argument. If NULL, the method simply returns the obtained cluster indices. Otherwise, a nearest-neighbor classification based on the centroids obtained from clustering is performed:
- newdata is preprocessed with object@family@preproc using the parameters in object@args$preproc.
- A cross-distance matrix between the processed series and object@centroids is computed with object@family@dist using the parameters in object@args$dist.
- For non-fuzzy clustering, the series are assigned to their nearest centroid's cluster. For fuzzy clustering, the fuzzy membership matrix for the series is calculated. In both cases, the function in object@family@cluster is used.
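A short usage sketch of both cases, assuming the bundled uciCT data. The clustering settings and the split into "training" and "new" series are arbitrary choices for illustration only.

```r
library(dtwclust)
data(uciCT)

# Illustrative clustering on the first 20 series (settings are assumptions)
pc <- tsclust(CharTraj[1L:20L], type = "partitional", k = 4L,
              distance = "dtw_basic", centroid = "pam", seed = 42L)

# Without newdata: the cluster indices obtained during clustering
fitted_idx <- predict(pc)

# With newdata: each new series is assigned to its nearest centroid's cluster
new_idx <- predict(pc, newdata = CharTraj[21L:25L])
```

For a fuzzy object the second call would return a membership matrix rather than a vector of indices, as described above.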
Plotting
The plot method uses the ggplot2 plotting system (see ggplot2::ggplot()).
The default depends on whether a hierarchical method was used or not. If it was, the dendrogram is plotted by default; you can pass any extra parameters to stats::plot.hclust() via the ellipsis (...).
Otherwise, the function plots the time series of each cluster along with the obtained centroid. The default values for cluster centroids are: linetype = "dashed", linewidth = 1.5, colour = "black", alpha = 0.5. You can change this by means of the ellipsis (...).
You can choose what to plot with the type parameter. Possible options are:
- "dendrogram": Only available for hierarchical clustering.
- "series": Plot the time series divided into clusters without including centroids.
- "centroids": Plot the obtained centroids only.
- "sc": Plot both series and centroids.
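The non-dendrogram options can be tried as in the sketch below. It assumes the bundled uciCT data and an illustrative partitional clustering whose settings are not prescribed by this page; plot = FALSE is used so that the gg objects are only built, not drawn.

```r
library(dtwclust)
data(uciCT)

# Illustrative clustering (settings are assumptions for the sketch)
pc <- tsclust(CharTraj[1L:20L], type = "partitional", k = 4L,
              distance = "dtw_basic", centroid = "pam", seed = 42L)

# Series only, centroids only, and both (restricted to two clusters)
gg_series    <- plot(pc, type = "series", plot = FALSE)
gg_centroids <- plot(pc, type = "centroids", plot = FALSE)
gg_both      <- plot(pc, type = "sc", clus = c(1L, 2L), plot = FALSE)
```

Printing any of these objects, e.g. print(gg_both), draws the corresponding figure.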
In order to enable labels on the (non-dendrogram) plot, you have to select an option that plots the series and at least provide an empty list in the labels argument. This list can contain arguments for ggrepel::geom_label_repel() and will be passed along. The following are set by the plot method if they are not provided:
- "mapping": set to aes(x = t, y = value, label = label)
- "data": a data frame with as many rows as series in the datalist and 4 columns:
  - t: x coordinate of the label for each series.
  - value: y coordinate of the label for each series.
  - cl: index of the cluster to which the series belongs (i.e. x@cluster).
  - label: the label for the given series (i.e. names(x@datalist)).
You can provide your own data frame if you want, but it must have those columns and, even if you override mapping, the cl column must have that name. The method will attempt to spread the labels across the plot, but note that this is subject to randomness, so be careful if you need reproducibility of any commands used after plotting (see examples).
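A minimal sketch of labelled plotting, assuming the bundled uciCT data (whose series are named) and an illustrative clustering. The nudge values are taken from the examples on this page; the seed values are arbitrary.

```r
library(dtwclust)
data(uciCT)

# Illustrative clustering (settings are assumptions for the sketch)
pc <- tsclust(CharTraj[1L:20L], type = "partitional", k = 4L,
              distance = "dtw_basic", centroid = "pam", seed = 42L)

# Label placement is subject to randomness, so fix the seed beforehand
set.seed(15L)
gg_labelled <- plot(pc, type = "series", clus = c(1L, 2L),
                    labels = list(nudge_x = -5, nudge_y = 0.2),
                    plot = FALSE)

# Reset the seed if later commands must be reproducible independently
set.seed(15L)
```

Passing labels = list() instead would enable the labels with the defaults set by the plot method.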
If created, the function returns the gg object invisibly, in case you want to modify it to your liking. You might want to look at ggplot2::ggplot_build() if that's the case.
If you want to free the axis scales across clusters, you can do the following (scales = "free" frees both axes; use "free_x" to free only the X axis):
plot(x, plot = FALSE) + facet_wrap(~cl, scales = "free")
For more complicated changes, you're better off looking at the source code at https://github.com/asardaes/dtwclust/blob/master/R/S4-TSClusters-methods.R and creating your own plotting function.
Examples
data(uciCT)
# Assuming this was generated by some clustering procedure
centroids <- CharTraj[seq(1L, 100L, 5L)]
cluster <- unclass(CharTrajLabels)
pc_obj <- new("PartitionalTSClusters",
              type = "partitional", datalist = CharTraj,
              centroids = centroids, cluster = cluster,
              distance = "sbd", centroid = "dba",
              control = partitional_control(),
              args = tsclust_args(cent = list(window.size = 8L, norm = "L2")))
fc_obj <- new("FuzzyTSClusters",
              type = "fuzzy", datalist = CharTraj,
              centroids = centroids, cluster = cluster,
              distance = "sbd", centroid = "fcm",
              control = fuzzy_control())
show(fc_obj)
## Not run:
plot(pc_obj, type = "c", linetype = "solid",
     labs.arg = list(title = "Clusters' centroids"))
set.seed(15L)
plot(pc_obj, labels = list(nudge_x = -5, nudge_y = 0.2),
     clus = c(1L, 4L))
## End(Not run)