binarization {ctsfeatures}    R Documentation

Constructs the binarized time series associated with a given categorical time series

Description

binarization constructs the binarized time series associated with a given categorical time series (CTS).

Usage

binarization(series)

Arguments

series

An object of type tsibble (see R package tsibble), whose column named Value contains the values of the corresponding CTS. This column must be of class factor and its levels must be determined by the range of the CTS.

Details

Given a CTS of length T with range \mathcal{V}=\{1, 2, \ldots, r\}, \overline{X}_t=\{\overline{X}_1,\ldots, \overline{X}_T\}, the function constructs the binarized time series \overline{\boldsymbol Y}_t=\{\overline{\boldsymbol Y}_1, \ldots, \overline{\boldsymbol Y}_T\}, where \overline{\boldsymbol Y}_k=(\overline{Y}_{k,1}, \ldots, \overline{Y}_{k,r})^\top, with \overline{Y}_{k,i}=1 if \overline{X}_k=i and \overline{Y}_{k,i}=0 otherwise (k=1,\ldots,T, \; i=1,\ldots,r). The binarized series is constructed in the form of a matrix whose rows represent time observations and whose columns represent the categories in the original series.
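
The definition above amounts to a one-hot (indicator) encoding of the factor. The following base R sketch illustrates that encoding on a toy factor. The helper binarize_sketch is hypothetical and is not part of ctsfeatures; the package's own implementation may differ, for example in how it names columns or handles missing values.

```r
# Hypothetical helper (not part of ctsfeatures): builds the T x r indicator
# matrix of a factor, following the definition of the binarized series.
binarize_sketch <- function(x) {
  stopifnot(is.factor(x))
  lev <- levels(x)
  # outer() compares every observation with every level; adding 0L turns
  # the resulting logical matrix into a 0/1 integer matrix.
  y <- outer(as.character(x), lev, "==") + 0L
  colnames(y) <- lev
  y
}

# Toy series of length T = 4 with r = 3 categories (illustrative data only)
x <- factor(c("a", "c", "b", "a"), levels = c("a", "b", "c"))
y <- binarize_sketch(x)

dim(y)      # one row per time point, one column per category
rowSums(y)  # each row contains exactly one 1
```

Because every observation belongs to exactly one category, each row of the matrix sums to 1, which is a quick sanity check on any binarized series.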

Value

The binarized time series, in the form of a matrix with one row per time observation and one column per category.

Author(s)

Ángel López-Oriona, José A. Vilar

References

López-Oriona Á, Vilar JA, D’Urso P (2023). “Hard and soft clustering of categorical time series based on two novel distances with an application to biological sequences.” Information Sciences, 624, 467–492.

Examples

# Construct the binarized time series for the first CTS
# in the dataset GeneticSequences
sequence_1 <- GeneticSequences[which(GeneticSequences$Series == 1), ]
binarized_series <- binarization(sequence_1)

[Package ctsfeatures version 1.2.2 Index]