R: Computes several subfeatures associated with a categorical...

calculate_subfeatures {ctsfeatures}

R Documentation

Computes several subfeatures associated with a categorical time series

Description

calculate_subfeatures computes several subfeatures associated with a categorical time series, or between a categorical and a real-valued time series.

Usage

calculate_subfeatures(series, n_series, lag = 1, type = NULL)

Arguments

`series`	An object of type `tsibble` (see R package `tsibble`), whose column named Value contains the values of the corresponding CTS. This column must be of class `factor` and its levels must be determined by the range of the CTS.
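`n_series`	A real-valued time series of the same length as the CTS. It enters only the mixed subfeatures described in Details (total_mixed_correlation_1 and total_mixed_correlation_2).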
`lag`	The considered lag (default is 1).
`type`	String indicating the subfeature one wishes to compute.
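
As a minimal sketch of the input format described for `series`, the snippet below builds a toy CTS whose Value column is a factor with levels fixed by the range of the series. The data, the range, and the index column name `Time` are hypothetical and chosen only for illustration; the tsibble is built only if the `tsibble` package is installed.

```r
# Hypothetical toy CTS with range {a, c, g, t}; values are illustrative only.
values <- c("a", "c", "g", "t", "a", "a", "g", "c", "t", "g")
cts_range <- c("a", "c", "g", "t")

# Value must be a factor whose levels are the full range of the CTS,
# even if some category never appears in the observed series.
value <- factor(values, levels = cts_range)

# Build the tsibble only if the package is available.
if (requireNamespace("tsibble", quietly = TRUE)) {
  toy_series <- tsibble::tsibble(
    Time = seq_along(value),  # 'Time' is an assumed name for the index column
    Value = value,
    index = Time
  )
}
```

Setting the levels explicitly, rather than letting `factor()` infer them from the observed values, keeps the number of categories r fixed across series with the same range.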

Details

Assume we have a CTS of length T with range \mathcal{V}=\{1, 2, \ldots, r\}, \overline{X}_t=\{\overline{X}_1,\ldots, \overline{X}_T\}, with \widehat{p}_i being the natural estimate of the marginal probability of the ith category, and \widehat{p}_{ij}(l) being the natural estimate of the joint probability for categories i and j at lag l, i,j=1, \ldots, r. Assume also that we have a real-valued time series of length T, \overline{Z}_t=\{\overline{Z}_1,\ldots, \overline{Z}_T\}. The function computes the following subfeatures depending on the argument type:

If type=entropy, the function computes the subfeatures associated with the estimated entropy, \widehat{p}_i\ln(\widehat{p}_i), i=1,2, \ldots,r.
If type=gk_tau, the function computes the subfeatures associated with the estimated Goodman and Kruskal's tau, \frac{\widehat{p}_{ij}(l)^2}{\widehat{p}_j}, i,j=1,2, \ldots,r.
If type=gk_lambda, the function computes the subfeatures associated with the estimated Goodman and Kruskal's lambda, \max_i\widehat{p}_{ij}(l), j=1,2, \ldots,r.
If type=uncertainty_coefficient, the function computes the subfeatures associated with the estimated uncertainty coefficient, \widehat{p}_{ij}(l)\ln\Big(\frac{\widehat{p}_{ij}(l)}{\widehat{p}_i\widehat{p}_j}\Big), i,j=1,2, \ldots,r.
If type=pearson_measure, the function computes the subfeatures associated with the estimated Pearson measure, \frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}, i,j=1,2, \ldots,r.
If type=phi2_measure, the function computes the subfeatures associated with the estimated Phi2 measure, \frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}, i,j=1,2, \ldots,r.
If type=sakoda_measure, the function computes the subfeatures associated with the estimated Sakoda measure, \frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}, i,j=1,2, \ldots,r.
If type=cramers_vi, the function computes the subfeatures associated with the estimated Cramer's vi, \frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}, i,j=1,2, \ldots,r.
If type=cohens_kappa, the function computes the subfeatures associated with the estimated Cohen's kappa, \widehat{p}_{ii}(l)-\widehat{p}_i^2, i=1,2, \ldots,r.
If type=total_correlation, the function computes the subfeatures associated with the total correlation, \widehat{\psi}_{ij}(l), i,j=1,2, \ldots,r (see type='total_correlation' in the function calculate_features).
If type=total_mixed_correlation_1, the function computes the subfeatures associated with the total mixed l-correlation, \widehat{\psi}_{i}(l), i=1,2, \ldots,r (see type='total_mixed_correlation_1' in the function calculate_features).
If type=total_mixed_correlation_2, the function computes the subfeatures associated with the total mixed q-correlation, \int_{0}^{1}\widehat{\psi}^\rho_{i}(l)^2d\rho, i=1,2, \ldots,r (see type='total_mixed_correlation_2' in the function calculate_features).
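
The estimates \widehat{p}_i and \widehat{p}_{ij}(l) behind several of the subfeatures above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the toy series is made up, the joint estimate is normalised by T - l, and the first index i is taken to refer to the current value X_t; the package may use a different convention.

```r
# Toy CTS coded as a factor; the values are illustrative, not package data.
x <- factor(c("a", "b", "a", "c", "b", "a", "a", "c", "b", "b"),
            levels = c("a", "b", "c"))
l <- 1            # lag
n <- length(x)    # series length T
r <- nlevels(x)   # number of categories

# Marginal estimates p_i: relative frequency of each category.
p <- as.numeric(table(x)) / n

# Joint estimates p_ij(l): relative frequency of pairs (X_t = i, X_{t-l} = j).
# Assumption: normalised by n - l, rows index the current value.
p_joint <- unclass(table(x[(l + 1):n], x[1:(n - l)])) / (n - l)

# Entropy subfeatures: p_i * ln(p_i), taking 0 * ln(0) as 0.
entropy_sub <- ifelse(p > 0, p * log(p), 0)

# Pearson-type subfeatures: (p_ij(l) - p_i p_j)^2 / (p_i p_j).
p_indep <- outer(p, p)
pearson_sub <- (p_joint - p_indep)^2 / p_indep

# Cohen's kappa subfeatures: p_ii(l) - p_i^2.
kappa_sub <- diag(p_joint) - p^2
```

Here `entropy_sub` and `kappa_sub` have one entry per category, while `pearson_sub` is an r by r matrix with one entry per pair (i, j), matching the index ranges given in the list above.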

Value

The subfeatures corresponding to the selected type.

Author(s)

Ángel López-Oriona, José A. Vilar

References

Weiß CH, Göb R (2008). “Measuring serial dependence in categorical time series.” AStA Advances in Statistical Analysis, 92, 71–89.

Examples

sequence_1 <- GeneticSequences[which(GeneticSequences$Series==1),]
suc <- calculate_subfeatures(series = sequence_1, type = 'uncertainty_coefficient')
# Computing the subfeatures associated with the uncertainty coefficient
# for the first series in dataset GeneticSequences
scv <- calculate_subfeatures(series = sequence_1, type = 'cramers_vi' )
# Computing the subfeatures associated with Cramer's vi
# for the first series in dataset GeneticSequences

[Package ctsfeatures version 1.2.2 Index]