SpokenArabicDigits {mlmts} R Documentation

SpokenArabicDigits

Description

Multivariate time series (MTS) derived from sound recordings of 44 male and 44 female native Arabic speakers between the ages of 18 and 40. For each recording, 13 Mel Frequency Cepstral Coefficients (MFCCs) were computed.

Usage

data(SpokenArabicDigits)

Format

A list with two elements, which are:

data

A list with 8798 MTS.

classes

A numeric vector giving the class associated with each element in data.

Details

Each element in data is a matrix with 93 rows (time points) and 13 columns (variables), one column per MFCC. The first 6599 elements form the training set, and the last 2199 elements form the test set. The numeric vector classes contains integers from 1 to 10, one for each of the 10 classes in the database. Each class is associated with a different spoken Arabic digit. For more information, see Bagnall et al. (2018). To access this dataset, run install.packages("ueadata2", repos = "https://anloor7.github.io/drat") and then use the syntax ueadata2::SpokenArabicDigits.
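The positional train/test layout described above can be turned into two separate lists with a few lines of R. The following is a minimal sketch, not part of mlmts: split_train_test is a hypothetical helper, and the toy object is a small synthetic stand-in that only mimics the documented structure (a list with elements data and classes), so that the code runs without downloading anything.

```r
# Hypothetical helper (not provided by mlmts or ueadata2): split a list with
# elements `data` and `classes` into training and test parts by position.
split_train_test <- function(dataset, n_train) {
  n <- length(dataset$data)
  stopifnot(length(dataset$classes) == n, n_train >= 1, n_train < n)
  train_idx <- seq_len(n_train)
  test_idx <- seq(n_train + 1, n)
  list(
    train = list(data = dataset$data[train_idx],
                 classes = dataset$classes[train_idx]),
    test = list(data = dataset$data[test_idx],
                classes = dataset$classes[test_idx])
  )
}

# Synthetic stand-in with the documented layout: each element is a
# 93 x 13 matrix (time points x MFCCs). The values are random noise.
set.seed(1)
toy <- list(
  data = replicate(8, matrix(rnorm(93 * 13), nrow = 93, ncol = 13),
                   simplify = FALSE),
  classes = rep(1:2, each = 4)
)
toy_split <- split_train_test(toy, n_train = 6)

# With the real dataset (assumes the ueadata2 package is installed):
# full <- ueadata2::SpokenArabicDigits
# parts <- split_train_test(full, n_train = 6599)
# table(parts$train$classes)
```

Splitting by position relies on the ordering stated in this section (training elements first, test elements last), so the same helper would give a wrong split on a reshuffled copy of the data.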

References

Bagnall A, Dau HA, Lines J, Flynn M, Large J, Bostrom A, Southam P, Keogh E (2018). “The UEA multivariate time series classification archive, 2018.” arXiv preprint arXiv:1811.00075.

Ruiz AP, Flynn M, Large J, Middlehurst M, Bagnall A (2021). “The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances.” Data Mining and Knowledge Discovery, 35(2), 401–449.

Bagnall A, Lines J, Vickers W, Keogh E (2022). “The UEA & UCR Time Series Classification Repository.” www.timeseriesclassification.com.


[Package mlmts version 1.1.1 Index]