| data_cls {etree} | R Documentation |
Classification toy dataset
Description
A simple dataset containing simulated values for a nominal response variable and four covariates of mixed and partially structured type. The data generation process is based on Example 4.7 ("Signal shape classification", pages 73-77) from Saito (1994).
Usage
data_cls
Format
List with two elements: covs, which is a list containing the
covariates, and resp, which is a factor of length 150 representing
the response variable. The response variable is divided into three classes
whose labels are cylinder (Cyl), bell (Bel) and funnel
(Fun). The four covariates in covs all have length 150 and
are characterized as follows:
Nominal:
  Cyl observations are given level 1 with probability 0.8 and levels 2 and 3 with probability 0.1 each;
  Bel observations are given level 2 with probability 0.8 and levels 1 and 3 with probability 0.1 each;
  Fun observations are given level 3 with probability 0.8 and levels 1 and 2 with probability 0.1 each.
Numeric: coefficients for one of the basis functions used in the B-spline expansion of the curves, which are in turn specified as in Saito (1994).
Functional: curves as specified in Saito (1994).
Graphs: Erdős-Rényi graphs with connection probability 0.10 for Cyl observations, 0.125 for Bel observations and 0.15 for Fun observations.
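The class-dependent probabilities listed above can be illustrated with a short base R sketch. It is not the code used to build data_cls. The equal class sizes (50 per class), the seed, the number of nodes per graph (20) and the helper sample_er are assumptions made only for this illustration, since the documentation does not specify them. The numeric and functional covariates are omitted because their construction is detailed only in Saito (1994).

```r
set.seed(1)  # arbitrary seed, only so that the sketch is reproducible

# Response: 150 observations over three classes.
# ASSUMPTION: equal class sizes (50 each); the documentation does not state them.
classes <- c("Cyl", "Bel", "Fun")
resp <- factor(rep(classes, each = 50), levels = classes)

# Nominal covariate: the level matching the class has probability 0.8,
# the other two levels have probability 0.1 each.
level_probs <- rbind(Cyl = c(0.8, 0.1, 0.1),
                     Bel = c(0.1, 0.8, 0.1),
                     Fun = c(0.1, 0.1, 0.8))
nominal <- factor(
  vapply(as.character(resp),
         function(cl) sample(1:3, size = 1, prob = level_probs[cl, ]),
         integer(1), USE.NAMES = FALSE),
  levels = 1:3
)

# Erdos-Renyi graph stored as a symmetric 0/1 adjacency matrix.
# sample_er is a hypothetical helper written for this sketch.
sample_er <- function(n_nodes, p) {
  adj <- matrix(0L, n_nodes, n_nodes)
  upper <- upper.tri(adj)
  adj[upper] <- rbinom(sum(upper), size = 1, prob = p)  # one draw per node pair
  adj + t(adj)  # mirror the upper triangle to make the graph undirected
}

# Connection probabilities by class, as given in the list above.
conn_probs <- c(Cyl = 0.10, Bel = 0.125, Fun = 0.15)

# ASSUMPTION: 20 nodes per graph; the documentation does not give the graph size.
graphs <- lapply(as.character(resp),
                 function(cl) sample_er(20, conn_probs[[cl]]))

# Cross-tabulate class against nominal level to see the dominant diagonal.
table(resp, nominal)
```

With these probabilities, most of the counts in the cross-tabulation should fall on the diagonal, and the Fun graphs should on average be denser than the Cyl graphs.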
References
Saito, N. (1994). Local feature extraction and its applications using a library of bases (Doctoral dissertation, Yale University).
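Examples
A minimal sketch for inspecting the dataset, assuming the etree package is installed. The requireNamespace guard and the explicit data() call are choices made for this illustration. Only the element names covs and resp are taken from the Format section.

```r
# Run only if the etree package is available on this system.
if (requireNamespace("etree", quietly = TRUE)) {
  # Load the dataset into the current environment.
  data("data_cls", package = "etree", envir = environment())

  # Top-level structure: a list with elements covs and resp.
  str(data_cls, max.level = 1)

  # Class counts of the response factor.
  print(table(data_cls$resp))

  # Number of covariates stored in covs.
  print(length(data_cls$covs))
}
```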