data_cls {etree}    R Documentation
Classification toy dataset
Description
A simple dataset containing simulated values for a nominal response variable and four covariates of mixed and partially structured types. The data-generating process is based on Example 4.7 ("Signal shape classification", pages 73-77) of Saito (1994).
Usage
data_cls
Format
List with two elements: covs, which is a list containing the covariates, and resp, which is a factor of length 150 representing the response variable. The response variable is divided into three classes whose labels are cylinder (Cyl), bell (Bel) and funnel (Fun). The four covariates in covs all have length 150 and are characterized as follows:
Nominal: Cyl observations are given level 1 with probability 0.8 and levels 2 and 3 with probability 0.1 each, Bel observations are given level 2 with probability 0.8 and levels 1 and 3 with probability 0.1 each, Fun observations are given level 3 with probability 0.8 and levels 1 and 2 with probability 0.1 each;
Numeric: coefficients for one of the basis functions used in the B-spline expansion of the curves, which are in turn specified as in Saito (1994);
Functional: curves as specified in Saito (1994);
Graphs: Erdős-Rényi graphs with connection probability 0.10 for Cyl observations, 0.125 for Bel observations, and 0.15 for Fun observations.
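The generation rules for the nominal and graph covariates can be sketched in base R as follows. This is an illustrative sketch only, not the code used to build data_cls: the balanced class sizes (50 per class), the number of nodes per graph, the seed, and the helper sim_er_adjacency are all assumptions introduced here, since none of them is stated above. The functional and numeric covariates are omitted because their definition is given only by reference to Saito (1994).

```r
set.seed(1)  # arbitrary seed, used only to make this sketch reproducible

# Assumption: 50 observations per class (only the total of 150 is documented).
classes <- c("Cyl", "Bel", "Fun")
resp <- factor(rep(classes, each = 50), levels = classes)

# Nominal covariate: the level matching the class has probability 0.8,
# the two remaining levels have probability 0.1 each.
level_probs <- matrix(0.1, nrow = 3, ncol = 3,
                      dimnames = list(classes, as.character(1:3)))
diag(level_probs) <- 0.8
nominal <- factor(
  vapply(as.character(resp),
         function(cl) sample(1:3, size = 1, prob = level_probs[cl, ]),
         integer(1), USE.NAMES = FALSE),
  levels = 1:3
)

# Hypothetical helper: adjacency matrix of an undirected Erdos-Renyi graph
# with n_nodes nodes, edge probability p, and no self-loops.
sim_er_adjacency <- function(n_nodes, p) {
  adj <- matrix(0L, n_nodes, n_nodes)
  upper <- upper.tri(adj)
  adj[upper] <- rbinom(sum(upper), size = 1, prob = p)
  adj + t(adj)  # mirror the upper triangle to make the matrix symmetric
}

# Connection probabilities by class, as listed above.
conn_prob <- c(Cyl = 0.10, Bel = 0.125, Fun = 0.15)
n_nodes <- 20  # assumption: the number of nodes is not documented
graphs <- lapply(as.character(resp),
                 function(cl) sim_er_adjacency(n_nodes, conn_prob[[cl]]))

# Cross-tabulate class against simulated nominal level.
print(table(resp, nominal))
```

With these settings the cross-tabulation should show most Cyl observations at level 1, most Bel at level 2, and most Fun at level 3, and the average edge density of the simulated graphs should increase from Cyl to Fun.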
References
Saito, N. (1994). Local feature extraction and its applications using a library of bases (Doctoral dissertation, Yale University).
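Examples
A minimal sketch of how the dataset might be inspected, assuming the etree package is installed. Only the element names covs and resp are taken from the Format section; the names of the individual covariates inside covs are not stated there, so they are printed rather than assumed.

```r
# Assumes the etree package is installed; the block is skipped otherwise.
if (requireNamespace("etree", quietly = TRUE)) {
  data("data_cls", package = "etree")

  # Response: a factor of length 150 with classes Cyl, Bel and Fun
  print(length(data_cls$resp))
  print(table(data_cls$resp))

  # Covariates: a list of four elements; inspect names and top-level structure
  print(names(data_cls$covs))
  str(data_cls$covs, max.level = 1)
}
```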