genModelFD {TFunHDDC}

R Documentation

genModelFD

Description

Generate functional data whose coefficients follow a finite mixture of contaminated normal distributions, such that the kth cluster has the multivariate contaminated normal density

f(\gamma_i;\theta_k)=\alpha_k\phi(\gamma_i;\mu_k,\Sigma_k)+(1-\alpha_k)\phi(\gamma_i;\mu_k,\eta_k\Sigma_k)

where \alpha_k\in (0.5,1) represents the proportion of uncontaminated data, \eta_k>1 is the inflation coefficient due to outliers, and \phi(\gamma_i;\mu_k,\Sigma_k) is the density for the multivariate normal distribution N(\mu_k,\Sigma_k).
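
As an illustration of the density above, the following is a minimal sketch of drawing coefficient vectors from a single contaminated normal component. The helper `rcontam_normal` and its arguments are hypothetical and are not part of TFunHDDC; the sketch only assumes that a point is uncontaminated with probability \alpha_k and otherwise drawn with covariance \eta_k\Sigma_k.

```r
# Hypothetical helper (not part of TFunHDDC): sample n vectors from one
# contaminated normal component with parameters mu, Sigma, alpha, eta.
rcontam_normal <- function(n, mu, Sigma, alpha, eta) {
  p <- length(mu)
  # TRUE marks an uncontaminated point (probability alpha), FALSE an outlier
  good <- runif(n) < alpha
  # Outliers have covariance eta * Sigma, i.e. deviations scaled by sqrt(eta)
  scale <- ifelse(good, 1, sqrt(eta))
  # Upper-triangular factor R with t(R) %*% R = Sigma
  R <- chol(Sigma)
  z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  # Row i of (z %*% R) is multiplied by scale[i], then mu is added column-wise
  x <- sweep((z %*% R) * scale, 2, mu, "+")
  list(x = x, good = good)
}

set.seed(1)
draw <- rcontam_normal(n = 500, mu = c(0, 0), Sigma = diag(2),
                       alpha = 0.9, eta = 10)
mean(draw$good)  # empirical share of uncontaminated points, near alpha
```

The sketch returns the contamination indicator alongside the draws, so the outliers can be inspected separately from the uncontaminated points.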

Usage

genModelFD(ncurves=1000, nsplines=35, alpha=c(0.9,0.9,0.9),
           eta=c(10, 5, 15))

Arguments

`ncurves`	The total number of curves to simulate.
`nsplines`	The number of splines to fit to the simulated data.
`alpha`	The proportion of uncontaminated data in each group.
`eta`	The inflation coefficient that measures the increase in variability due to the outliers.

Details

The data are generated from the model FCLM[a_k, b_k,\mathbf{Q}_k,d_k,\alpha_k,\eta_k]. The number of clusters is fixed to K=3 and the mixing proportions are equal: \pi_1=\pi_2=\pi_3=1/3. We consider the following parameter values:

Group 1: d=5, a=150, b=5, \mu=(1,0,50,100,0,\ldots,0)

Group 2: d=20, a=15, b=8, \mu=(0,0,80,0,40,2,0,\ldots,0)

Group 3: d=10, a=30, b=10, \mu=(0,\ldots,0,20,0,80,0,0,100),

where d is the intrinsic dimension of the subgroup, \mu is the mean vector of size 70, a is the value of the first d diagonal elements of \mathbf{D}, and b is the value of the last 70-d diagonal elements. Curves are smoothed using 35 Fourier basis functions.
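
To make the role of d, a and b concrete, here is a rough sketch that builds the diagonal matrix \mathbf{D} for Group 1 and a covariance matrix from it. It assumes the relation \Sigma = \mathbf{Q}\mathbf{D}\mathbf{Q}^T and uses a randomly generated orthogonal \mathbf{Q}; both are illustrative assumptions, since this page does not specify how genModelFD constructs \mathbf{Q}_k internally.

```r
# Group 1 values taken from the text: d = 5, a = 150, b = 5, size 70
p <- 70; d <- 5; a <- 150; b <- 5

# D has a on its first d diagonal entries and b on the remaining p - d
D <- diag(c(rep(a, d), rep(b, p - d)))

# Assumption: a random orthogonal Q obtained from a QR decomposition;
# this is not necessarily the Q_k used inside genModelFD
set.seed(1)
Q <- qr.Q(qr(matrix(rnorm(p * p), p, p)))

# Assumed covariance structure: Sigma = Q D Q^T
Sigma <- Q %*% D %*% t(Q)

# The eigenvalues of Sigma recover the diagonal of D (sorted decreasingly)
ev <- eigen(Sigma, symmetric = TRUE)$values
```

Under these assumptions the first d eigenvalues equal a and the remaining 70-d equal b, which is why d acts as the intrinsic dimension of the group.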

Value

`fd`	A functional data object representing the simulated data.
`groupd`	Group classifications for each curve.

Author(s)

Cristina Anton and Iain Smith

References

- Amovin-Assagba M, Gannaz I, Jacques J (2022) Outlier detection in multivariate functional data through a contaminated mixture model. Comput Stat Data Anal 174.
- Anton C, Smith I. Model-based clustering of functional data via mixtures of t distributions. Advances in Data Analysis and Classification (to appear).

Examples

# Univariate Contaminated Data
data <- genModelFD(ncurves=300, nsplines=35, alpha=c(0.9,0.9,0.9),
                  eta=c(10, 7, 17))
plot(data$fd, col = data$groupd)
clm <- data$groupd

[Package TFunHDDC version 1.0.1 Index]