genModelFD {TFunHDDC}    R Documentation

genModelFD

Description

Generate functional data with coefficients distributed according to a finite mixture of contaminated normal distributions, such that the kth cluster follows the multivariate contaminated normal distribution with density

f(\gamma_i;\theta_k)=\alpha_k\phi(\gamma_i;\mu_k,\Sigma_k)+(1-\alpha_k)\phi(\gamma_i;\mu_k,\eta_k\Sigma_k)

where \alpha_k\in (0.5,1) represents the proportion of uncontaminated data, \eta_k>1 is the inflation coefficient due to outliers, and \phi(\gamma_i;\mu_k,\Sigma_k) is the density for the multivariate normal distribution N(\mu_k,\Sigma_k).
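The density above can be read as a two-component mixture: with probability \alpha_k a point comes from N(\mu_k,\Sigma_k), otherwise from the inflated N(\mu_k,\eta_k\Sigma_k). A minimal base-R sketch of sampling from a single such component follows. The helper name rcontam_norm and its arguments are hypothetical and are not part of TFunHDDC; this is not necessarily how genModelFD generates its coefficients.

```r
# Sketch: draw n vectors from a multivariate contaminated normal distribution.
# rcontam_norm is a hypothetical helper, not a TFunHDDC function.
rcontam_norm <- function(n, mu, Sigma, alpha, eta) {
  p <- length(mu)
  # TRUE marks an uncontaminated point, drawn with probability alpha
  good <- runif(n) < alpha
  # Covariance inflation factor: 1 for good points, eta for outliers
  infl <- ifelse(good, 1, eta)
  # Colour standard normal draws with the Cholesky factor of Sigma
  R <- chol(Sigma)                        # Sigma = t(R) %*% R
  z <- matrix(rnorm(n * p), n, p) %*% R   # rows ~ N(0, Sigma)
  # Scale row i by sqrt(infl[i]), then shift every row by mu
  x <- sweep(z * sqrt(infl), 2, mu, "+")
  list(x = x, good = good)
}

set.seed(1)
sim <- rcontam_norm(500, mu = c(0, 0), Sigma = diag(2), alpha = 0.9, eta = 10)
```

Here sim$x holds the simulated vectors row by row and sim$good flags which rows are uncontaminated.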

Usage

genModelFD(ncurves=1000, nsplines=35, alpha=c(0.9,0.9,0.9),
           eta=c(10, 5, 15))

Arguments

ncurves

The total number of curves to simulate.

nsplines

The number of splines to fit to the simulated data.

alpha

The proportion of uncontaminated data in each group.

eta

The inflation coefficient that measures the increase in variability due to the outliers.

Details

The data are generated from the model FCLM[a_k, b_k, \mathbf{Q}_k, d_k, \alpha_k, \eta_k]. The number of clusters is fixed to K=3 and the mixing proportions are equal: \pi_1=\pi_2=\pi_3=1/3. We consider the following parameter values:

Group 1: d=5, a=150, b=5, \mu=(1,0,50,100,0,\ldots,0)

Group 2: d=20, a=15, b=8, \mu=(0,0,80,0,40,2,0,\ldots,0)

Group 3: d=10, a=30, b=10, \mu=(0,\ldots,0,20,0,80,0,0,100),

where d is the intrinsic dimension of the subgroups, \mu is the mean vector of size 70, a is the value of the first d diagonal elements of \mathbf{D}, and b is the value of the last 70-d elements. Curves are smoothed using 35 Fourier basis functions.
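To make the structure of \mathbf{D} concrete, the sketch below builds it for Group 1 in base R. The orthogonal matrix Q and the product Q D Q^T are assumptions for illustration only: they follow the usual HDDC-style parameterization, and the orientation matrix actually used inside genModelFD is not documented on this page.

```r
# Sketch: diagonal matrix D for Group 1 (d = 5, a = 150, b = 5, size 70).
p <- 70; d <- 5; a <- 150; b <- 5
D <- diag(c(rep(a, d), rep(b, p - d)))  # first d entries a, last p - d entries b

# Assumption: an arbitrary orthogonal Q taken from a QR decomposition.
set.seed(1)
Q <- qr.Q(qr(matrix(rnorm(p * p), p, p)))

# Assumed covariance with eigenvalues a (d times) and b (p - d times).
Sigma <- Q %*% D %*% t(Q)
```

Under this assumption the cluster varies strongly (variance a) in d directions and only weakly (variance b) in the remaining 70-d directions.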

Value

fd

A functional data object representing the simulated data.

groupd

Group classifications for each curve.

Author(s)

Cristina Anton and Iain Smith

References

- Amovin-Assagba M, Gannaz I, Jacques J (2022) Outlier detection in multivariate functional data through a contaminated mixture model. Comput Stat Data Anal 174.

- Anton C, Smith I. Model-based clustering of functional data via mixtures of t distributions. Advances in Data Analysis and Classification (to appear).

Examples

# Univariate Contaminated Data
data <- genModelFD(ncurves=300, nsplines=35, alpha=c(0.9,0.9,0.9),
                  eta=c(10, 7, 17))
plot(data$fd, col = data$groupd)
clm <- data$groupd

[Package TFunHDDC version 1.0.1 Index]