genModelFD {TFunHDDC} | R Documentation |
genModelFD
Description
Generate functional data whose basis coefficients follow a finite mixture of multivariate contaminated normal distributions. For the k-th cluster, the coefficient vector \gamma_i has density
f(\gamma_i;\theta_k)=\alpha_k\phi(\gamma_i;\mu_k,\Sigma_k)+(1-\alpha_k)\phi(\gamma_i;\mu_k,\eta_k\Sigma_k)
where \alpha_k\in (0.5,1) represents the proportion of uncontaminated data, \eta_k>1 is the inflation coefficient due to outliers, and \phi(\gamma_i;\mu_k,\Sigma_k) is the density of the multivariate normal distribution N(\mu_k,\Sigma_k).
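The contaminated normal density in the Description can be evaluated and sampled with base R alone. The following is a minimal sketch, not the package's implementation; the helper names dmvn, dcn and rcn are hypothetical and chosen only for illustration.

```r
# Density of N(mu, Sigma) at a single point x, using a Cholesky factor.
dmvn <- function(x, mu, Sigma) {
  p <- length(mu)
  R <- chol(Sigma)                                # Sigma = t(R) %*% R
  z <- backsolve(R, x - mu, transpose = TRUE)     # solves t(R) %*% z = x - mu
  exp(-0.5 * sum(z^2) - sum(log(diag(R))) - 0.5 * p * log(2 * pi))
}

# Contaminated normal density: a "good" component with covariance Sigma
# plus a "bad" component with inflated covariance eta * Sigma.
dcn <- function(x, mu, Sigma, alpha, eta) {
  alpha * dmvn(x, mu, Sigma) + (1 - alpha) * dmvn(x, mu, eta * Sigma)
}

# Draw n observations: each row is uncontaminated with probability alpha,
# otherwise its deviation from mu is scaled by sqrt(eta).
rcn <- function(n, mu, Sigma, alpha, eta) {
  p <- length(mu)
  good <- runif(n) < alpha
  scale <- ifelse(good, 1, sqrt(eta))
  Z <- matrix(rnorm(n * p), n, p) %*% chol(Sigma) # rows ~ N(0, Sigma)
  sweep(Z * scale, 2, mu, "+")
}

set.seed(1)
x <- rcn(5, mu = c(0, 0), Sigma = diag(2), alpha = 0.9, eta = 10)
dcn(x[1, ], mu = c(0, 0), Sigma = diag(2), alpha = 0.9, eta = 10)
```

Scaling a draw by sqrt(eta) multiplies its covariance by eta, which is why the second mixture component uses \eta_k\Sigma_k.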
Usage
genModelFD(ncurves=1000, nsplines=35, alpha=c(0.9,0.9,0.9),
eta=c(10, 5, 15))
Arguments
ncurves |
The total number of curves to simulate. |
nsplines |
The number of splines to fit to the simulated data. |
alpha |
The proportion of uncontaminated data in each group. |
eta |
The inflation coefficient that measures the increase in variability due to the outliers. |
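The Description restricts alpha to (0.5, 1) and eta to values greater than 1, with one value per group. A sketch of how these constraints could be checked before calling genModelFD is shown here; check_cn_params is a hypothetical helper and is not part of TFunHDDC, and whether genModelFD performs similar checks itself is not stated in this page.

```r
# Hypothetical helper: validate contamination parameters for K groups.
check_cn_params <- function(alpha, eta, K = 3) {
  stopifnot(
    is.numeric(alpha), is.numeric(eta),
    length(alpha) == K, length(eta) == K,  # one value per group
    all(alpha > 0.5 & alpha < 1),          # proportion of uncontaminated data
    all(eta > 1)                           # inflation coefficient
  )
  invisible(TRUE)
}

# The default arguments from the Usage section pass the check.
check_cn_params(alpha = c(0.9, 0.9, 0.9), eta = c(10, 5, 15))
```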
Details
The data are generated from the model FCLM[a_k, b_k, \mathbf{Q}_k, d_k, \alpha_k, \eta_k]. The number of clusters is fixed at K=3 and the mixing proportions are equal, \pi_1=\pi_2=\pi_3=1/3. We consider the following parameter values:
Group 1: d=5, a=150, b=5, \mu=(1,0,50,100,0,\ldots,0)
Group 2: d=20, a=15, b=8, \mu=(0,0,80,0,40,2,0,\ldots,0)
Group 3: d=10, a=30, b=10, \mu=(0,\ldots,0,20,0,80,0,0,100)
where d is the intrinsic dimension of the subgroup, \mu is the mean vector of size 70, a is the value of the first d diagonal elements of \mathbf{D}, and b is the value of the last 70-d diagonal elements. Curves are smoothed using 35 Fourier basis functions.
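The parameter settings above can be written out to show how coefficient vectors might be drawn for the three groups. This is a rough sketch under stated assumptions, not the code used by genModelFD: the orientation matrix \mathbf{Q}_k is taken to be the identity (the page does not give it), the names groups and simulate_coefs are hypothetical, and the final smoothing step that produces the fd object is omitted.

```r
p <- 70  # length of the mean vector, as stated in Details

# Group settings copied from Details; only the zero padding is spelled out.
groups <- list(
  list(d = 5,  a = 150, b = 5,  mu = c(1, 0, 50, 100, rep(0, p - 4))),
  list(d = 20, a = 15,  b = 8,  mu = c(0, 0, 80, 0, 40, 2, rep(0, p - 6))),
  list(d = 10, a = 30,  b = 10, mu = c(rep(0, p - 6), 20, 0, 80, 0, 0, 100))
)

simulate_coefs <- function(n, alpha = c(0.9, 0.9, 0.9), eta = c(10, 5, 15)) {
  k <- sample.int(3, n, replace = TRUE)   # equal mixing proportions 1/3
  coefs <- matrix(0, n, p)
  for (i in seq_len(n)) {
    g <- groups[[k[i]]]
    # Diagonal of D: a on the first d entries, b on the remaining p - d.
    delta <- c(rep(g$a, g$d), rep(g$b, p - g$d))
    # With probability 1 - alpha the variance is inflated by eta.
    infl <- if (runif(1) < alpha[k[i]]) 1 else eta[k[i]]
    # Assumption: Q_k is the identity, so the covariance is diagonal.
    coefs[i, ] <- g$mu + sqrt(infl * delta) * rnorm(p)
  }
  list(coefs = coefs, group = k)
}

set.seed(1)
sim <- simulate_coefs(30)
dim(sim$coefs)
table(sim$group)
```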
Value
fd |
A functional data object representing the simulated data. |
groupd |
Group classifications for each curve. |
Author(s)
Cristina Anton and Iain Smith
References
- Amovin-Assagba M, Gannaz I, Jacques J (2022) Outlier detection in multivariate functional data through a contaminated mixture model. Comput Stat Data Anal 174.
- Anton C, Smith I (to appear) Model-based clustering of functional data via mixtures of t distributions. Advances in Data Analysis and Classification.
Examples
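# Univariate contaminated data: simulate 300 curves from the three groups
data <- genModelFD(ncurves=300, nsplines=35, alpha=c(0.9,0.9,0.9),
                   eta=c(10, 7, 17))
# Plot the simulated curves, coloured by group
plot(data$fd, col = data$groupd)
# Store the group labels of the simulated curves
clm <- data$groupd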