MGHFA {MixGHD} | R Documentation |
Mixture of generalized hyperbolic factor analyzers (MGHFA).
Description
Carries out model-based clustering and classification using the mixture of generalized hyperbolic factor analyzers.
Usage
MGHFA(data=NULL, gpar0=NULL, G=2, max.iter=100,
label =NULL ,q=2,eps=1e-2 , method="kmeans", scale=TRUE ,nr=10)
Arguments
data |
A matrix or data frame such that rows correspond to observations and columns correspond to variables. |
gpar0 |
(optional) A list containing the initial parameters of the mixture model. See the 'Details' section. |
G |
The range of values for the number of clusters. |
max.iter |
(optional) A numerical parameter giving the maximum number of iterations each EM algorithm is allowed to use. |
label |
( optional) A n dimensional vector, if label[i]=k then observation i belongs to group k, If label[i]=0 then observation i has no known group, if NULL then the data has no known groups. |
q |
The range of values for the number of factors. |
eps |
(optional) A number specifying the epsilon value for the convergence criteria used in the EM algorithms. For each algorithm, the criterion is based on the difference between the log-likelihood at an iteration and an asymptotic estimate of the log-likelihood at that iteration. This asymptotic estimate is based on the Aitken acceleration. |
method |
( optional) A string indicating the initialization criterion, if not specified kmeans clustering is used. Alternative methods are: hierarchical "hierarchical" and model based "modelBased" clustering |
scale |
( optional) A logical value indicating whether or not the data should be scaled, true by default. |
nr |
( optional) A number indicating the number of starting value when random is used, 10 by default. |
Details
The arguments gpar0, if specified, is a list structure containing at least one p dimensional vector mu, alpha and phi, a pxp matrix gamma, a 2 dimensional vector cpl containing omega and lambda.
Value
A S4 object of class MixGHD with slots:
Index |
Bayesian information criterion value for each combination of G and q. |
BIC |
Bayesian information criterion. |
gpar |
A list of the model parameters. |
loglik |
The log-likelihood values. |
map |
A vector of integers indicating the maximum a posteriori classifications for the best model. |
z |
A matrix giving the raw values upon which map is based. |
Author(s)
Cristina Tortora, Aisha ElSherbiny, Ryan P. Browne, Brian C. Franczak, and Paul D. McNicholas. Maintainer: Cristina Tortora <cristina.tortora@sjsu.edu>
References
C.Tortora, P.D. McNicholas, and R.P. Browne (2016). Mixtures of Generalized Hyperbolic Factor Analyzers. Advanced in data analysis and classification 10(4) p.423-440.\ C. Tortora, R. P. Browne, A. ElSherbiny, B. C. Franczak, and P. D. McNicholas (2021). Model-Based Clustering, Classification, and Discriminant Analysis using the Generalized Hyperbolic Distribution: MixGHD R package, Journal of Statistical Software 98(3) 1–24, <doi:10.18637/jss.v098.i03>.
Examples
## Classification
#70% belong to the training set
data(sonar)
label=sonar[,61]
set.seed(4)
a=round(runif(62)*207+1)
label[a]=0
##model estimation
model=MGHFA(data=sonar[,1:60], G=2, max.iter=25 ,q=2,label=label)
#result
table(model@map,sonar[,61])
summary(model)