ppgmmga {ppgmmga}    R Documentation
Projection pursuit based on Gaussian mixtures and evolutionary algorithms for data visualisation
Description
A Projection Pursuit (PP) method for dimension reduction seeking "interesting" data structures in low-dimensional projections. A negentropy index is computed from the density estimated using Gaussian Mixture Models (GMMs). Then, the PP index is maximised by Genetic Algorithms (GAs) to find the optimal projection basis.
Usage
ppgmmga(data,
d,
approx = c("UT", "VAR", "SOTE", "none"),
center = TRUE,
scale = TRUE,
GMM = NULL,
gatype = c("ga", "gaisl"),
options = ppgmmga.options(),
seed = NULL,
verbose = interactive(), ...)
Arguments
data
The data to be projected. The Examples pass a numeric data frame (iris[,-5]).
d
An integer specifying the dimension of the subspace onto which the data are projected and visualised.
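approx
A string specifying the type of computation used to obtain the negentropy for GMMs. Possible values are:
"UT" for the Unscented Transformation approximation;
"VAR" for the VARiational approximation;
"SOTE" for the Second Order Taylor Expansion approximation;
"none" for no approximation.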
center
A logical value indicating whether the data are centred. Set to TRUE by default.
scale
A logical value indicating whether the data are scaled. Set to TRUE by default.
GMM
An object of class
gatype
A string specifying the type of genetic algorithm used to maximise the negentropy. Possible values are:
"ga" for a simple genetic algorithm;
"gaisl" for an islands genetic algorithm.
options
A list of options containing all the important arguments to pass to the GMM density estimation and GA optimisation steps. Set to ppgmmga.options() by default.
seed
An integer value setting the state of the random number generator. It may be used to replicate the results of the ppgmmga algorithm.
verbose
A logical value controlling whether the evolution of the GA search is shown. Set to interactive() by default, so progress is shown only in interactive sessions.
...
Further arguments passed to or from other methods.
Details
Projection pursuit (PP) is a feature extraction method for analysing high-dimensional data through low-dimensional projections, found by maximising a projection index over orthogonal projections. A general PP procedure can be summarised in a few steps: the data may be transformed, the PP index is chosen and the subspace dimension is fixed. Then, the PP index is optimised.
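The transformation and projection steps can be sketched in base R. This is a minimal illustration, not the package's implementation: the basis B is a random matrix with orthonormal columns obtained from a QR decomposition, standing in for a candidate basis that an optimiser would evaluate, and the names X, B and Z are chosen here for illustration only.

```r
# Illustrative sketch only (base R); not the ppgmmga implementation.
X <- scale(as.matrix(iris[, -5]), center = TRUE, scale = TRUE)  # centre and scale
p <- ncol(X)                             # number of original variables
d <- 2                                   # dimension of the projection subspace

set.seed(1)
# Random p x d matrix with orthonormal columns (an arbitrary candidate basis)
B <- qr.Q(qr(matrix(rnorm(p * d), nrow = p, ncol = d)))

Z <- X %*% B                             # n x d projected data
```

A PP index would then be evaluated on Z, and the search repeated over candidate bases B until the index is maximised.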
For cluster visualisation the negentropy index is considered. Since this index requires an estimate of the underlying data density, Gaussian mixture models (GMMs) are used to approximate it. Genetic Algorithms are then employed to maximise the negentropy with respect to the basis of the projection subspace.
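As a rough illustration of the index, negentropy can be written as the entropy of a Gaussian with the same variance minus the entropy of the density of interest. The sketch below estimates it for a one-dimensional, two-component Gaussian mixture using plain Monte Carlo in base R. The mixture parameters are hypothetical values chosen for illustration, and the Monte Carlo estimator is an assumption of this sketch rather than the package's internal code.

```r
# Illustrative sketch only: negentropy of a 1-D two-component Gaussian
# mixture, estimated by plain Monte Carlo (not the package's internal code).
set.seed(1)
pro  <- c(0.5, 0.5)      # mixing proportions (hypothetical)
mu   <- c(-2, 2)         # component means (hypothetical)
sdev <- c(1, 1)          # component standard deviations (hypothetical)

# Mixture density evaluated at x
dmix <- function(x) {
  pro[1] * dnorm(x, mu[1], sdev[1]) + pro[2] * dnorm(x, mu[2], sdev[2])
}

# Draw from the mixture: pick a component, then sample from it
n <- 1e5
k <- sample(1:2, n, replace = TRUE, prob = pro)
z <- rnorm(n, mean = mu[k], sd = sdev[k])

# Monte Carlo estimate of the mixture entropy, -E[log f(Z)]
h_mix <- -mean(log(dmix(z)))

# Entropy of a Gaussian with the same variance as the mixture
mix_mean <- sum(pro * mu)
mix_var  <- sum(pro * (sdev^2 + mu^2)) - mix_mean^2
h_gauss  <- 0.5 * log(2 * pi * exp(1) * mix_var)

# Negentropy: zero for a Gaussian, positive for non-Gaussian structure
negentropy <- h_gauss - h_mix
```

A well-separated mixture like this one yields a clearly positive value, which is why projections that reveal clusters score highly on the index.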
Value
Returns an object of class 'ppgmmga'. See ppgmmga-class for a description of the object.
Author(s)
Serafini A. srf.alessio@gmail.com
Scrucca L. luca.scrucca@unipg.it
References
Scrucca, L. and Serafini, A. (2019) Projection pursuit based on Gaussian mixtures and evolutionary algorithms. Journal of Computational and Graphical Statistics, 28:4, 847–860. DOI: 10.1080/10618600.2019.1598871
See Also
summary.ppgmmga, plot.ppgmmga, ppgmmga-class
Examples
## Not run:
data(iris)
X <- iris[,-5]
Class <- iris$Species
# 1-dimensional PPGMMGA
PP1D <- ppgmmga(data = X, d = 1)
summary(PP1D)
plot(PP1D, bins = 11)
plot(PP1D, bins = 11, Class)
# 2-dimensional PPGMMGA
PP2D <- ppgmmga(data = X, d = 2)
summary(PP2D)
plot(PP2D)
plot(PP2D, Class)
## Unscented Transformation approximation
PP2D_1 <- ppgmmga(data = X, d = 2, approx = "UT")
summary(PP2D_1)
plot(PP2D_1, Class)
## VARiational approximation
PP2D_2 <- ppgmmga(data = X, d = 2, approx = "VAR")
summary(PP2D_2)
plot(PP2D_2, Class)
## Second Order Taylor Expansion approximation
PP2D_3 <- ppgmmga(data = X, d = 2, approx = "SOTE")
summary(PP2D_3)
plot(PP2D_3, Class)
# 3-dimensional PPGMMGA
PP3D <- ppgmmga(data = X, d = 3)
summary(PP3D)
plot(PP3D, Class)
# A rotating 3D plot can be obtained using:
# if(!require("msir")) install.packages("msir")
# msir::spinplot(PP3D$Z, markby = Class,
# col.points = ppgmmga.options("classPlotColors")[1:3])
## End(Not run)