GPDC {FPDclustering} | R Documentation |
Gaussian PD-Clustering
Description
An implementation of Gaussian PD-Clustering (GPDC), an extension of PD-clustering adjusted for cluster size that uses a dissimilarity measure based on the Gaussian density.
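The idea of a dissimilarity based on the Gaussian density can be illustrated with a small base-R sketch. This is not the package's code. It assumes a dissimilarity of the form log(M_k / f(x; mu_k, Sigma_k)), with M_k taken as the density at the cluster mean, and it assigns memberships so that probability times dissimilarity is constant across clusters for each point. The helper name `gauss_dissimilarity`, the toy data, and the fixed centers and covariances are all hypothetical, and the cluster-size adjustment is left out for brevity.

```r
# Illustrative sketch only; NOT the FPDclustering implementation.
# Assumption: d_ik = log(M_k / f(x_i; mu_k, Sigma_k)), with f the Gaussian
# density and M_k its value at mu_k (hypothetical choice for this sketch).
gauss_dissimilarity <- function(x, mu, sigma) {
  p <- ncol(x)
  md2 <- mahalanobis(x, center = mu, cov = sigma)  # squared Mahalanobis distance
  log_const <- -0.5 * (p * log(2 * pi) + log(det(sigma)))
  log_f <- log_const - 0.5 * md2                   # log density at each row
  log_M <- log_const                               # log density at the mean
  log_M - log_f                                    # equals md2 / 2, always >= 0
}

# Toy data: two well-separated groups of 20 points in 2 dimensions
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 4), ncol = 2))
centers <- list(c(0, 0), c(4, 4))   # assumed known here; estimated in practice
covs <- list(diag(2), diag(2))

# n x k matrix of dissimilarities, one column per cluster
d <- sapply(1:2, function(k) gauss_dissimilarity(x, centers[[k]], covs[[k]]))

# Memberships with p_ik * d_ik constant over k, i.e. p_ik proportional to 1 / d_ik
inv_d <- 1 / pmax(d, .Machine$double.eps)  # guard against division by zero
p <- inv_d / rowSums(inv_d)
label <- max.col(p)                        # hard assignment to the likeliest cluster
```

In an actual fit the centers and covariance matrices would be re-estimated at each iteration rather than fixed as they are here.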
Usage
GPDC(data=NULL,k=2,ini="kmedoids", nr=5,iter=100)
Arguments
data |
A matrix or data frame such that rows correspond to observations and columns correspond to variables. |
k |
A numerical parameter giving the number of clusters |
ini |
A parameter that selects how the cluster centers are initialized. Available options are random starts ("random"), k-medoids ("kmedoids", the default), and PDC ("PDclust"). |
nr |
Number of random starts when ini is set to "random" |
iter |
Maximum number of iterations |
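As a rough illustration of how the arguments combine, the calls below vary the initialization while keeping the same data. This is a sketch that assumes the FPDclustering package is installed and that the `ais` data set ships with it, as in the Examples section. The object names `x`, `res_med`, `res_rnd`, and `res_pdc`, the seed, and the specific values of `nr` and `iter` are arbitrary choices for illustration.

```r
library(FPDclustering)

data(ais)
x <- ais[, c(10, 3, 5, 8)]   # same columns as in the Examples section

# Default initialization (k-medoids)
res_med <- GPDC(x, k = 2, ini = "kmedoids")

# Random initialization: nr sets the number of random starts,
# iter caps the number of iterations
set.seed(123)                # arbitrary seed, only for reproducibility
res_rnd <- GPDC(x, k = 2, ini = "random", nr = 10, iter = 200)

# Initialization from a PD-clustering solution
res_pdc <- GPDC(x, k = 2, ini = "PDclust")
```

Different initializations may lead to different partitions, so comparing the resulting labels across runs can be a useful sanity check.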
Value
A list of class FPDclustering with components:
label |
A vector of integers indicating the cluster membership for each unit |
centers |
A matrix of cluster means |
sigma |
A list of K elements, with the variance-covariance matrix per cluster |
probability |
A matrix of probability of each point belonging to each cluster |
JDF |
The value of the Joint distance function |
iter |
The number of iterations |
data |
The data set |
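The components listed above can be inspected with ordinary list indexing. The following is a minimal sketch, assuming the FPDclustering package and its `ais` data set are available. The row-sum check rests on the assumption that each row of `probability` holds the membership probabilities of one observation.

```r
library(FPDclustering)

data(ais)
res <- GPDC(ais[, c(10, 3, 5, 8)], k = 2, ini = "kmedoids")

table(res$label)        # cluster sizes
res$centers             # matrix of cluster means
res$sigma[[1]]          # variance-covariance matrix of the first cluster
head(res$probability)   # membership probabilities for the first units

# Assumption: each row holds one observation's probabilities,
# so the row sums should be close to 1
range(rowSums(res$probability))

res$JDF                 # value of the joint distance function
res$iter                # number of iterations
```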
Author(s)
Cristina Tortora and Francesco Palumbo
References
Tortora C., McNicholas P.D., and Palumbo F. A probabilistic distance clustering algorithm using Gaussian and Student-t multivariate density distributions. SN Computer Science, 1:65, 2020.
Rainey C., Tortora C., and Palumbo F. A parametric version of probabilistic distance clustering. In: Greselin F., Deldossi L., Bagnato L., Vichi M. (eds) Statistical Learning of Complex Data. CLADAG 2017. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham, 33-43, 2019. doi.org/10.1007/978-3-030-21140-0_4
Examples
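# Load the package and the data
library(FPDclustering)
data(ais)
# Select four variables by column index
dataSEL <- ais[, c(10, 3, 5, 8)]
# Clustering
res <- GPDC(dataSEL, k = 2, ini = "kmedoids")
# Results: cross-tabulate cluster labels against sex
table(res$label, ais$sex)
plot(res)
summary(res)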