cluspca {clustrd}    R Documentation
Joint dimension reduction and clustering of continuous data.
Description
This function implements Factorial K-means (Vichi and Kiers, 2001) and Reduced K-means (De Soete and Carroll, 1994), as well as a compromise version of these two methods. The methods combine Principal Component Analysis for dimension reduction with K-means for clustering.
Usage
cluspca(data, nclus, ndim, alpha = NULL, method = c("RKM","FKM"),
center = TRUE, scale = TRUE, rotation = "none", nstart = 100,
smartStart = NULL, seed = NULL)
## S3 method for class 'cluspca'
print(x, ...)
## S3 method for class 'cluspca'
summary(object, ...)
## S3 method for class 'cluspca'
fitted(object, mth = c("centers", "classes"), ...)
Arguments
data: Dataset with metric variables

nclus: Number of clusters (nclus = 1 returns the PCA solution)

ndim: Dimensionality of the solution

method: Specifies the method. Options are "RKM" for reduced K-means and "FKM" for factorial K-means (default = "RKM")

alpha: Adjusts for the relative importance of RKM and FKM in the objective function; alpha = 0.5 leads to reduced K-means, alpha = 0 to factorial K-means, and alpha = 1 reduces to the tandem approach (PCA followed by K-means)

center: A logical value indicating whether the variables should be shifted to be zero centered (default = TRUE)

scale: A logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place (default = TRUE)

rotation: Specifies the method used to rotate the factors. Options are "none" for no rotation, "varimax" for varimax rotation with Kaiser normalization and "promax" for promax rotation (default = "none")

nstart: Number of random starts (default = 100)

smartStart: If NULL then a random cluster membership vector is generated. Alternatively, a cluster membership vector can be provided as a starting solution

seed: An integer used as argument by set.seed() for offsetting the random number generator when smartStart = NULL. The default value is NULL.

x: For the print method, an object of class cluspca

object: For the summary and fitted methods, an object of class cluspca

mth: For the fitted method, a character string specifying the type of fitted value to return: "centers" or "classes"

...: Not used
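An illustrative sketch (not part of the original page) of how the alpha argument unifies the three methods described above; it assumes the macro dataset shipped with clustrd, as used in the Examples below.

```r
## Sketch: the same call with different alpha values recovers each method.
## Assumes the clustrd package and its macro dataset are available.
library(clustrd)
data(macro)

fit_rkm <- cluspca(macro, nclus = 3, ndim = 2, alpha = 0.5, seed = 1234)  # reduced K-means
fit_fkm <- cluspca(macro, nclus = 3, ndim = 2, alpha = 0,   seed = 1234)  # factorial K-means
fit_tan <- cluspca(macro, nclus = 3, ndim = 2, alpha = 1,   seed = 1234)  # tandem (PCA + K-means)
```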
Details
For the K-means part, the Hartigan-Wong algorithm is used by default.
The hidden print and summary methods print out some key components of an object of class cluspca.
The hidden fitted method returns cluster fitted values. If mth is "classes", this is a vector of cluster membership (the cluster component of the "cluspca" object). If mth is "centers", this is a matrix where each row is the center of the cluster to which the corresponding observation belongs. The rownames of the matrix are the cluster membership values.
When nclus = 1 the function returns the PCA solution and plot(object) shows the corresponding biplot.
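A minimal sketch of the fitted method described above (not part of the original page); it assumes the macro dataset shipped with clustrd, as used in the Examples below.

```r
## Sketch: the two kinds of fitted values for a cluspca object.
## Assumes the clustrd package and its macro dataset are available.
library(clustrd)
data(macro)

out <- cluspca(macro, nclus = 3, ndim = 2, seed = 1234)
fitted(out, mth = "classes")  # vector of cluster memberships
fitted(out, mth = "centers")  # matrix: each row is the center of that observation's cluster
```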
Value
obscoord: Object scores

attcoord: Variable scores

centroid: Cluster centroids

cluster: Cluster membership

criterion: Optimal value of the objective function

size: The number of objects in each cluster

scale: A copy of scale in the return object

center: A copy of center in the return object

nstart: A copy of nstart in the return object

odata: A copy of the input data in the return object
References
De Soete, G., and Carroll, J. D. (1994). K-means clustering in a low-dimensional Euclidean space. In Diday E. et al. (Eds.), New Approaches in Classification and Data Analysis, Heidelberg: Springer, 212-219.
Vichi, M., and Kiers, H.A.L. (2001). Factorial K-means analysis for two-way data. Computational Statistics and Data Analysis, 37, 49-64.
See Also
Examples
#Reduced K-means with 3 clusters in 2 dimensions after 10 random starts
data(macro)
outRKM = cluspca(macro, 3, 2, method = "RKM", rotation = "varimax", scale = FALSE, nstart = 10)
summary(outRKM)
#Scatterplot (dimensions 1 and 2) and cluster description plot
plot(outRKM, cludesc = TRUE)
#Factorial K-means with 3 clusters in 2 dimensions
#with a Reduced K-means starting solution
data(macro)
outFKM = cluspca(macro, 3, 2, method = "FKM", rotation = "varimax",
scale = FALSE, smartStart = outRKM$cluster)
outFKM
#Scatterplot (dimensions 1 and 2) and cluster description plot
plot(outFKM, cludesc = TRUE)
#To get the Tandem approach (PCA(SVD) + K-means)
outTandem = cluspca(macro, 3, 2, alpha = 1, seed = 1234)
plot(outTandem)
#nclus = 1 just gives the PCA solution
#outPCA = cluspca(macro, 1, 2)
#outPCA
#Scatterplot (dimensions 1 and 2)
#plot(outPCA)