tuneclus {clustrd} | R Documentation |
Cluster quality assessment for a range of clusters and dimensions.
Description
This function facilitates the selection of the appropriate number of clusters and dimensions for joint dimension reduction and clustering methods.
Usage
tuneclus(data, nclusrange = 3:4, ndimrange = 2:3,
method = c("RKM","FKM","mixedRKM","mixedFKM","clusCA","iFCB","MCAk"),
criterion = "asw", dst = "full", alpha = NULL, alphak = NULL,
center = TRUE, scale = TRUE, rotation = "none", nstart = 100,
smartStart = NULL, seed = NULL)
## S3 method for class 'tuneclus'
print(x, ...)
## S3 method for class 'tuneclus'
summary(object, ...)
## S3 method for class 'tuneclus'
fitted(object, mth = c("centers", "classes"), ...)
Arguments
data |
Continuous, Categorical or Mixed data set |
nclusrange |
An integer vector with the range of numbers of clusters which are to be compared by the cluster validity criteria. Note: the number of clusters should be greater than one |
ndimrange |
An integer vector with the range of dimensions which are to be compared by the cluster validity criteria |
method |
Specifies the method. Options are "RKM", "FKM", "mixedRKM", "mixedFKM", "clusCA", "iFCB" and "MCAk" |
criterion |
One of "asw" (average Silhouette width) or "ch" (Calinski-Harabasz index) (default = "asw") |
dst |
Specifies the data used to compute the distances between objects. Options are "full" for the full dimensional space and "low" for the low dimensional space (default = "full") |
alpha |
Adjusts for the relative importance of (mixed) RKM and FKM in the objective function; |
alphak |
Non-negative scalar to adjust for the relative importance of MCA ( |
center |
A logical value indicating whether the variables should be shifted to be zero centered (default = TRUE) |
scale |
A logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place (default = TRUE) |
rotation |
Specifies the method used to rotate the factors. Options are "none" for no rotation, "varimax" for varimax rotation with Kaiser normalization and "promax" for promax rotation (default = "none") |
nstart |
Number of starts (default = 100) |
smartStart |
If |
seed |
An integer that is used as argument by |
x |
For the |
object |
For the |
mth |
For the |
... |
Not used |
Details
For the K-means part, the algorithm of Hartigan-Wong is used by default.
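To make the role of a cluster validity criterion concrete, the following is a minimal base-R sketch of ranking plain K-means solutions by the Calinski-Harabasz index. It assumes the standard definition of the index (between-cluster sum of squares over k - 1, divided by within-cluster sum of squares over n - k) and uses stats::kmeans, whose default algorithm is also Hartigan-Wong. The toy data and the helper ch_index are hypothetical and are not part of clustrd; this is not the package's internal code, and it does not perform any dimension reduction.

```r
set.seed(1234)

# Toy data (illustrative only): three well-separated groups in two dimensions
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2),
           matrix(rnorm(100, mean = 8), ncol = 2))

# Hypothetical helper: Calinski-Harabasz index of a k-means solution
ch_index <- function(x, k) {
  km <- kmeans(x, centers = k, nstart = 20)   # Hartigan-Wong by default
  n <- nrow(x)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
}

# Evaluate a range of cluster numbers and keep the one with the largest index
krange <- 2:5
ch <- sapply(krange, ch_index, x = x)
names(ch) <- krange
ch
best_k <- krange[which.max(ch)]
best_k
```

tuneclus applies the same idea over a grid of cluster numbers and dimensions, with the clustering and the dimension reduction estimated jointly.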
The hidden print and summary methods print out some key components of an object of class tuneclus.
The hidden fitted method returns cluster fitted values. If mth is "classes", this is a vector of cluster membership (the cluster component of the "tuneclus" object). If mth is "centers", this is a matrix where each row is the cluster center for the observation. The rownames of the matrix are the cluster membership values.
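The two forms of fitted values can be compared with a short sketch. It assumes the clustrd package is installed and reuses the settings of the Reduced K-means example from the Examples section; the object names cls and cen are illustrative.

```r
library(clustrd)
data(macro)

# Same settings as the Reduced K-means example (nstart = 1 for speed only)
bestRKM <- tuneclus(macro, 3:4, 2:3, method = "RKM",
                    criterion = "asw", dst = "low", nstart = 1, seed = 1234)

# Vector of cluster memberships, one entry per observation
cls <- fitted(bestRKM, mth = "classes")

# Matrix with one row per observation: the center of that observation's cluster
cen <- fitted(bestRKM, mth = "centers")

table(cls)   # cluster sizes
head(cen)    # first few rows of the fitted centers
```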
Value
clusobjbest |
The output of the optimal run of |
nclusbest |
The optimal number of clusters |
ndimbest |
The optimal number of dimensions |
critbest |
The optimal criterion value for |
critgrid |
Matrix of size |
criterion |
"asw" for average Silhouette width or "ch" for "Calinski-Harabasz" |
cluasw |
Average Silhouette width values of each cluster, when criterion = "asw" |
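The components listed above can be read directly from the returned object. The sketch below assumes the clustrd package is installed and reuses the Reduced K-means settings from the Examples section; it only accesses component names documented in this section.

```r
library(clustrd)
data(macro)

# nstart = 1 for speed only; use more starts in real applications
bestRKM <- tuneclus(macro, 3:4, 2:3, method = "RKM",
                    criterion = "asw", dst = "low", nstart = 1, seed = 1234)

bestRKM$nclusbest   # optimal number of clusters
bestRKM$ndimbest    # optimal number of dimensions
bestRKM$critbest    # criterion value of the optimal solution
bestRKM$critgrid    # criterion values over the evaluated grid
bestRKM$cluasw      # per-cluster average Silhouette width (criterion = "asw")

summary(bestRKM)
```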
References
Calinski, T., and Harabasz, J., (1974). A dendrite method for cluster analysis. Communications in Statistics, 3, 1-27.
Kaufman, L., and Rousseeuw, P.J., (1990). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York.
See Also
global_bootclus, local_bootclus
Examples
# Reduced K-means for a range of clusters and dimensions
data(macro)
# Cluster quality assessment based on the average silhouette width in the low dimensional space
# nstart = 1 for speed in example
# use more for real applications
bestRKM = tuneclus(macro, 3:4, 2:3, method = "RKM",
criterion = "asw", dst = "low", nstart = 1, seed = 1234)
bestRKM
#plot(bestRKM)
# Cluster Correspondence Analysis for a range of clusters and dimensions
data(bribery)
# Cluster quality assessment based on the Calinski-Harabasz index in the full dimensional space
bestclusCA = tuneclus(bribery, 4:5, 3:4, method = "clusCA",
criterion = "ch", nstart = 20, seed = 1234)
bestclusCA
#plot(bestclusCA, cludesc = TRUE)
# Mixed reduced K-means for a range of clusters and dimensions
data(diamond)
# Cluster quality assessment based on the average silhouette width in the low dimensional space
# nstart = 5 for speed in example
# use more for real applications
bestmixedRKM = tuneclus(diamond[,-7], 3:4, 2:3,
method = "mixedRKM", criterion = "asw", dst = "low",
nstart = 5, seed = 1234)
bestmixedRKM
#plot(bestmixedRKM)