revisedsil {RSKC} | R Documentation |
The revised silhouette
Description
This function returns a revised silhouette plot, the cluster centers in the weighted feature space, and a matrix containing the weighted squared Euclidean distances between each case and each cluster center. Missing values are adjusted for.
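As a reading aid, the weighted squared Euclidean distance mentioned above can be written out as follows. This is a sketch inferred from the calculation in the Examples section (columns scaled by the square root of the weights, then squared differences summed); the symbols are notation assumed here, not taken from the package: x_{ij} is entry (i, j) of the data matrix d, w_j is the j-th feature weight (argument W), and \mu_{kj} is the j-th coordinate of the center of cluster k.

```latex
\[
  d_W(x_i, \mu_k)
    \;=\; \sum_{j=1}^{p} w_j \,\bigl(x_{ij} - \mu_{kj}\bigr)^2
    \;=\; \sum_{j \,:\, w_j \neq 0} \bigl(\sqrt{w_j}\,x_{ij} - \sqrt{w_j}\,\mu_{kj}\bigr)^2 .
\]
```

The second form shows why features with zero weight can be dropped without changing the distance, which is what the "reduced weighted dimension" of trans.mu appears to refer to.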
Usage
revisedsil(d,reRSKC=NULL,CASEofINT=NULL,col1="black",
CASEofINT2 = NULL, col2="red", print.plot=TRUE,
W=NULL,C=NULL,out=NULL)
Arguments
d |
A numerical data matrix, |
reRSKC |
A list returned by the RSKC function. |
CASEofINT |
Necessary if print.plot=TRUE.
A vector of the case indices that appear in the revised silhouette plot.
The revised silhouette widths of these indices are colored in |
col1 |
See |
CASEofINT2 |
A vector of the case indices that appear in the revised silhouette plot.
The indices are colored in |
col2 |
See |
print.plot |
If |
W |
Necessary if |
C |
Necessary if |
out |
Necessary if |
Value
trans.mu |
Cluster centers in the reduced weighted dimension. See the Examples for more detail. |
WdisC |
|
sil.order |
Silhouette values of each case in the order of the case index. |
sil.i |
Silhouette values of cases ranked by decreasing order within clusters.
The corresponding case indices are in |
Author(s)
Yumi Kondo <y.kondo@stat.ubc.ca>
References
Yumi Kondo (2011), Robustification of the sparse K-means clustering algorithm, MSc. Thesis, University of British Columbia http://hdl.handle.net/2429/37093
Examples
# little simulation function
sim <-
function(mu,f){
D<-matrix(rnorm(60*f),60,f)
D[1:20,1:50]<-D[1:20,1:50]+mu
D[21:40,1:50]<-D[21:40,1:50]-mu
return(D)
}
### output trans.mu ###
p<-200;ncl<-3
# simulate a 60 by p data matrix with 3 classes
d<-sim(2,p)
# run RSKC
re<-RSKC(d,ncl,L1=2,alpha=0.05)
# cluster centers in the weighted feature space, as returned by revisedsil
sil.mu<-revisedsil(d,W=re$weights,C=re$labels,out=re$oW,print.plot=FALSE)$trans.mu
# the same centers computed by hand: scale each nonzero-weight column by sqrt(weight)
trans.d<-sweep(d[,re$weights!=0],2,sqrt(re$weights[re$weights!=0]),FUN="*")
class<-re$labels;class[re$oW]<-ncl+1
MEANs<-matrix(NA,ncl,ncol(trans.d))
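for (i in seq_len(ncl)) MEANs[i,]<-colMeans(trans.d[class==i,,drop=FALSE])
# compare with a numerical tolerance rather than exact equality
all.equal(sil.mu,MEANs,check.attributes=FALSE)
# the two results should coincide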
### output WdisC ###
p<-200;ncl<-3;N<-60
# generate 60 by p data matrix with 3 classes
d<-sim(2,p)
# run RSKC
re<-RSKC(d,ncl,L1=2,alpha=0.05)
si<-revisedsil(d,W=re$weights,C=re$labels,out=re$oW,print.plot=FALSE)
si.mu<-si$trans.mu
si.wdisc<-si$WdisC
trans.d<-sweep(d[,re$weights!=0],2,sqrt(re$weights[re$weights!=0]),FUN="*")
WdisC<-matrix(NA,N,ncl)
for ( i in 1 : ncl) WdisC[,i]<-rowSums(scale(trans.d,center=si.mu[i,],scale=FALSE)^2)
# WdisC and si.wdisc coincide
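
The two calculations above can also be reproduced on toy data without calling RSKC, which may help when checking the meaning of trans.mu and WdisC. The following is a minimal base-R sketch, not the package's implementation: the data, weights and labels are made up, and all object names (x, w, labels, centers, wdis, width) are hypothetical. The last step, a silhouette-style width built from the distance to the own center and to the nearest other center, is an assumption about how the revised silhouette is formed and is not stated on this page.

```r
# Toy data: 6 cases, 3 features (all values and names are illustrative only)
set.seed(1)
x <- matrix(rnorm(18), nrow = 6, ncol = 3)
w <- c(0.8, 0, 0.6)            # hypothetical feature weights, one of them zero
labels <- c(1, 1, 1, 2, 2, 2)  # hypothetical cluster labels
K <- 2

# Keep the features with nonzero weight and scale each column by sqrt(weight)
keep <- w != 0
tx <- sweep(x[, keep, drop = FALSE], 2, sqrt(w[keep]), FUN = "*")

# Cluster centers in the reduced weighted space (one row per cluster)
centers <- t(sapply(seq_len(K), function(k)
  colMeans(tx[labels == k, , drop = FALSE])))

# Weighted squared Euclidean distance from every case to every center
wdis <- sapply(seq_len(K), function(k)
  rowSums(sweep(tx, 2, centers[k, ])^2))

# ASSUMPTION: silhouette-style width using distances to cluster centers
own   <- wdis[cbind(seq_len(nrow(tx)), labels)]
other <- sapply(seq_len(nrow(tx)), function(i) min(wdis[i, -labels[i]]))
width <- (other - own) / pmax(own, other)
```

Here centers plays the role of trans.mu and wdis the role of WdisC in the examples above; width lies between -1 and 1 by construction, with larger values meaning a case is much closer to its own center than to any other.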