EkNNclus {evclust} | R Documentation
EkNNclus algorithm
Description
EkNNclus computes hard and credal partitions from dissimilarity or attribute data using the EkNNclus algorithm.
Usage
EkNNclus(
x = NULL,
D,
K,
y0,
ntrials = 1,
q = 0.5,
b = 1,
disp = TRUE,
tr = FALSE,
eps = 1e-06
)
Arguments
x | n x p data matrix (n instances, p attributes).
D | n x n dissimilarity matrix (used only if x is not supplied).
K | Number of neighbors.
y0 | Initial partition (vector of length n, with values in (1,2,...)).
ntrials | Number of runs of the algorithm (the best solution is kept).
q | Parameter in (0,1). Gamma is set to the inverse of the q-quantile of distances from the K nearest neighbors (same notation as in the paper).
b | Exponent of distances,
disp | If TRUE, intermediate results are displayed.
tr | If TRUE, a trace of the cost function is returned.
eps | Minimal distance between two vectors (distances smaller than eps are replaced by eps).
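The description of q can be made concrete with a small base-R sketch. It is an illustration under stated assumptions, not the package's internal code: the helper name knn_gamma is hypothetical, and it uses raw distances, whereas the package may transform them first (for example with the exponent b).

```r
# Hypothetical helper (not part of evclust): gamma as the inverse of the
# q-quantile of the distances from each instance to its K nearest neighbors.
# Assumption: raw distances are used, with no exponent applied.
knn_gamma <- function(D, K, q = 0.5) {
  D <- as.matrix(D)
  n <- nrow(D)
  stopifnot(K >= 1, K < n, q > 0, q < 1)
  diag(D) <- Inf                        # exclude each point's distance to itself
  # For each row, keep the K smallest distances
  knn_dist <- apply(D, 1, function(d) sort(d)[seq_len(K)])
  1 / unname(quantile(as.vector(knn_dist), probs = q))
}

set.seed(1)
x <- matrix(rnorm(200), ncol = 2)       # toy data: 100 points, 2 attributes
D <- as.matrix(dist(x))                 # Euclidean dissimilarity matrix
gamma_med <- knn_gamma(D, K = 10, q = 0.5)
gamma_hi  <- knn_gamma(D, K = 10, q = 0.9)
```

Because the quantile grows with q, a larger q gives a smaller gamma in this sketch.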
Details
The number of clusters is not specified in advance; it is influenced by the parameters K and q. (It is advised to start with the default values.) When n is not too large (say, up to about one thousand), y0 can be set to the vector (1,2,...,n), so that each instance starts in its own cluster. For larger values of n, it is advised to start from a random partition into c clusters, with c<n.
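The advice on choosing y0 can be sketched in base R as follows. The helper init_partition is hypothetical, and both the threshold of 1000 and the default number of initial clusters are illustrative assumptions rather than values prescribed by the package.

```r
# Hypothetical helper (not part of evclust) for building an initial partition.
# Assumptions: threshold of 1000 instances, 100 initial clusters by default.
init_partition <- function(n, n_clusters = 100, max_singletons = 1000) {
  if (n <= max_singletons) {
    seq_len(n)                                  # one cluster per instance
  } else {
    stopifnot(n_clusters < n)
    sample.int(n_clusters, n, replace = TRUE)   # random labels in 1..n_clusters
  }
}

y0_small <- init_partition(500)                 # identity partition 1, 2, ..., 500
set.seed(42)
y0_large <- init_partition(5000, n_clusters = 200)
```

The result would then be passed as the y0 argument, for example y0 = init_partition(nrow(x)).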
Value
The credal partition (an object of class "credpart"). In addition to the usual attributes, the output credal partition has the following attributes:
- trace
Trace of the algorithm (sequence of values of the cost function).
- W
The weight matrix.
Author(s)
Thierry Denoeux.
References
T. Denoeux, O. Kanjanatarakul and S. Sriboonchitta. EK-NNclus: a clustering procedure based on the evidential K-nearest neighbor rule. Knowledge-Based Systems, Vol. 88, pages 57–69, 2015.
Examples
## Clustering of the fourclass dataset
## Not run:
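library(evclust)
data(fourclass)
n <- nrow(fourclass)
N <- 2  # number of runs of the algorithm
clus <- EkNNclus(fourclass[, 1:2], K = 60, y0 = 1:n, ntrials = N, q = 0.9, b = 2,
                 disp = TRUE, tr = TRUE)
## Plot of the partition
plot(clus, X = fourclass[, 1:2], ytrue = fourclass$y, Outliers = FALSE, plot_approx = FALSE)
## Plot of the cost function vs. number of iterations
## (rows of clus$trace are selected by run via column 1; column 3 is plotted, negated)
L <- numeric(N)
## Count the trace rows of each run; sum() also works when a run has a single row
for (i in 1:N) L[i] <- sum(clus$trace[, 1] == i)
imax <- which.max(L)  # longest run, drawn first so that it sets the x-axis range
plot(0:(L[imax] - 1), -clus$trace[clus$trace[, 1] == imax, 3], type = "l", lty = imax,
     xlab = "time steps", ylab = "energy")
for (i in 1:N) if (i != imax) lines(0:(L[i] - 1), -clus$trace[clus$trace[, 1] == i, 3],
                                    type = "l", lty = i)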
## End(Not run)