EkNNclus {evclust}R Documentation

EkNNclus algorithm

Description

EkNNclus computes hard and credal partitions from dissimilarity or attribute data using the EkNNclus algorithm.

Usage

EkNNclus(
  x = NULL,
  D,
  K,
  y0,
  ntrials = 1,
  q = 0.5,
  b = 1,
  disp = TRUE,
  tr = FALSE,
  eps = 1e-06
)

Arguments

x

n x p data matrix (n instances, p attributes).

D

n x n dissimilarity matrix (used only if x is not supplied).

K

Number of neighbors.

y0

Initial partition (vector of length n, with values in (1,2,...)).

ntrials

Number of runs of the algorithm (the best solution is kept).

q

Parameter in (0,1). Gamma is set to the inverse of the q-quantile of distances from the K nearest neighbors (same notation as in the paper).

b

Exponent of distances, \alpha_{ij} = \phi(d_{ij}^b).

disp

If TRUE, intermediate results are displayed.

tr

If TRUE, a trace of the cost function is returned.

eps

Minimal distance between two vectors (distances smaller than eps are replaced by eps)

Details

The number of clusters is not specified. It is influenced by parameters K and q. (It is advised to start with the default values.) For n not too large (say, until one thousand), y0 can be defined as the vector (1,2,...,n). For larger values of n, it is advised to start with a random partition of c clusters, c<n.

Value

The credal partition (an object of class "credpart"). In addition to the usual attributes, the output credal partition has the following attributes:

trace

Trace of the algorithm (sequence of values of the cost function).

W

The weight matrix.

Author(s)

Thierry Denoeux.

References

T. Denoeux, O. Kanjanatarakul and S. Sriboonchitta. EK-NNclus: a clustering procedure based on the evidential K-nearest neighbor rule. Knowledge-Based Systems, Vol. 88, pages 57–69, 2015.

Examples

## Clustering of the fourclass dataset
## Not run: 
data(fourclass)
n<-nrow(fourclass)
N=2
clus<- EkNNclus(fourclass[,1:2],K=60,y0=(1:n),ntrials=N,q=0.9,b=2,disp=TRUE,tr=TRUE)
## Plot of the partition
plot(clus,X=fourclass[,1:2],ytrue=fourclass$y,Outliers=FALSE,plot_approx=FALSE)
## Plot of the cost function vs number of iteration
L<-vector(length=N)
for(i in 1:N) L[i]<-dim(clus$trace[clus$trace[,1]==i,])[1]
imax<-which.max(L)
plot(0:(L[imax]-1),-clus$trace[clus$trace[,1]==imax,3],type="l",lty=imax,
xlab="time steps",ylab="energy")
for(i in (1:N)) if(i != imax) lines(0:(L[i]-1),-clus$trace[clus$trace[,1]==i,3],
type="l",lty=i)

## End(Not run)

[Package evclust version 2.0.3 Index]