cecm {evclust} | R Documentation |
Constrained Evidential c-means algorithm
Description
cecm
computes a credal partition from a matrix of attribute data and
pairwise constraints using the Constrained Evidential c-means (CECM) algorithm.
Usage
cecm(
x,
c,
type = "full",
pairs = NULL,
ntrials = 1,
ML,
CL,
g0 = NULL,
alpha = 1,
delta = 10,
xi = 0.5,
distance = 0,
epsi = 0.001,
disp = TRUE
)
Arguments
x |
input matrix of size n x d, where n is the number of objects and d the number of attributes. |
c |
Number of clusters. |
type |
Type of focal sets ("simple": empty set, singletons and Omega;
"full": all |
pairs |
Set of pairs to be included in the focal sets; if NULL, all pairs are included. Used only if type="pairs". |
ntrials |
Number of runs of the optimization algorithm (set to 1 if |
ML |
Matrix nbML x 2 of must-link constraints. Each row of ML contains the indices of objects that belong to the same class. |
CL |
Matrix nbCL x 2 of cannot-link constraints. Each row of CL contains the indices of objects that belong to different classes. |
g0 |
Initial prototypes, matrix of size c x d. If not supplied, the prototypes are initialized randomly. |
alpha |
Exponent of the cardinality in the cost function. |
delta |
Distance to the empty set. |
xi |
Tradeoff between the objective function Jecm and the constraints: Jcecm=(1-xi)Jecm + xi Jconst. |
distance |
Type of distance use: 0=Euclidean, 1=Mahalanobis. |
epsi |
Minimum amount of improvement. |
disp |
If TRUE (default), intermediate results are displayed. |
Details
CECM is a version of ECM allowing the user to specify pairwise constraints to guide the clustering process. Pairwise constraints are of two kinds: must-link contraints are pairs of objects that are known to belong to the same class, and cannot-link constraints are pairs of objects that are known to belong to different classes. CECM can also learn a metric for each cluster, like the Gustafson-Kessel algorithm in fuzzy clustering. At each iteration, the algorithm solves a quadratic programming problem using an interior ellipsoidal trust region and barrier function algorithm with dual solution updating technique in the standard QP form (Ye, 1992).
If initial prototypes g0
are provided, the number of trials is automatically set to 1.
Remark: Due to the use of the Matrix package, messages may be generated by R's (S4) method dispatch mechanism. They are not error messages, and they can be ignored.
Value
The credal partition (an object of class "credpart"
).
Author(s)
Thierry Denoeux (from a MATLAB code written by Violaine Antoine).
References
V. Antoine, B. Quost, M.-H. Masson and T. Denoeux. CECM: Constrained Evidential C-Means algorithm. Computational Statistics and Data Analysis, Vol. 56, Issue 4, pages 894–914, 2012.
Y. Ye. On affine-scaling algorithm for nonconvex quadratic programming. Math. Programming 56 (1992) 285–300.
See Also
create_MLCL
, makeF
, extractMass
,
ecm
, recm
Examples
## Generation of a two-class dataset
## Not run:
n<-30
x<-cbind(0.2*rnorm(n),rnorm(n))
y<-c(rep(1,n/2),rep(2,n/2))
x[(n/2+1):n,1]<-x[(n/2+1):n,1]+1
plot(x[,1],x[,2],asp=1,pch=y,col=y)
## Generation of 10 constraints
const<-create_MLCL(y,nbConst=10)
## Call of cecm
clus<-cecm(x=x,c=2,ML=const$M,CL=const$CL,delta=10)
plot(x[,1],x[,2],asp=1,pch=clus$y.pl,col=y)
## End(Not run)