moc.gabk {moc.gapbk} | R Documentation |
Perform the Multi-Objective Clustering Algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK)
Description
This function receives two distance matrices and performs the MOC-GaPBK algorithm on them.
Usage
moc.gabk(
dmatrix1,
dmatrix2,
num_k,
generation = 50,
pop_size = 10,
rat_cross = 0.8,
rat_muta = 0.01,
tour_size = 2,
neighborhood = 0.1,
local_search = FALSE,
cores = 2
)
Arguments
dmatrix1 |
A distance matrix. It must have the same dimensions as dmatrix2. It is mandatory. |
dmatrix2 |
A distance matrix. It must have the same dimensions as dmatrix1. It is mandatory. |
num_k |
The number k of groups represented by medoids in each individual. It is mandatory. |
generation |
Number of generations to be performed by MOC-GaPBK. By default 50. |
pop_size |
Size of population. By default 10. |
rat_cross |
Probability of crossover. By default 0.80. |
rat_muta |
Probability of mutation. By default 0.01. |
tour_size |
Size of tournament. By default 2. |
neighborhood |
Neighborhood size as a fraction of the population: a real value between 0 and 1. The size of the neighborhood is computed as neighborhood*pop_size. By default 0.10. |
local_search |
A boolean value indicating whether the local search procedures (PR and PLS) are run. By default FALSE. |
cores |
Number of cores used to run the local search procedures. By default 2. |
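The constraints listed above can be checked before calling the function. The following is a minimal sketch and not part of the package: check_moc_inputs is a hypothetical helper, base R dist() stands in for any distance function, and the rounding applied to neighborhood*pop_size is an assumption, since this page does not say how a fractional size is handled.

```r
# Hypothetical helper (NOT part of moc.gapbk): checks the documented
# constraints and returns the neighborhood size.
check_moc_inputs <- function(dmatrix1, dmatrix2, num_k,
                             pop_size = 10, neighborhood = 0.1) {
  stopifnot(
    is.matrix(dmatrix1), is.matrix(dmatrix2),
    identical(dim(dmatrix1), dim(dmatrix2)),  # same dimensions
    nrow(dmatrix1) == ncol(dmatrix1),         # distance matrices are square
    num_k >= 1, num_k <= nrow(dmatrix1),      # k medoids chosen among objects
    neighborhood >= 0, neighborhood <= 1      # fraction of the population
  )
  # Documented rule: neighborhood * pop_size.
  # ASSUMPTION: rounded to the nearest integer, with a minimum of 1.
  max(1, round(neighborhood * pop_size))
}

set.seed(1)
x <- matrix(runif(50 * 20), nrow = 50)
d1 <- as.matrix(dist(x, method = "euclidean"))
d2 <- as.matrix(dist(x, method = "manhattan"))

# With the defaults pop_size = 10 and neighborhood = 0.1
nb_size <- check_moc_inputs(d1, d2, num_k = 5)
```

With the default values, the neighborhood size is 0.1 * 10 = 1 individual.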
Details
MOC-GaPBK is a method proposed by Parraga-Alava et al. (2018). It discovers clusters using the NSGA-II algorithm together with Path-Relinking (PR) and Pareto Local Search (PLS) as intensification and diversification strategies, respectively. The objective functions are two versions of the Xie-Beni validity index, one for each distance matrix (dmatrix1, dmatrix2). More details about the computation can be found in: <https://doi.org/10.1186/s13040-018-0178-4>. MOC-GaPBK yields a set of the best clustering solutions from a multi-objective point of view.
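To illustrate how one individual (a set of k medoids) can be scored once per distance matrix, here is a minimal sketch of a crisp, medoid-based Xie-Beni index. It is an assumption-laden illustration, not the package's code: xie_beni is a hypothetical function, the exact formulation used by MOC-GaPBK may differ, and the Manhattan distance is only a stand-in for a second metric. Lower values indicate more compact and better separated clusters.

```r
# Hypothetical sketch (NOT the package's implementation): Xie-Beni index for
# a crisp partition defined by medoids, computed from a distance matrix.
xie_beni <- function(dmatrix, medoids) {
  stopifnot(length(medoids) >= 2)
  # Distance from every object to every medoid (n x k matrix)
  d_to_med <- dmatrix[, medoids, drop = FALSE]
  # Each object is assigned to its nearest medoid
  nearest <- apply(d_to_med, 1, min)
  compactness <- sum(nearest^2)
  # Smallest squared distance between two distinct medoids
  d_med <- dmatrix[medoids, medoids]
  separation <- min(d_med[upper.tri(d_med)]^2)
  compactness / (nrow(dmatrix) * separation)
}

set.seed(42)
x <- matrix(runif(50 * 20, min = -5, max = 10), nrow = 50, ncol = 20)
d1 <- as.matrix(dist(x, method = "euclidean"))
d2 <- as.matrix(dist(x, method = "manhattan"))  # stand-in second metric

medoids <- c(3, 11, 24, 37, 45)  # hypothetical individual with k = 5
# One objective value per distance matrix, both to be minimized
objectives <- c(f1 = xie_beni(d1, medoids), f2 = xie_beni(d2, medoids))
```

The two values in objectives play the role of the two objective functions that a multi-objective search would minimize simultaneously.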
Value
population |
The population of medoids, including the objective function values, ordered by Pareto ranking and crowding distance. |
matrix.solutions |
A matrix with results of clustering. Each column represents a clustering solution available in Pareto front. |
clustering |
A list containing named vectors of integers from 1:k representing the cluster to which each object is assigned. |
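Since the returned solutions are organized by Pareto ranking, it may help to see what membership in the Pareto front means. The sketch below is illustrative only: non_dominated is a hypothetical function, the objective values are made up rather than package output, and it identifies only the first front (it does not reproduce the full NSGA-II ranking or the crowding distance). Both objectives are assumed to be minimized.

```r
# Hypothetical sketch: flags the rows of `obj` (one row per solution, one
# column per objective, all minimized) that no other row dominates.
non_dominated <- function(obj) {
  n <- nrow(obj)
  vapply(seq_len(n), function(i) {
    dominated <- vapply(seq_len(n), function(j) {
      # j dominates i: no worse in every objective, strictly better in one
      all(obj[j, ] <= obj[i, ]) && any(obj[j, ] < obj[i, ])
    }, logical(1))
    !any(dominated)
  }, logical(1))
}

# Made-up objective values for five solutions (NOT package output)
obj <- rbind(c(0.20, 0.90),
             c(0.35, 0.40),
             c(0.50, 0.45),
             c(0.80, 0.10),
             c(0.90, 0.95))
front <- non_dominated(obj)
which(front)  # indices of the solutions on the first Pareto front
```

In this toy data, solutions 3 and 5 are each dominated by another solution, so only solutions 1, 2 and 4 remain on the front.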
Author(s)
Jorge Parraga-Alava, Marcio Dorn, Mario Inostroza-Ponta
References
J. Parraga-Alava, M. Dorn, M. Inostroza-Ponta (2018). A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies. BioData Mining. 11(1) 1-16.
K. Deb, A. Pratap, S. Agarwal, T. Meyarivan (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2) 182-197.
F. Glover (1997). Tabu Search and Adaptive Memory Programming - Advances, Applications and Challenges. Interfaces in Computer Science and Operations Research: Advances in Metaheuristics, Optimization, and Stochastic Modeling Technologies. 1-75.
J. Dubois-Lacoste, M. Lopez-Ibanez, T. Stutzle (2015). Anytime Pareto local search. European Journal of Operational Research, 243(2) 369-385.
Examples
## Generate a data matrix of dimension 50 x 20
library("amap")
library("moc.gapbk")
x <- matrix(runif(50 * 20, min = -5, max = 10), nrow = 50, ncol = 20)
## Compute two distance matrices
dmatrix1 <- as.matrix(amap::Dist(x, method = "euclidean"))
dmatrix2 <- as.matrix(amap::Dist(x, method = "correlation"))
## Perform MOC-GaPBK with 5 clusters
example <- moc.gabk(dmatrix1, dmatrix2, 5)
example$population
example$matrix.solutions
example$clustering