MESDIF {SDEFSR} | R Documentation |
Multiobjective Evolutionary Subgroup DIscovery Fuzzy rules (MESDIF) Algorithm
Description
Performs a subgroup discovery task executing the MESDIF algorithm.
Usage
MESDIF(
paramFile = NULL,
training = NULL,
test = NULL,
output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
seed = 0,
nLabels = 3,
nEval = 10000,
popLength = 100,
eliteLength = 3,
crossProb = 0.6,
mutProb = 0.01,
RulesRep = "can",
Obj1 = "CSUP",
Obj2 = "CCNF",
Obj3 = "null",
Obj4 = "null",
targetVariable = NA,
targetClass = "null"
)
Arguments
paramFile |
The path of the parameters file. |
training |
A SDEFSR_Dataset object with the training data. |
test |
A SDEFSR_Dataset object with the test data. |
output |
Character vector with the paths where the information file, rules file and quality measures file are stored, respectively. |
seed |
An integer to set the seed used to generate random numbers. |
nLabels |
Number of linguistic labels used to represent numerical variables. |
nEval |
An integer to set the maximum number of evaluations in the evolutionary process. Large values of this parameter increase the computing time. |
popLength |
An integer to set the number of individuals in the population. |
eliteLength |
An integer to set the number of individuals in the elite population. |
crossProb |
Sets the crossover probability. A number in [0,1]. |
mutProb |
Sets the mutation probability. A number in [0,1]. |
RulesRep |
Representation used in the rules. "can" for canonical rules, "dnf" for DNF rules. |
Obj1 |
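Sets objective number 1. See the "Objective values" section. |
Obj2 |
Sets objective number 2. See the "Objective values" section. |
Obj3 |
Sets objective number 3. See the "Objective values" section. |
Obj4 |
Sets objective number 4. See the "Objective values" section. |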
targetVariable |
The name or index position of the target variable (or class). It must be a categorical one. |
targetClass |
A string specifying the value of the target variable. |
Details
By default, this function uses the last variable of the SDEFSR_Dataset object as the target variable. To use a different one, set the targetVariable argument.
The target variable MUST be categorical; if it is not, the function throws an error. The default behaviour is to find rules for all possible values of the target variable. Set targetClass to a single value of the target variable to make the algorithm find rules only for that value.
If paramFile is not NULL, the rest of the parameters are ignored and the algorithm tries to read the specified file. See "Parameters file structure" below if you want to use a parameters file.
Value
The algorithm shows the following results in the console:
The parameters used in the algorithm.
The rules generated.
The quality measures on test data for every rule and the global results. The global results show the number of rules generated and the mean of each quality measure.
The algorithm also saves these results in the files specified in the output parameter, or in the outputData parameter of the parameters file.
Additionally, a SDEFSR_Rules object with this information is returned.
How does this algorithm work?
This algorithm is a multi-objective genetic algorithm based on elitism (following the SPEA2 approach). The elite population has a fixed size and is filled with non-dominated individuals.
An individual is non-dominated when no other individual satisfies all(ObjI1 <= ObjI2) & any(ObjI1 < ObjI2), where ObjI1 is the vector of objective values of our individual and ObjI2 is the vector of objective values of the other individual.
The number of individuals dominated by each one, together with a niching technique that considers the proximity among objective values, determines the fitness value used for selection.
The number of non-dominated individuals may be greater or smaller than the elite population size; in those cases MESDIF applies a truncation operator or a fill operator, respectively. Then, genetic operators are applied.
At the end of the evolutionary process, the algorithm returns the rules stored in the elite population. Therefore, the number of rules is fixed by the eliteLength parameter.
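The dominance test described above can be sketched in base R. This is only an illustration on toy objective values, assuming all objectives are maximised; the names objs, dominated_by, dom and elite_length are hypothetical and are not part of SDEFSR, and the niching-based fitness and the actual truncation and fill operators are not reproduced here.

```r
# Rows are individuals, columns are objectives (assumed to be maximised).
# Toy values for illustration, not produced by MESDIF.
objs <- rbind(
  c(0.9, 0.2),
  c(0.5, 0.5),
  c(0.4, 0.4),
  c(0.2, 0.9),
  c(0.9, 0.1)
)

# TRUE when the individual with objectives 'b' dominates the one with 'a'.
dominated_by <- function(a, b) all(a <= b) && any(a < b)

n <- nrow(objs)

# dom[i, j] is TRUE when individual j dominates individual i.
dom <- outer(
  seq_len(n), seq_len(n),
  Vectorize(function(i, j) dominated_by(objs[i, ], objs[j, ]))
)

# Individuals that no other individual dominates.
non_dominated <- which(rowSums(dom) == 0)

# Number of individuals that each individual dominates.
n_dominated <- colSums(dom)

# Decide which operator would be needed for a hypothetical elite size.
elite_length <- 2
operator <- if (length(non_dominated) > elite_length) {
  "truncation"
} else if (length(non_dominated) < elite_length) {
  "fill"
} else {
  "none"
}
```

With these toy values there are more non-dominated individuals than elite slots, so a truncation step would be required.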
Parameters file structure
The paramFile argument points to a file with the parameters that MESDIF needs to work.
This file must contain, at least, the following parameters (one per line):
- algorithm: Specifies the algorithm to execute. In this case, "MESDIF".
- inputData: Specifies two paths of KEEL files for training and test. If only the file name is given, the path will be the working directory.
- seed: Sets the seed for the random number generator.
- nLabels: Sets the number of fuzzy labels to create when reading the files.
- nEval: Sets the maximum number of rule evaluations before the genetic process stops.
- popLength: Sets the number of individuals of the main population.
- eliteLength: Sets the number of individuals of the elite population. Must be less than popLength.
- crossProb: Crossover probability of the genetic algorithm. Value in [0,1].
- mutProb: Mutation probability of the genetic algorithm. Value in [0,1].
- Obj1: Sets objective number 1.
- Obj2: Sets objective number 2.
- Obj3: Sets objective number 3.
- Obj4: Sets objective number 4.
- RulesRep: Representation of each chromosome of the population. "can" for canonical representation, "dnf" for DNF representation.
- targetVariable: The name or index position of the target variable (or class). It must be a categorical one.
- targetClass: Value of the target variable to search subgroups for. The target variable is always the last variable. Use null to search for every value of the target variable.
An example of a parameters file could be:
algorithm = MESDIF
inputData = "irisd-10-1tra.dat" "irisd-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 0
nLabels = 3
nEval = 500
popLength = 100
eliteLength = 3
crossProb = 0.6
mutProb = 0.01
RulesRep = can
Obj1 = comp
Obj2 = unus
Obj3 = null
Obj4 = null
targetClass = Iris-setosa
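A parameters file like the example above can also be written from R. The following is a minimal sketch using only base R; params, param_lines and param_path are hypothetical names chosen for illustration, and the objectives are set to csup and ccnf as an arbitrary choice. The call to MESDIF is left commented out because it needs the SDEFSR package and the KEEL files named in inputData to be available.

```r
# Hypothetical named vector with one entry per parameter (name = value).
params <- c(
  algorithm   = "MESDIF",
  inputData   = '"irisd-10-1tra.dat" "irisd-10-1tst.dat"',
  outputData  = '"irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"',
  seed        = "0",
  nLabels     = "3",
  nEval       = "500",
  popLength   = "100",
  eliteLength = "3",
  crossProb   = "0.6",
  mutProb     = "0.01",
  RulesRep    = "can",
  Obj1        = "csup",
  Obj2        = "ccnf",
  Obj3        = "null",
  Obj4        = "null",
  targetClass = "Iris-setosa"
)

# Build "name = value" lines, one parameter per line.
param_lines <- paste(names(params), params, sep = " = ")

# Write the file to a temporary location (any writable path would do).
param_path <- file.path(tempdir(), "mesdif_params.txt")
writeLines(param_lines, param_path)

# With SDEFSR loaded and the input files in place, the file could be used as:
# MESDIF(paramFile = param_path)
```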
Objective values
You can use the following quality measures in the ObjX parameters of the parameters file, using these values:
Unusualness -> unus
Crisp Support -> csup
Crisp Confidence -> ccnf
Fuzzy Support -> fsup
Fuzzy Confidence -> fcnf
Coverage -> cove
Significance -> sign
If you do not want to use an objective, you must specify null.
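To give an intuition for what some of these measures capture, the sketch below computes coverage, crisp confidence, crisp support and unusualness for a single rule on toy data. It assumes common definitions from the subgroup discovery literature; the exact formulas used by SDEFSR may differ (some authors, for instance, normalise support by the number of examples of the target class instead of the total). The vectors covered and is_target are hypothetical.

```r
# Toy data: 'covered' marks examples that match the rule antecedent,
# 'is_target' marks examples that belong to the target class.
covered   <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)
is_target <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)

n <- length(covered)

# Coverage: fraction of all examples covered by the rule.
coverage <- sum(covered) / n

# Crisp confidence: fraction of covered examples that are of the target class.
confidence <- sum(covered & is_target) / sum(covered)

# Crisp support (assumed definition): correctly covered examples over all examples.
support <- sum(covered & is_target) / n

# Unusualness (weighted relative accuracy): coverage times the gain in
# confidence over the prior probability of the target class.
unusualness <- coverage * (confidence - mean(is_target))
```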
References
Berlanga, F., Del Jesus, M., Gonzalez, P., Herrera, F., & Mesonero, M. (2006). Multiobjective Evolutionary Induction of Subgroup Discovery Fuzzy Rules: A Case Study in Marketing.
Zitzler, E., Laumanns, M., & Thiele, L. (2001). SPEA2: Improving the Strength Pareto Evolutionary Algorithm.
Examples
MESDIF( paramFile = NULL,
training = habermanTra,
test = habermanTst,
output = c(NA, NA, NA),
seed = 0,
nLabels = 3,
nEval = 300,
popLength = 100,
eliteLength = 3,
crossProb = 0.6,
mutProb = 0.01,
RulesRep = "can",
Obj1 = "CSUP",
Obj2 = "CCNF",
Obj3 = "null",
Obj4 = "null",
targetClass = "positive"
)
## Not run:
# Execution for all classes, see 'targetClass' parameter
MESDIF( paramFile = NULL,
training = habermanTra,
test = habermanTst,
output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
seed = 0,
nLabels = 3,
nEval = 300,
popLength = 100,
eliteLength = 3,
crossProb = 0.6,
mutProb = 0.01,
RulesRep = "can",
Obj1 = "CSUP",
Obj2 = "CCNF",
Obj3 = "null",
Obj4 = "null",
targetClass = "null"
)
## End(Not run)