NMEEF_SD {SDEFSR}    R Documentation

Non-dominated Multi-objective Evolutionary algorithm for Extracting Fuzzy rules in Subgroup Discovery (NMEEF-SD)

Description

Performs a subgroup discovery task by executing the NMEEF-SD algorithm.

Usage

NMEEF_SD(
  paramFile = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  nEval = 10000,
  popLength = 100,
  mutProb = 0.1,
  crossProb = 0.6,
  Obj1 = "CSUP",
  Obj2 = "CCNF",
  Obj3 = "null",
  minCnf = 0.6,
  reInitCoverage = "yes",
  porcCob = 0.5,
  StrictDominance = "yes",
  targetVariable = NA,
  targetClass = "null"
)

Arguments

paramFile

The path of the parameters file. Use NULL if you want to pass the training and test SDEFSR_Dataset variables directly.

training

An SDEFSR_Dataset class variable with training data.

test

An SDEFSR_Dataset class variable with test data.

output

A character vector with the paths where the information file, the rules file and the test quality measures file are stored, respectively.

seed

An integer to set the seed used to generate random numbers.

nLabels

Number of linguistic labels for numerical variables.

nEval

An integer to set the maximum number of evaluations in the evolutionary process.

popLength

An integer to set the number of individuals in the population.

mutProb

Sets the mutation probability. A number in [0,1].

crossProb

Sets the crossover probability. A number in [0,1].

Obj1

Sets the Objective number 1. See Objective values for more information about the possible values.

Obj2

Sets the Objective number 2. See Objective values for more information about the possible values.

Obj3

Sets the Objective number 3. See Objective values for more information about the possible values.

minCnf

Sets the minimum confidence that a rule in the Pareto front must have to be returned. A number in [0,1].

reInitCoverage

Sets whether the algorithm must perform the reinitialization based on coverage when it is needed. A string with "yes" or "no".

porcCob

Sets the maximum percentage of variables that participate in the rules generated in the reinitialization based on coverage. A number in [0,1].

StrictDominance

Sets whether the comparison between individuals must be done by strict dominance or not. A string with "yes" or "no".

targetVariable

The name or index position of the target variable (or class). It must be a categorical one.

targetClass

A string specifying the value of the target variable. Use "null" to search for all possible values.

Details

By default, this function uses the last variable in the SDEFSR_Dataset object as the target variable. To use a different one, set the targetVariable argument. The target variable MUST be categorical; if it is not, the function throws an error. Also, the default behaviour is to find rules for all possible values of the target variable. Setting targetClass to one value of the target variable makes the algorithm find rules only for that value.

If paramFile is set to anything other than NULL, the rest of the parameters are ignored and the algorithm tries to read the specified file. See "Parameters file structure" below if you want to use a parameters file.

Value

The algorithm shows the following results in the console:

  1. The parameters used in the algorithm.

  2. The rules generated.

  3. The quality measures for test of every rule and the global results.

Also, the algorithm saves those results in the files specified in the output parameter of the function or in the outputData parameter of the parameters file.

How does this algorithm work?

NMEEF-SD is a multi-objective genetic algorithm based on the NSGA-II approach. The algorithm first performs a selection based on binary tournament and saves the selected individuals in an offspring population. Then, NMEEF-SD applies the genetic operators to the individuals in the offspring population.
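The binary tournament step can be illustrated with a minimal base R sketch. The function binary_tournament and the single fitness score per individual are assumptions made for illustration; they are not part of the SDEFSR API, and the package may compare individuals differently (for instance by dominance rank).

```r
# Hypothetical helper, not part of SDEFSR: binary tournament selection
# over a vector of scores where higher is better.
binary_tournament <- function(fitness, n_select) {
  n <- length(fitness)
  winners <- integer(n_select)
  for (i in seq_len(n_select)) {
    pair <- sample.int(n, 2)                      # two distinct competitors
    winners[i] <- pair[which.max(fitness[pair])]  # keep the better one
  }
  winners
}

set.seed(1)
fitness <- c(0.2, 0.9, 0.5, 0.7)
offspring_idx <- binary_tournament(fitness, n_select = 4)
```

Because the two competitors are always distinct, the individual with the lowest score can never win a tournament in this sketch.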

To generate the population that takes part in the next iteration of the evolutionary process, NMEEF-SD calculates the dominance among all individuals (the main population joined with the offspring) and then applies the NSGA-II fast sorting algorithm to order the population by fronts of dominance: the first front is the non-dominated (or Pareto) front, the second contains the individuals dominated by one individual, the third those dominated by two, and so on.
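As a rough illustration of sorting by fronts, the following base R sketch assigns a front number to each individual. It is an assumption-laden sketch: dominates and nondominated_fronts are hypothetical names, all objectives are assumed to be maximised, the strict flag is only one possible reading of the StrictDominance parameter, and fronts are obtained by repeatedly peeling off the non-dominated individuals (the usual NSGA-II formulation) with a plain quadratic loop instead of the bookkeeping of the fast sort.

```r
# Hypothetical helpers, not part of SDEFSR.
# TRUE if a dominates b, with all objectives maximised.
# strict = TRUE: a is better in every objective.
# strict = FALSE: a is at least as good everywhere and better somewhere.
dominates <- function(a, b, strict = TRUE) {
  if (strict) all(a > b) else all(a >= b) && any(a > b)
}

# objs: numeric matrix, one row per individual, one column per objective.
# Returns the front index of each row (1 = Pareto front).
nondominated_fronts <- function(objs, strict = TRUE) {
  n <- nrow(objs)
  front <- integer(n)   # 0 means "not assigned yet"
  current <- 1L
  while (any(front == 0L)) {
    remaining <- which(front == 0L)
    is_dominated <- vapply(remaining, function(i) {
      any(vapply(remaining, function(j) {
        j != i && dominates(objs[j, ], objs[i, ], strict)
      }, logical(1)))
    }, logical(1))
    front[remaining[!is_dominated]] <- current
    current <- current + 1L
  }
  front
}

objs <- rbind(c(0.9, 0.2), c(0.5, 0.5), c(0.4, 0.4), c(0.1, 0.1))
fronts <- nondominated_fronts(objs)
```

In this example the first two individuals do not dominate each other, so both land in the first front; the third is dominated by the second, and the fourth by both the second and the third.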

To promote diversity, NMEEF-SD has a mechanism that reinitializes the population based on coverage if the Pareto front does not evolve during a 5

At the end of the evolutionary process, the algorithm returns only the individuals in the Pareto front whose confidence is greater than a minimum confidence level.
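The final filtering step can be pictured with a small base R sketch. The data frame pareto, its columns and the rule names are made up for illustration; whether the threshold itself is included is not stated here, so the comparison below (>=) is an assumption.

```r
# Hypothetical Pareto front: one row per rule with its confidence.
pareto <- data.frame(
  rule = c("r1", "r2", "r3"),
  confidence = c(0.85, 0.55, 0.72),
  stringsAsFactors = FALSE
)

min_cnf <- 0.6   # plays the role of the minCnf argument

# Keep only the rules that reach the minimum confidence.
returned <- pareto[pareto$confidence >= min_cnf, ]
```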

Parameters file structure

The paramFile argument points to a file that contains the parameters NMEEF-SD needs to run. This file must contain, at least, those parameters (separated by a carriage return):

An example of a parameters file could be:

algorithm = NMEEFSD
inputData = "irisd-10-1tra.dat" "irisd-10-1tra.dat" "irisD-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 1
RulesRep = can
nLabels = 3
nEval = 500
popLength = 51
crossProb = 0.6
mutProb = 0.1
ReInitCob = yes
porcCob = 0.5
Obj1 = comp
Obj2 = unus
Obj3 = null
minCnf = 0.6
StrictDominance = yes
targetClass = Iris-setosa
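To make the "key = value" layout concrete, here is a minimal base R sketch that reads such lines into a named list. The function read_params is hypothetical and is not the parser SDEFSR uses internally; it keeps every value as a string and does not split multi-path entries such as inputData.

```r
# Hypothetical reader for "key = value" lines, not part of SDEFSR.
read_params <- function(lines) {
  lines <- trimws(lines)
  pos <- regexpr("=", lines, fixed = TRUE)   # position of the first "="
  keep <- pos > 0                            # skip blank or malformed lines
  lines <- lines[keep]
  pos <- pos[keep]
  keys <- trimws(substr(lines, 1, pos - 1))
  values <- trimws(substr(lines, pos + 1, nchar(lines)))
  stats::setNames(as.list(values), keys)
}

example_lines <- c(
  "algorithm = NMEEFSD",
  "seed = 1",
  "nLabels = 3",
  "Obj1 = comp",
  "targetClass = Iris-setosa"
)
params <- read_params(example_lines)
seed_value <- as.integer(params$seed)
```

To read an actual file, the lines could be obtained with readLines(path) before calling the helper.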

Objective values

You can use the following quality measures in the ObjX value of the parameters file using these values:

If you do not want to use an objective, you must specify null.

References

Carmona, C., Gonzalez, P., del Jesus, M., & Herrera, F. (2010). NMEEF-SD: Non-dominated Multi-objective Evolutionary algorithm for Extracting Fuzzy rules in Subgroup Discovery.

Examples

 
NMEEF_SD(paramFile = NULL,
         training = habermanTra,
         test = habermanTst,
         output = c(NA, NA, NA),
         seed = 0,
         nLabels = 3,
         nEval = 300,
         popLength = 100,
         mutProb = 0.1,
         crossProb = 0.6,
         Obj1 = "CSUP",
         Obj2 = "CCNF",
         Obj3 = "null",
         minCnf = 0.6,
         reInitCoverage = "yes",
         porcCob = 0.5,
         StrictDominance = "yes",
         targetClass = "positive"
         )
## Not run: 
NMEEF_SD(paramFile = NULL,
         training = habermanTra,
         test = habermanTst,
         output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
         seed = 0,
         nLabels = 3,
         nEval = 300,
         popLength = 100,
         mutProb = 0.1,
         crossProb = 0.6,
         Obj1 = "CSUP",
         Obj2 = "CCNF",
         Obj3 = "null",
         minCnf = 0.6,
         reInitCoverage = "yes",
         porcCob = 0.5,
         StrictDominance = "yes",
         targetClass = "null"
         )
     
## End(Not run)

[Package SDEFSR version 0.7.22 Index]