GenAlg {GenAlgo}    R Documentation

A generic Genetic Algorithm for feature selection

Description

These functions allow you to initialize (GenAlg) and iterate (newGeneration) a genetic algorithm to perform feature selection for binary class prediction in the context of gene expression microarrays or other high-throughput technologies.

Usage

GenAlg(data, fitfun, mutfun, context, pm=0.001, pc=0.5, gen=1)
newGeneration(ga)
popDiversity(ga)

Arguments

data

The initial population of potential solutions, in the form of a data matrix with one individual per row.

fitfun

A function to compute the fitness of an individual solution. Must take two input arguments: the individual to evaluate (one row of the population matrix; for feature selection, a vector of row indices into the data set) and a context list within which any other items required by the function can be resolved. Must return a real number; higher values indicate better fitness, with the maximum fitness occurring at the optimal solution to the underlying numerical problem.

mutfun

A function to mutate individual alleles in the population. Must take two arguments: the starting allele and a context list as in the fitness function.

context

A list of additional data required to perform mutation or to compute fitness. This list is passed along as the second argument when fitfun and mutfun are called.

pm

A real value between 0 and 1, representing the probability that an individual allele will be mutated.

pc

A real value between 0 and 1, representing the probability that crossover will occur during reproduction.

gen

An integer identifying the current generation.

ga

An object of class GenAlg.
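
The calling conventions for fitfun and mutfun can be illustrated with a pair of toy functions. This is only a sketch: the names toyContext, toyFitness and toyMutate are hypothetical and not part of the package, and it assumes that each individual is a vector of row indices into a data matrix stored in the context list, as in the Examples section. The functions are called directly here, so the code runs without the package.

```r
# Hypothetical context: a small data matrix (features in rows, samples in
# columns) and a 0/1 group label for each sample.
set.seed(1)
toyContext <- list(dataset = matrix(rnorm(20 * 10), nrow = 20),
                   gps = rep(c(0, 1), each = 5))

# Hypothetical fitness function: sum, over the selected features, of the
# absolute difference between the two group means. Larger is better.
toyFitness <- function(individual, context) {
  selected <- context$dataset[individual, , drop = FALSE]
  m0 <- rowMeans(selected[, context$gps == 0, drop = FALSE])
  m1 <- rowMeans(selected[, context$gps == 1, drop = FALSE])
  sum(abs(m1 - m0))
}

# Hypothetical mutation function: ignore the starting allele and draw a
# random feature index from the data set in the context.
toyMutate <- function(allele, context) {
  sample(nrow(context$dataset), 1)
}

individual <- c(3, 7, 12)                           # three selected features
fit <- toyFitness(individual, toyContext)           # a single real number
newAllele <- toyMutate(individual[1], toyContext)   # an index in 1..20
```

Functions of this shape could then be passed as the fitfun and mutfun arguments, with toyContext as context.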

Value

Both the GenAlg generator and the newGeneration function return a GenAlg-class object. The popDiversity function returns a real number representing the average diversity of the population, where the diversity of two individuals is the number of alleles (selected features) in which they differ.
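
The diversity measure described above can be sketched in a few lines. This is an illustration of the definition, not the package's implementation: the helper names pairDiversity and meanDiversity are hypothetical, and the sketch assumes that the average is taken over all distinct pairs of individuals and that alleles are unique within an individual.

```r
# Number of alleles in individual `a` that do not appear in individual `b`
# (assumes alleles are unique within an individual).
pairDiversity <- function(a, b) sum(!(a %in% b))

# Average of pairDiversity over all distinct pairs of rows of `pop`.
meanDiversity <- function(pop) {
  pairs <- combn(nrow(pop), 2)   # each column is one pair of row numbers
  mean(apply(pairs, 2, function(ij) {
    pairDiversity(pop[ij[1], ], pop[ij[2], ])
  }))
}

# Toy population: three individuals, three selected features each.
pop <- rbind(c(1, 2, 3),
             c(1, 2, 4),
             c(5, 6, 7))
avgDiv <- meanDiversity(pop)
```

For this toy population the pairwise counts are 1, 3 and 3, so the average is 7/3. A value near zero would indicate that the population has collapsed onto nearly identical feature sets.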

Author(s)

Kevin R. Coombes krc@silicovore.com, P. Roebuck proebuck@mdanderson.org

See Also

GenAlg-class, GenAlg-tools, maha.

Examples

# load the package (needed when running this code outside the help system)
library(GenAlgo)

# generate some fake data
nFeatures <- 1000
nSamples <- 50
fakeData <- matrix(rnorm(nFeatures*nSamples), nrow=nFeatures, ncol=nSamples)
fakeGroups <- sample(c(0,1), nSamples, replace=TRUE)
myContext <- list(dataset=fakeData, gps=fakeGroups)

# initialize population
n.individuals <- 200
n.features <- 9
y <- matrix(0, n.individuals, n.features)
for (i in 1:n.individuals) {
  y[i,] <- sample(1:nrow(fakeData), n.features)
}

# set up the genetic algorithm
my.ga <- GenAlg(y, selectionFitness, selectionMutate, myContext, 0.001, 0.75)

# advance one generation
my.ga <- newGeneration(my.ga)


[Package GenAlgo version 2.2.0 Index]