genAlg {bingat}	R Documentation

Find Edges Separating Two Groups using Genetic Algorithm (GA)

Description

GA-Mantel is a fully multivariate method that uses a genetic algorithm to search over possible edge subsets using the Mantel correlation as the scoring measure for assessing the quality of any given edge subset.

Usage

	genAlg(data, covars, iters = 50, popSize = 200, earlyStop = 0, 
		dataDist = "manhattan", covarDist = "gower", verbose = FALSE, 
		plot = TRUE, minSolLen = NULL, maxSolLen = NULL)

Arguments

data

A matrix of edges (rows) for each sample (columns).

covars

A matrix of covariates (columns) for each sample (rows).

iters

The number of iterations of the GA to run.

popSize

The number of solutions to test on each iteration.

earlyStop

The number of consecutive iterations without a better solution after which the GA stops, regardless of the number of iterations remaining. A value of '0' disables early stopping.

dataDist

The distance metric to use for the data. This can only be "manhattan" for now.

covarDist

The distance metric to use for the covariates. Either "euclidean" or "gower".

verbose

If 'TRUE', the current status of the GA is printed periodically.

plot

If 'TRUE', the scoring statistics are plotted by iteration.

minSolLen

The minimum number of edges to select.

maxSolLen

The maximum number of edges to select.
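
As an illustration of the stopping rule described for earlyStop, the following is a minimal sketch and not the package's internal code. The function name shouldStopAt and the score vector are hypothetical; it only assumes that the GA tracks the best score seen so far and counts consecutive iterations without improvement.

```r
# Hypothetical best score observed at each iteration of a run
bestPerIter <- c(0.10, 0.15, 0.15, 0.22, 0.22, 0.22, 0.22, 0.30)

# Return the iteration at which the run would stop early, or NA if it never would.
# (Illustrative helper, not part of bingat.)
shouldStopAt <- function(scores, earlyStop) {
  if (earlyStop <= 0) {
    return(NA_integer_)  # a value of 0 disables early stopping
  }
  best <- -Inf
  stale <- 0  # consecutive iterations without a better solution
  for (i in seq_along(scores)) {
    if (scores[i] > best) {
      best <- scores[i]
      stale <- 0
    } else {
      stale <- stale + 1
    }
    if (stale >= earlyStop) {
      return(i)
    }
  }
  NA_integer_
}

stopIter <- shouldStopAt(bestPerIter, earlyStop = 3)
neverStops <- shouldStopAt(bestPerIter, earlyStop = 0)
```

With these made-up scores, a run with earlyStop = 3 would end before reaching the later improvement at iteration 8, which is the trade-off to weigh when choosing a small value.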

Details

Uses a GA to find edges that separate subjects based on group membership or a set of covariates.

The data and covariates should be normalized BEFORE use with this function because the results of the distance calculations depend on the scale of the variables.
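
The source does not say which normalization to use, so the following is only a sketch of two common choices in base R. The covariate values and the names covarsNorm and rangeNorm are hypothetical.

```r
# Hypothetical covariates: 6 samples (rows) by 2 covariates (columns)
covars <- cbind(age = c(23, 35, 47, 51, 62, 70),
                group = c(0, 0, 0, 1, 1, 1))

# Option 1: center each column to mean 0 and scale to standard deviation 1
covarsNorm <- scale(covars)

# Option 2: rescale each column to the [0, 1] range
rangeNorm <- apply(covars, 2, function(x) (x - min(x)) / (max(x) - min(x)))
```

Either result keeps samples in rows and covariates in columns, which is the layout the covars argument expects.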

This function uses modified code from the rbga function in the genalg package.

Because the GA evaluates combinations of edges and uses the raw data, edges with a small difference between groups may be selected while edges with a large difference may not be.

The distance calculations use the vegdist function from the vegan package.
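
To make the scoring idea concrete, here is a minimal sketch of a Mantel-style score for one candidate edge subset. It is not the package's implementation: it swaps in the base R dist function for vegdist to avoid extra dependencies, uses "euclidean" for the covariates because base dist has no Gower option, and the toy data and the helper name mantelScore are hypothetical.

```r
# Hypothetical toy data: 3 edges (rows) by 8 samples (columns)
group <- rep(c(0, 1), each = 4)
data <- rbind(
  edgeA = group,                      # matches group membership exactly
  edgeB = rep(c(0, 1), times = 4),    # alternates, unrelated to group
  edgeC = c(1, 1, 0, 0, 1, 1, 0, 0)
)
covars <- matrix(group, ncol = 1)     # samples in rows

# Correlation between sample-to-sample distances computed from the chosen
# edges and sample-to-sample distances computed from the covariates.
# (Illustrative helper, not part of bingat.)
mantelScore <- function(data, covars, edgeIdx) {
  # Samples are columns of data, so transpose before taking distances
  dataD <- dist(t(data[edgeIdx, , drop = FALSE]), method = "manhattan")
  covarD <- dist(covars, method = "euclidean")
  cor(as.vector(dataD), as.vector(covarD))
}

scoreA <- mantelScore(data, covars, 1)         # informative edge only
scoreB <- mantelScore(data, covars, 2)         # uninformative edge only
scoreAB <- mantelScore(data, covars, c(1, 2))  # both edges together
```

In this toy case the subset containing only the informative edge scores highest, and adding the uninformative edge lowers the score. A GA search would compare many such subsets by a score of this kind.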

Value

A list containing

scoreSumm

A matrix summarizing the scores of the population. This can be used to assess whether the GA has converged on a final solution. This data is also plotted if plot is 'TRUE'.

solutions

The final set of solutions, sorted with the highest scoring first.

scores

The scores for the final set of solutions.

time

How long, in seconds, the GA took to run.

selected

The selected edges by name.

nonSelected

The edges that were NOT selected by name.

selectedIndex

The selected edges by row number.

Author(s)

Sharina Carter, Elena Deych, Berkley Shands, William D. Shannon

Examples

	## Not run: 
		data(braingraphs)
		
		### Set covars to just be group membership
		covars <- matrix(c(rep(0, 19), rep(1, 19)))
		
		### We use low numbers for speed. The exact numbers to use depend
		### on the data being used, but generally the higher iters and popSize 
		### the longer it will take to run.  earlyStop is then used to stop the
		### run early if the results aren't improving.
		iters <- 500
		popSize <- 200
		earlyStop <- 250
		
		gaRes <- genAlg(braingraphs, covars, iters, popSize, earlyStop)
	
## End(Not run)

[Package bingat version 1.3 Index]