R: Genetic algorithm for generating correlated binary data

iter_matrix {RepeatedHighDim}

R Documentation

Genetic algorithm for generating correlated binary data

Description

Starts the genetic algorithm based on a start matrix with specified marginal probabilities.

Usage

iter_matrix(X0, R, T = 1000, e.min = 1e-04, plt = TRUE, perc = TRUE)

Arguments

`X0`	Start matrix with specified marginal probabilities. Can be generated by `start_matrix`.
`R`	Desired correlation matrix the data should have after running the genetic algorithm.
`T`	Maximum number of iterations after which the genetic algorithm stops.
`e.min`	Minimum error (RMSE) between the correlation matrix of the iterated data matrix and R.
`plt`	Boolean parameter that indicates whether to plot the error (RMSE) versus the iteration step.
`perc`	Boolean parameter that indicates whether to print the percentage of iteration steps relative to T.
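
The error that `e.min` refers to can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the helper name `rmse_corr` is hypothetical, and it assumes the RMSE is taken over all entries of the difference between the empirical correlation matrix and `R`.

```r
# Hypothetical helper (not part of RepeatedHighDim): RMSE between the
# empirical correlation matrix of X and a target correlation matrix R.
# Assumption: the error is averaged over all matrix entries.
rmse_corr <- function(X, R) {
  Rt <- cor(X)              # empirical correlation of the columns of X
  sqrt(mean((Rt - R)^2))    # root mean squared entrywise difference
}

set.seed(1)
X <- cbind(rbinom(200, 1, 0.5), rbinom(200, 1, 0.6))  # two binary columns
err <- rmse_corr(X, diag(2))  # error relative to an uncorrelated target
err < 1e-04                   # does this matrix already meet e.min?
```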

Details

In each step, the genetic algorithm swaps two randomly selected entries in each column of X0. This guarantees that the marginal probabilities do not change. If the correlation matrix of the resulting matrix X0(t) is closer to R than that of X0(t-1), X0(t) replaces X0(t-1).
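
The swap-and-accept step described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: `swap_step` is a hypothetical name, closeness to `R` is assumed to be measured by the RMSE over all entries of the correlation matrix, and the start matrix is built by hand rather than with `start_matrix`.

```r
# Sketch of one iteration (hypothetical helper, not the package's code).
# Assumption: "closer to R" means a smaller RMSE between cor(X) and R.
swap_step <- function(X, R) {
  rmse <- function(M) sqrt(mean((cor(M) - R)^2))
  X_new <- X
  for (j in seq_len(ncol(X))) {
    idx <- sample(nrow(X), 2)            # two random rows in column j
    X_new[idx, j] <- X_new[rev(idx), j]  # swap the two entries
  }
  # Keep the candidate only if its correlation matrix is closer to R
  if (rmse(X_new) < rmse(X)) X_new else X
}

set.seed(42)
# Hand-built start matrix with marginal probabilities 0.5 and 0.6
X0 <- cbind(rep(c(1, 0), c(50, 50)), rep(c(1, 0), c(60, 40)))
R  <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)  # target correlation matrix
X  <- X0
for (step in 1:500) X <- swap_step(X, R)

colMeans(X0)  # marginal probabilities before iterating
colMeans(X)   # identical, since swaps only reorder entries within a column
```

Because a swap only reorders entries within a column, the column means are preserved exactly, and the accept rule means the error never increases from one step to the next.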

Value

A list with four entries:

Xt: Final representative data matrix with specified marginal probabilities and a correlation as close as possible to R
t: Number of performed iteration steps (t <= T)
Rt: Empirical correlation matrix of Xt
RMSE: Final RMSE between desired and achieved correlation matrix
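
The entries of the returned list might be inspected as in the following sketch. It only runs if RepeatedHighDim is installed; the argument values follow the example on this page and are otherwise arbitrary.

```r
# Runs only when the RepeatedHighDim package is available.
if (requireNamespace("RepeatedHighDim", quietly = TRUE)) {
  library(RepeatedHighDim)
  X0  <- start_matrix(p = c(0.5, 0.6), k = 1000)
  res <- iter_matrix(X0, R = diag(2), T = 2000, e.min = 1e-04,
                     plt = FALSE, perc = FALSE)
  print(res$t)              # number of iteration steps performed (t <= T)
  print(res$RMSE)           # remaining error between res$Rt and R
  print(res$Rt)             # empirical correlation matrix of res$Xt
  print(colMeans(res$Xt))   # marginal proportions of the final matrix
}
```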

Author(s)

Jochen Kruppa, Klaus Jung

References

Kruppa, J., Lepenies, B., & Jung, K. (2018). A genetic algorithm for simulating correlated binary data from biomedical research. Computers in Biology and Medicine, 92, 1-8. doi:10.1016/j.compbiomed.2017.10.023

Examples

### Generation of the representative matrix Xt
X0 <- start_matrix(p = c(0.5, 0.6), k = 1000)
Xt <- iter_matrix(X0, R = diag(2), T = 10000, e.min = 0.00001)$Xt

### Drawing of a random sample S of size n = 10
S <- Xt[sample(1:1000, 10, replace = TRUE),]

[Package RepeatedHighDim version 2.3.0 Index]