R: Simulate a dataset with correlated measures

sim.cor {NCmisc}

R Documentation

Simulate a dataset with correlated measures

Description

Simulate a dataset with correlated measures. Independent draws from a generator such as rnorm() usually produce only small, randomly distributed correlations between variables. This is a quick and unsophisticated method, but it should provide a dataset with slightly more realistic structure than simple rnorm()-type functions. Varying the last three parameters ('genr', 'k' and 'mix.order') gives some control over how the data are generated.

The simulation starts with a seed random variable, then creates 'k' random variables with an expected correlation of r=genr() with that seed variable. Next, one of the variables generated so far (including the seed) is randomly selected as the new seed, and the same process generates 'k' more variables; this is repeated until all columns are filled. If 'mix.order' is TRUE, the column order is then randomized, which destroys the relationship between column number and correlation structure. Set it to FALSE where that relationship is wanted, as it may be representative of some real-life datasets.
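The procedure described above can be sketched in base R. This is a hypothetical re-implementation for illustration only, not the NCmisc source: the names sketch_sim_cor and make_correlated are invented here, and the mixing formula r*z + sqrt(1 - r^2)*e is one common way to obtain an expected correlation of r, assumed rather than taken from the package. Derived columns in this sketch are on a standardized scale, so they do not keep the exact distribution of 'genx'.

```r
# Hypothetical sketch of the seed-and-branch procedure; NOT the NCmisc source.
sketch_sim_cor <- function(nrow = 100, ncol = 100, genx = rnorm, genr = runif,
                           k = 3, mix.order = TRUE) {
  stopifnot(nrow >= 2, ncol >= 1, k >= 1)

  # Assumed helper: mix the standardized seed with fresh noise so that the
  # result has an expected correlation of r with the seed.
  make_correlated <- function(seed, r) {
    z <- as.numeric(scale(seed))        # standardized seed
    e <- as.numeric(scale(genx(nrow)))  # independent noise from genx
    r * z + sqrt(1 - r^2) * e
  }

  out <- matrix(NA_real_, nrow = nrow, ncol = ncol)
  out[, 1] <- genx(nrow)                # initial seed variable
  filled <- 1
  seed.col <- 1

  while (filled < ncol) {
    # generate up to k new variables from the current seed
    for (i in seq_len(min(k, ncol - filled))) {
      filled <- filled + 1
      out[, filled] <- make_correlated(out[, seed.col], genr(1))
    }
    # pick the next seed at random from all columns generated so far
    seed.col <- sample.int(filled, 1)
  }

  # optionally shuffle columns to hide the generation order
  if (mix.order) out <- out[, sample.int(ncol), drop = FALSE]
  out
}

set.seed(1)
sketch <- sketch_sim_cor(2000, 4, genr = function(n) 0.5, mix.order = FALSE)
r12 <- cor(sketch[, 1], sketch[, 2])    # should be close to the target of 0.5
```

With k = 3 and four columns, columns 2 to 4 are all generated from column 1, so each should correlate with it at roughly the requested value, subject to sampling noise.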

Usage

sim.cor(
  nrow = 100,
  ncol = 100,
  genx = rnorm,
  genr = runif,
  k = 3,
  mix.order = TRUE
)

Arguments

`nrow`	integer, number of rows to simulate
`ncol`	integer, number of columns to simulate
`genx`	the generating function for the data, e.g. rnorm(), runif(), etc.
`genr`	the generating function for the desired correlation, e.g. runif()
`k`	integer, number of variables to generate from the same seed before choosing a new seed
`mix.order`	logical, whether to randomize the column order after simulating
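As an illustration of the 'genx' and 'genr' arguments, the following sketch defines custom generators. It assumes that both are called with a count as their first argument, as the defaults rnorm and runif require; the names genx_heavy, genr_strong and genr_fixed are hypothetical and not part of NCmisc.

```r
# Hypothetical generator functions; each takes a count n like rnorm()/runif().

# Heavy-tailed data: Student's t with 5 degrees of freedom
genx_heavy <- function(n) rt(n, df = 5)

# Target correlations restricted to a moderately strong range
genr_strong <- function(n) runif(n, min = 0.6, max = 0.9)

# A constant target correlation
genr_fixed <- function(n) rep(0.5, n)

set.seed(7)
x.heavy  <- genx_heavy(100)   # 100 simulated data values
r.strong <- genr_strong(10)   # 10 target correlations in [0.6, 0.9]
r.fixed  <- genr_fixed(3)     # three copies of 0.5
```

Under that assumption, these could be passed on as sim.cor(200, 5, genx = genx_heavy, genr = genr_strong).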

Author(s)

Nicholas Cooper

Examples

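library(NCmisc) # provides sim.cor() and prv()
# default generators: normal data, r drawn from runif()
corDat <- sim.cor(200, 5)
prv(corDat) # preview of simulated normal data with r uniformly varying
cor(corDat) # correlation matrix
# uniform data, fixed r, columns kept in generation order
corDat <- sim.cor(500, 4, genx = runif, genr = function(x) { 0.5 }, mix.order = FALSE)
prv(corDat) # preview of simulated uniform data with r fixed at 0.5
cor(corDat) # correlation matrix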
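For comparison, a base-R baseline may help show why independent draws are a poor stand-in for correlated data. The sketch below uses only rnorm() and cor(); the object names are arbitrary, and the dimensions simply mirror the first example above.

```r
# Baseline: independent normal columns, same shape as the first example
set.seed(42)
indep <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)

r.indep  <- cor(indep)                   # 5 x 5 correlation matrix
off.diag <- r.indep[upper.tri(r.indep)]  # the 10 distinct pairwise correlations
max.abs  <- max(abs(off.diag))           # largest correlation in magnitude

round(range(off.diag), 2)                # spread of the chance correlations
```

The off-diagonal values here arise purely from sampling noise, so they should stay small, in contrast to the structured correlations in the output of sim.cor().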