addCorFlex {simstudy} R Documentation
Create multivariate (correlated) data - for general distributions
Description
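Adds one or more correlated columns to an existing data table. Each new column is generated from its own definition in defs, so the variables need not be normally distributed.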
Usage
addCorFlex(
dt,
defs,
rho = 0,
tau = NULL,
corstr = "cs",
corMatrix = NULL,
envir = parent.frame()
)
Arguments
dt
Data table that will be updated.
defs
Field definition table, such as one created by defDataAdd (see Examples).
rho
Correlation coefficient, -1 <= rho <= 1. Used only if corMatrix is not provided.
tau
Correlation based on Kendall's tau. If tau is specified, it is used as the correlation even if rho is also specified. If tau is NULL, the specified value of rho is used; rho defaults to 0.
corstr
Structure of the correlation matrix built from rho when corMatrix is not provided. Options include "cs" for a compound symmetry structure and "ar1" for an autoregressive structure. Defaults to "cs".
corMatrix
A correlation matrix can be entered directly. It must be symmetric and positive semi-definite. It is optional; if no matrix is provided, the matrix is built from the structure corstr and the correlation coefficient rho.
envir
Environment in which the data definitions are evaluated. Defaults to base::parent.frame().
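As an illustration of the rho, tau, corstr, and corMatrix arguments, the following base R sketch builds the two named structures by hand and checks the conditions that a user-supplied corMatrix must satisfy. The helpers cs_matrix, ar1_matrix, tau_to_rho, and is_valid_cor are hypothetical names introduced here; they are not part of simstudy. The tau conversion shown is the standard relation for a Gaussian copula, and that addCorFlex applies exactly this conversion is an assumption.

```r
# Hypothetical helpers (not simstudy functions), base R only.

# Compound symmetry: every pair of variables shares the same correlation rho.
cs_matrix <- function(n, rho) {
  m <- matrix(rho, nrow = n, ncol = n)
  diag(m) <- 1
  m
}

# AR(1): correlation decays as rho^|i - j| with the distance between positions.
ar1_matrix <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

# Standard Gaussian-copula relation between Kendall's tau and rho.
# Assumption: addCorFlex may or may not use exactly this formula.
tau_to_rho <- function(tau) {
  sin(pi * tau / 2)
}

# A usable correlation matrix is symmetric, has a unit diagonal,
# and has no negative eigenvalues (positive semi-definite).
is_valid_cor <- function(m, tol = 1e-8) {
  isSymmetric(m) &&
    all(diag(m) == 1) &&
    all(eigen(m, symmetric = TRUE, only.values = TRUE)$values >= -tol)
}

cs <- cs_matrix(4, 0.4)
ar1 <- ar1_matrix(4, 0.4)

print(cs)
print(ar1)
print(is_valid_cor(cs))
print(is_valid_cor(ar1))
```

A matrix that passes a check like is_valid_cor could be supplied as corMatrix in place of rho and corstr.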
Value
data.table with added column(s) of correlated data
Examples
defC <- defData(
  varname = "nInds", formula = 50, dist = "noZeroPoisson",
  id = "idClust"
)
dc <- genData(10, defC)
#### Normal only: add correlated cluster-level variables a, b, c, d
dc <- addCorData(dc,
  mu = c(0, 0, 0, 0), sigma = c(2, 2, 2, 2), rho = .2,
  corstr = "cs", cnames = c("a", "b", "c", "d"),
  idname = "idClust"
)
di <- genCluster(dc, "idClust", "nInds", "id")
defI <- defDataAdd(
  varname = "A", formula = "-1 + a", variance = 3,
  dist = "normal"
)
defI <- defDataAdd(defI,
  varname = "B", formula = "4.5 + b", variance = .5,
  dist = "normal"
)
defI <- defDataAdd(defI,
  varname = "C", formula = "5*c", variance = 3,
  dist = "normal"
)
defI <- defDataAdd(defI,
  varname = "D", formula = "1.6 + d", variance = 1,
  dist = "normal"
)
#### Generate new data
di <- addCorFlex(di, defI, rho = 0.4, corstr = "cs")
# Check correlations within each cluster
for (i in seq_len(nrow(dc))) {
  print(cor(di[idClust == i, list(A, B, C, D)]))
}
# Check correlations pooled across all clusters - should be weaker than within clusters
cor(di[, list(A, B, C, D)])