copula.sim {copulaSim} | R Documentation |
Generate simulated datasets from empirical data using the copula invariance property.
Description
Generates simulated datasets from empirical patient-level data by exploiting the copula invariance property: a copula is unchanged by strictly increasing transformations of the marginal variables.
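The idea can be illustrated with a minimal base-R sketch. It assumes a Gaussian copula and back-transforms through empirical quantiles; the helper name simulate_gaussian_copula is hypothetical, and nothing here should be read as the internal implementation of copula.sim.

```r
# Hypothetical helper: simulate rows that keep the marginals and the rank
# dependence of a numeric matrix `x` (a Gaussian-copula sketch).
simulate_gaussian_copula <- function(x, n_new) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  # 1. Probability integral transform: ranks rescaled into (0, 1).
  u <- apply(x, 2, rank) / (n + 1)

  # 2. Normal scores; their correlation matrix summarises the dependence.
  z <- qnorm(u)
  rho <- cor(z)

  # 3. Draw correlated standard normals using a Cholesky factor of rho.
  z_new <- matrix(rnorm(n_new * p), nrow = n_new) %*% chol(rho)

  # 4. Map back to (0, 1), then to the data scale through each column's
  #    empirical quantile function (type = 8 is an arbitrary choice here).
  u_new <- pnorm(z_new)
  x_new <- sapply(seq_len(p), function(j) {
    quantile(x[, j], probs = u_new[, j], type = 8, names = FALSE)
  })
  colnames(x_new) <- colnames(x)
  x_new
}

set.seed(1)
shared <- rnorm(200)
empirical <- cbind(a = exp(shared + rnorm(200, sd = 0.5)),  # skewed marginal
                   b = shared + rnorm(200, sd = 0.5))       # symmetric marginal
simulated <- simulate_gaussian_copula(empirical, n_new = 500)
```

Because steps 1 and 4 use only monotone transformations, the dependence estimated in step 2 carries over to the simulated data even though column a is strongly skewed; comparing cor(empirical, method = "spearman") with cor(simulated, method = "spearman") is one quick way to check this.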
Usage
copula.sim(
data.input,
id.vec,
arm.vec,
n.patient,
n.simulation,
seed = NULL,
validation.type = "none",
validation.sig.lvl = 0.05,
rmvnorm.matrix.decomp.method = "svd",
verbose = TRUE
)
Arguments
data.input |
The empirical patient-level data to be used to simulate new virtual patient data. |
id.vec |
The ID of each individual patient in the input data. |
arm.vec |
The column identifying the treatment arm in the clinical trial. |
n.patient |
The targeted number of patients in each simulated dataset. |
n.simulation |
The number of simulated datasets. |
seed |
The random seed. Default is NULL to use the current seed. |
validation.type |
A string to specify the hypothesis test used to detect the difference between input data and the simulated data. Default is "none". Possible methods are energy distance ("energy") and ball divergence ("ball"). The R packages "energy" and "Ball" are needed. |
validation.sig.lvl |
The significance level (alpha) for the hypothesis test. Default is 0.05. |
rmvnorm.matrix.decomp.method |
The matrix decomposition method used when generating multivariate normal samples. Default is "svd". |
verbose |
A logical value specifying whether to print progress messages during the simulation. Default is TRUE. |
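To make the role of validation.type and validation.sig.lvl more concrete, the following base-R sketch runs a permutation test built on the energy distance between two samples. The helper energy_perm_test and its n_perm argument are hypothetical; this is neither the implementation in the "energy" package nor necessarily what copula.sim runs internally.

```r
# Hypothetical helper: permutation test based on the energy distance
# between the rows of two numeric matrices `x` and `y`.
energy_perm_test <- function(x, y, n_perm = 199) {
  pooled <- rbind(as.matrix(x), as.matrix(y))
  n_x <- nrow(x)
  n_all <- nrow(pooled)
  d <- as.matrix(dist(pooled))  # all pairwise Euclidean distances

  # Energy statistic for a given choice of "first sample" row indices.
  stat <- function(idx) {
    other <- setdiff(seq_len(n_all), idx)
    2 * mean(d[idx, other]) - mean(d[idx, idx]) - mean(d[other, other])
  }

  observed <- stat(seq_len(n_x))
  permuted <- replicate(n_perm, stat(sample(n_all, n_x)))

  # Permutation p-value with the usual +1 correction.
  p_value <- (1 + sum(permuted >= observed)) / (n_perm + 1)
  list(statistic = observed, p.value = p_value)
}

set.seed(42)
reference <- matrix(rnorm(60 * 3), ncol = 3)
shifted <- matrix(rnorm(60 * 3, mean = 1.5), ncol = 3)
result <- energy_perm_test(reference, shifted)
reject <- result$p.value < 0.05  # compare with a chosen significance level
```

A small p-value indicates that the two samples are unlikely to share a distribution. When simulated data are validated against the empirical data, a p-value above the significance level means the test found no detectable difference.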
Value
A copula.sim object with four elements.
data.input: empirical data (wide-form)
data.input.long: empirical data (long-form)
data.transform: quantile transformation of data.input
data.simul: simulated data
Author(s)
Pei-Shan Yen, Xuemin Gu
References
Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris, 8, 229-231.
Nelsen, R. B. (2007). An introduction to copulas. Springer Science & Business Media.
Ross, S. M. (2013). Simulation. Academic Press.
Examples
library(copulaSim)
## Generate Empirical Data
# Assume the 2-arm, 5-dimensional empirical data follow a multivariate normal distribution.
library(mvtnorm)
arm1 <- rmvnorm(n = 40, mean = rep(10, 5), sigma = diag(5) + 0.5)
arm2 <- rmvnorm(n = 40, mean = rep(12, 5), sigma = diag(5) + 0.5)
test_data <- as.data.frame(cbind(1:80, rep(1:2, each = 40), rbind(arm1, arm2)))
colnames(test_data) <- c("id","arm",paste0("time_", 1:5))
## Generate 100 simulated datasets
copula.sim(data.input = test_data[, -c(1, 2)], id.vec = test_data$id, arm.vec = test_data$arm,
           n.patient = 100, n.simulation = 100, seed = 2022)