copula.sim {copulaSim}    R Documentation

Generate simulated datasets from empirical data using the copula invariance property

Description

Generates simulated datasets from empirical data by using the copula invariance property.
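The invariance property referred to here is that a copula is unchanged when the margins are passed through strictly increasing transformations, so rank-based dependence measures are unchanged as well. A minimal base-R sketch of that fact follows; the variables x and y and the chosen transformations are made up for illustration and are not part of the package.

```r
# Illustration only: x, y and the transformations are hypothetical.
set.seed(1)
x <- rnorm(200)
y <- 0.6 * x + rnorm(200, sd = 0.8)

# Strictly increasing transformations change the marginal distributions ...
x_new <- exp(x)
y_new <- y^3

# ... but leave the ranks, and hence the copula, untouched.
rho_before <- cor(x, y, method = "spearman")
rho_after  <- cor(x_new, y_new, method = "spearman")
same_ranks <- identical(rank(x), rank(x_new)) &&
  identical(rank(y), rank(y_new))
```

Because the ranks are identical before and after, the Spearman correlations rho_before and rho_after coincide even though the margins look very different.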

Usage

copula.sim(
  data.input,
  id.vec,
  arm.vec,
  n.patient,
  n.simulation,
  seed = NULL,
  validation.type = "none",
  validation.sig.lvl = 0.05,
  rmvnorm.matrix.decomp.method = "svd",
  verbose = TRUE
)

Arguments

data.input

The empirical patient-level data to be used to simulate new virtual patient data.

id.vec

The ID of each individual patient in the input data.

arm.vec

The column identifying the treatment arm in the clinical trial.

n.patient

The targeted number of patients in each simulated dataset.

n.simulation

The number of simulated datasets.

seed

The random seed. The default, NULL, uses the current seed.

validation.type

A string specifying the hypothesis test used to detect differences between the input data and the simulated data. Default is "none". The available tests are energy distance ("energy") and ball divergence ("ball"), which require the R packages "energy" and "Ball", respectively.
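To make the idea of an energy-distance test concrete, here is a hedged base-R sketch of a two-sample permutation test on the energy statistic. The helpers energy_stat() and energy_perm_test() are hypothetical names written for this illustration; they are not functions from the "energy", "Ball" or copulaSim packages, and the package may compute its test differently.

```r
# Sketch of a two-sample energy-distance permutation test in base R.
# energy_stat() and energy_perm_test() are hypothetical helpers.
energy_stat <- function(d, idx_x, idx_y) {
  n <- length(idx_x)
  m <- length(idx_y)
  between  <- mean(d[idx_x, idx_y])   # mean distance across the two samples
  within_x <- mean(d[idx_x, idx_x])   # mean distance inside sample x
  within_y <- mean(d[idx_y, idx_y])   # mean distance inside sample y
  (n * m / (n + m)) * (2 * between - within_x - within_y)
}

energy_perm_test <- function(x, y, n_perm = 199) {
  pooled <- rbind(x, y)
  d <- as.matrix(dist(pooled))        # pairwise Euclidean distances
  n <- nrow(x)
  N <- nrow(pooled)
  observed <- energy_stat(d, 1:n, (n + 1):N)
  # Re-compute the statistic after randomly relabelling the pooled rows.
  permuted <- replicate(n_perm, {
    idx <- sample(N)
    energy_stat(d, idx[1:n], idx[(n + 1):N])
  })
  p_value <- (1 + sum(permuted >= observed)) / (n_perm + 1)
  list(statistic = observed, p.value = p_value)
}

set.seed(2022)
empirical <- matrix(rnorm(30 * 3), ncol = 3)
similar   <- matrix(rnorm(30 * 3), ncol = 3)            # same distribution
shifted   <- matrix(rnorm(30 * 3, mean = 3), ncol = 3)  # clearly different

res_similar <- energy_perm_test(empirical, similar)
res_shifted <- energy_perm_test(empirical, shifted)
```

In this sketch a p-value below the chosen significance level (compare validation.sig.lvl) would indicate a detectable difference between the two samples.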

validation.sig.lvl

The significance level (alpha) for the hypothesis test.

rmvnorm.matrix.decomp.method

The matrix decomposition method used by the function rmvnorm. Default is "svd".
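Sampling from a multivariate normal distribution needs a matrix "square root" of the covariance matrix, and the decomposition method decides how that root is obtained. The base-R sketch below contrasts an SVD-based root with a Cholesky root. The helpers sigma_root() and draw_mvn() are hypothetical and are not the mvtnorm source. The rmvnorm function in mvtnorm documents the methods "eigen", "svd" and "chol"; whether copula.sim accepts all three is an assumption here.

```r
# Illustration only: sigma_root() and draw_mvn() are hypothetical helpers.
sigma_root <- function(sigma, method = c("svd", "chol")) {
  method <- match.arg(method)
  if (method == "svd") {
    s <- svd(sigma)
    # For a symmetric positive semi-definite sigma, this root R
    # satisfies t(R) %*% R == sigma.
    t(s$v %*% (t(s$u) * sqrt(pmax(s$d, 0))))
  } else {
    chol(sigma)  # upper-triangular R with t(R) %*% R == sigma
  }
}

draw_mvn <- function(n, mean, sigma, method = "svd") {
  # Independent standard normals, coloured by the root, then shifted.
  z <- matrix(rnorm(n * length(mean)), nrow = n)
  sweep(z %*% sigma_root(sigma, method), 2, mean, "+")
}

sigma     <- diag(5) + 0.5
root_svd  <- sigma_root(sigma, "svd")
root_chol <- sigma_root(sigma, "chol")

set.seed(2022)
draws <- draw_mvn(1000, mean = rep(10, 5), sigma = sigma, method = "svd")
```

Both roots reproduce the same covariance matrix. The SVD route also tolerates covariance matrices that are only positive semi-definite, where chol() would stop with an error.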

verbose

A logical value specifying whether to print progress messages during the simulation.

Value

A copula.sim object with four elements.

  1. data.input: empirical data (wide-form)

  2. data.input.long: empirical data (long-form)

  3. data.transform: quantile transformation of data.input

  4. data.simul: simulated data
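How a quantile transformation of the input can lead to simulated data is sketched below with a minimal Gaussian-copula example in base R. This is an illustration under stated assumptions, not the copulaSim implementation: every object name is hypothetical, and the choice of a Gaussian copula with rank-based normal scores is assumed rather than taken from the package source.

```r
# Minimal Gaussian-copula sketch in base R; not the copulaSim implementation.
# All object names below are hypothetical.
set.seed(2022)
n_emp <- 40
empirical <- cbind(time_1 = rexp(n_emp, rate = 0.1),
                   time_2 = rnorm(n_emp, mean = 10))
empirical[, "time_2"] <- empirical[, "time_2"] + 0.05 * empirical[, "time_1"]

# 1. Quantile transformation: ranks -> (0, 1) -> standard normal scores.
to_normal_scores <- function(v) qnorm(rank(v) / (length(v) + 1))
transformed <- apply(empirical, 2, to_normal_scores)

# 2. Simulate correlated normal scores for the virtual patients.
n_sim <- 100
z <- matrix(rnorm(n_sim * ncol(empirical)), nrow = n_sim) %*%
  chol(cor(transformed))

# 3. Back-transform each column through the empirical quantile function.
simulated <- sapply(seq_len(ncol(empirical)), function(j) {
  quantile(empirical[, j], probs = pnorm(z[, j]), names = FALSE, type = 8)
})
colnames(simulated) <- colnames(empirical)
```

In this sketch, transformed plays a role loosely analogous to data.transform and simulated to data.simul. The simulated values stay within the observed range of each column because they are drawn from the empirical quantile function.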

Author(s)

Pei-Shan Yen, Xuemin Gu

References

Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris, 8, 229-231.

Nelsen, R. B. (2007). An introduction to copulas. Springer Science & Business Media.

Ross, S. M. (2013). Simulation. Academic Press.

Examples


library(copulaSim)

## Generate Empirical Data
 # Assume the 2-arm, 5-dimensional empirical data follow a multivariate normal distribution.
library(mvtnorm)
arm1 <- rmvnorm(n = 40, mean = rep(10, 5), sigma = diag(5) + 0.5)
arm2 <- rmvnorm(n = 40, mean = rep(12, 5), sigma = diag(5) + 0.5)
test_data <- as.data.frame(cbind(1:80, rep(1:2, each = 40), rbind(arm1, arm2)))
colnames(test_data) <- c("id","arm",paste0("time_", 1:5))

## Generate 100 simulated datasets
copula.sim(data.input = test_data[, -c(1, 2)], id.vec = test_data$id, arm.vec = test_data$arm,
           n.patient = 100, n.simulation = 100, seed = 2022)

[Package copulaSim version 0.0.1 Index]