datagenerator {metafuse} | R Documentation |
Simulate data
Description
Simulate a dataset with data from K different sources, for demonstration of metafuse.
Usage
datagenerator(n, beta0, family, seed = NA)
Arguments
n |
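a vector of length K giving the sample size of each dataset; may also be a scalar, in which case every dataset has the same sample size |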
beta0 |
a coefficient matrix of dimension K * p, with one row per dataset and one column per covariate (including the intercept) |
family |
the type of the response vector, one of "gaussian", "binomial", "poisson" or "cox" |
seed |
the random seed for data generation, default is NA |
Details
These datasets are artificial and are used to demonstrate the features of metafuse. When family="cox", the response consists of two vectors: a time-to-event variable time and a censoring indicator status.
Value
Returns a data frame with n*K rows (if n is a scalar), or sum(n) rows (if n is a K-element vector). The data frame contains columns "y", "x1", ..., "x_p-1" and "group" if family is "gaussian", "binomial" or "poisson"; or columns "time", "status", "x1", ..., "x_p-1" and "group" if family="cox".
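The row count and column layout described above can be sketched in base R without calling the package. The helper expected_layout() below is hypothetical (it is not part of metafuse), and it assumes the covariate columns are named "x1", "x2", and so on, which is one reading of "x1", ..., "x_p-1".

```r
# Hypothetical helper, not part of metafuse: predicts the shape of the
# data frame described in the Value section from the arguments alone.
expected_layout <- function(n, beta0, family) {
  K <- nrow(beta0)  # number of datasets
  p <- ncol(beta0)  # number of coefficients, including the intercept

  # Scalar n is shared by all K datasets; otherwise n lists each sample size.
  n_rows <- if (length(n) == 1) n * K else sum(n)

  # Cox data carry two response columns instead of a single "y".
  response <- if (family == "cox") c("time", "status") else "y"

  # Assumed naming: one column per non-intercept covariate.
  covariates <- if (p > 1) paste0("x", seq_len(p - 1)) else character(0)

  list(rows = n_rows, columns = c(response, covariates, "group"))
}

beta0 <- matrix(0, nrow = 10, ncol = 3)

scalar_n <- expected_layout(200, beta0, "gaussian")
vector_n <- expected_layout(seq(100, 1000, by = 100), beta0, "cox")

scalar_n$rows     # 200 * 10
vector_n$rows     # sum of the ten sample sizes
vector_n$columns
```

Comparing these predictions with nrow() and names() of the data frame returned by datagenerator is a quick sanity check after simulating.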
Examples
########### generate data ###########
n <- 200 # sample size in each dataset (can also be a K-element vector)
K <- 10 # number of datasets for data integration
p <- 3 # number of covariates in X (including the intercept)
# the coefficient matrix of dimension K * p, used to specify the heterogeneous pattern
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,  # beta_0 of intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,  # beta_1 of X_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0), # beta_2 of X_2
                K, p)
# generate a data set, family=c("gaussian", "binomial", "poisson", "cox")
data <- datagenerator(n=n, beta0=beta0, family="gaussian", seed=123)
names(data)
# if family="cox", returned dataset contains columns "time" and "status" instead of "y"
data <- datagenerator(n=n, beta0=beta0, family="cox", seed=123)
names(data)
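
Because matrix() fills column by column, each commented line in the beta0 definition above becomes one column of the K * p matrix, and each row holds the coefficients of one dataset. The following base R sketch makes that layout explicit; the row and column labels are added here purely for illustration and are not required by datagenerator.

```r
K <- 10  # number of datasets
p <- 3   # number of coefficients, including the intercept

# Same values as in the example; labels are illustrative assumptions.
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,  # intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,  # X_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0), # X_2
                K, p,
                dimnames = list(paste0("dataset", seq_len(K)),
                                c("intercept", "x1", "x2")))

# Distinct coefficient values per covariate show the grouping pattern:
# x1 splits the datasets into two clusters, x2 into three.
x1_levels <- unique(beta0[, "x1"])
x2_levels <- unique(beta0[, "x2"])

# Number of datasets sharing each value of the x2 coefficient.
x2_cluster_sizes <- table(beta0[, "x2"])

beta0
```

Inspecting the matrix this way before simulating helps confirm that the intended heterogeneity pattern was entered in the right orientation.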