simulate_tlm {SeBR} | R Documentation
Simulate a transformed linear model
Description
Generate training data (X, y) and testing data (X_test, y_test) for a transformed linear model. The covariates are correlated Gaussian variables. Half of the true regression coefficients are zero and the other half are one. There are multiple options for the transformation, which define the support of the data (see below).
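The latent linear model described above can be sketched in base R. This is only an illustration under stated assumptions, not the SeBR implementation: the AR(1)-style correlation `rho`, the ordering of the nonzero coefficients, and the standard normal errors are hypothetical choices that the description does not specify.

```r
# Hypothetical sketch of the latent linear model; NOT the SeBR source code.
set.seed(123)
n <- 100    # number of training observations
p <- 6      # number of covariates
rho <- 0.5  # assumed correlation parameter (illustrative only)

# Assumed correlation matrix with entries rho^|i - j|
Sigma <- rho^abs(outer(1:p, 1:p, "-"))

# Correlated Gaussian covariates: independent normals times the Cholesky factor
X <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% chol(Sigma)

# Half of the coefficients are one and the other half are zero
# (the ordering here is an assumption)
beta_true <- c(rep(1, ceiling(p / 2)), rep(0, floor(p / 2)))

# Latent data on the transformed scale, with assumed standard normal errors
z <- as.numeric(X %*% beta_true + rnorm(n))
```

In a transformed linear model, the observed response would then be obtained by applying the inverse of the chosen transformation to `z`.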
Usage
simulate_tlm(
n,
p,
g_type = "beta",
n_test = 1000,
heterosked = FALSE,
lambda = 1
)
Arguments
n: number of observations in the training data
p: number of covariates
g_type: type of transformation; must be one of beta, step, or box-cox
n_test: number of observations in the testing data
heterosked: logical; if TRUE, simulate the latent data with heteroskedasticity
lambda: Box-Cox parameter (only applies for g_type = 'box-cox')
Details
The transformations vary in complexity and support for the observed data, and include the following options: beta yields marginally Beta(0.1, 0.5) data supported on [0,1]; step generates a locally-linear inverse transformation and produces positive data; and box-cox refers to the signed Box-Cox family indexed by lambda, which generates real-valued data with examples including identity, square-root, and log transformations.
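To make the box-cox option concrete, here is a minimal sketch of a signed Box-Cox transformation and its inverse, assuming the common form (sign(t)|t|^lambda - 1)/lambda with the log as the lambda = 0 case. The helper names `signed_box_cox` and `signed_box_cox_inv` are hypothetical and are not taken from the package; the exact parameterization used by SeBR may differ.

```r
# Hypothetical helpers illustrating a signed Box-Cox family; not the SeBR source.
signed_box_cox <- function(t, lambda) {
  if (lambda == 0) {
    log(t)  # limiting case as lambda -> 0; requires t > 0
  } else {
    (sign(t) * abs(t)^lambda - 1) / lambda
  }
}

# Inverse transformation: maps the transformed scale back to the data scale
signed_box_cox_inv <- function(s, lambda) {
  if (lambda == 0) {
    exp(s)
  } else {
    u <- lambda * s + 1
    sign(u) * abs(u)^(1 / lambda)
  }
}

t_grid <- c(-2, -0.5, 0.5, 2)
g_identity <- signed_box_cox(t_grid, lambda = 1)    # identity up to a shift of 1
g_sqrt     <- signed_box_cox(t_grid, lambda = 0.5)  # signed square-root scale
g_log      <- signed_box_cox(c(0.5, 2), lambda = 0) # log; positive inputs only

# Round trip: the inverse recovers the original values
t_back <- signed_box_cox_inv(g_sqrt, lambda = 0.5)
```

Under this form, lambda = 1, 0.5, and 0 correspond to the identity, square-root, and log examples mentioned above, up to shifting and scaling.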
Value
a list with the following elements:
- y: the response variable in the training data
- X: the covariates in the training data
- y_test: the response variable in the testing data
- X_test: the covariates in the testing data
- beta_true: the true regression coefficients
- g_true: the true transformation, evaluated at y
Examples
# Simulate data:
dat = simulate_tlm(n = 100, p = 5, g_type = 'beta')
names(dat) # what is returned
hist(dat$y, breaks = 25) # marginal distribution