simulate_tlm {SeBR}    R Documentation

Simulate a transformed linear model

Description

Generate training data (X, y) and testing data (X_test, y_test) for a transformed linear model. The covariates are correlated Gaussian variables. Half of the true regression coefficients are zero and the other half are one. There are multiple options for the transformation, which define the support of the data (see below).
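The data-generating process described above can be sketched in base R. This is an illustrative sketch only, not the SeBR source: the AR(1)-style correlation with rho = 0.5, the ordering of the nonzero coefficients, the unit error variance, and the choice g = log are all assumptions made for the example.

```r
# Hypothetical sketch of a transformed linear model; not the SeBR implementation.
set.seed(1)
n <- 100
p <- 6
rho <- 0.5  # assumed correlation decay between covariates

# Correlated Gaussian covariates: rows of Z %*% chol(Sigma) have covariance Sigma
Sigma <- rho^abs(outer(1:p, 1:p, "-"))
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Half of the coefficients are one and half are zero (ordering is assumed)
beta_true <- c(rep(1, p / 2), rep(0, p / 2))

# Latent Gaussian linear model: z = X beta + error (unit variance assumed)
z <- as.numeric(X %*% beta_true + rnorm(n))

# Observed data: y = g^{-1}(z); with g = log (assumed), g^{-1} = exp
y <- exp(z)
```

Here the transformation alone fixes the support of y (positive values in this case), which is why the g_type options produce data on different supports.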

Usage

simulate_tlm(
  n,
  p,
  g_type = "beta",
  n_test = 1000,
  heterosked = FALSE,
  lambda = 1
)

Arguments

n

number of observations in the training data

p

number of covariates

g_type

type of transformation; must be one of beta, step, or box-cox

n_test

number of observations in the testing data

heterosked

logical; if TRUE, simulate the latent data with heteroskedasticity

lambda

Box-Cox parameter (only applies for g_type = 'box-cox')

Details

The transformation determines both the complexity of the model and the support of the observed data. The options are: beta, which yields marginally Beta(0.1, 0.5) data supported on [0,1]; step, which uses a locally linear inverse transformation and produces positive data; and box-cox, the signed Box-Cox family indexed by lambda, which produces real-valued data and includes the identity, square-root, and log transformations as examples.
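To make the role of lambda concrete, here is a minimal sketch of a signed Box-Cox transformation and its inverse. The function names signed_box_cox and signed_box_cox_inv are hypothetical, and the formula is the commonly used signed form; it is assumed, not confirmed, to match what the package uses internally.

```r
# Hypothetical helpers; a sketch of the signed Box-Cox family (assumed form).
signed_box_cox <- function(t, lambda) {
  if (lambda == 0) {
    return(log(t))  # log case; requires t > 0
  }
  (sign(t) * abs(t)^lambda - 1) / lambda
}

# Inverse transformation: maps latent values back to the observed scale
signed_box_cox_inv <- function(s, lambda) {
  if (lambda == 0) {
    return(exp(s))
  }
  u <- lambda * s + 1
  sign(u) * abs(u)^(1 / lambda)
}

t_vals <- c(-2, -0.5, 0.5, 3)
signed_box_cox(t_vals, lambda = 1)    # identity up to a shift: t - 1
signed_box_cox(t_vals, lambda = 0.5)  # signed square-root, shifted and scaled
```

Under this form, lambda = 1 is the identity up to a shift, lambda = 0.5 is a signed square-root, and lambda = 0 is the log.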

Value

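a list containing the simulated training data (X, y) and testing data (X_test, y_test); call names() on the result to see every element.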

Examples

# Simulate data:
dat = simulate_tlm(n = 100, p = 5, g_type = 'beta')
names(dat) # what is returned
hist(dat$y, breaks = 25) # marginal distribution
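The remaining arguments can be exercised in the same way. The following sketch uses only the arguments documented in Usage and assumes the SeBR package is installed; the particular values of n, p, and lambda are arbitrary choices for illustration.

```r
library(SeBR)

# Box-Cox transformation with lambda = 0.5 (square-root case)
# and heteroskedastic latent data
dat_bc <- simulate_tlm(n = 200, p = 10, g_type = 'box-cox',
                       heterosked = TRUE, lambda = 0.5)
names(dat_bc)                 # what is returned
hist(dat_bc$y, breaks = 25)   # marginal distribution of the response
```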


[Package SeBR version 1.0.0 Index]