gen_toy_data {sgs} | R Documentation |
Generate toy data.
Description
Generates different types of datasets, which can then be fitted using sparse-group SLOPE.
Usage
gen_toy_data(
  p,
  n,
  rho = 0,
  seed_id = 2,
  grouped = TRUE,
  groups,
  noise_level = 1,
  group_sparsity = 0.1,
  var_sparsity = 0.5,
  orthogonal = FALSE,
  data_mean = 0,
  data_sd = 1,
  signal_mean = 0,
  signal_sd = sqrt(10)
)
Arguments
p |
The number of input variables. |
n |
The number of observations. |
rho |
Correlation coefficient. Must be in range |
seed_id |
Seed to be used to generate the data matrix |
grouped |
A logical flag indicating whether grouped data is required. |
groups |
If |
noise_level |
Defines the level of noise ( |
group_sparsity |
Defines the level of group sparsity. Must be in the range |
var_sparsity |
Defines the level of variable sparsity. Must be in the range |
orthogonal |
Logical flag as to whether the input matrix should be orthogonal. |
data_mean |
Defines the mean of input predictors. |
data_sd |
Defines the standard deviation of input predictors. |
signal_mean |
Defines the mean of the signal ( |
signal_sd |
Defines the standard deviation of the signal ( |
Details
The data are generated under a Gaussian linear model. The generated data can be grouped, and sparsity can be imposed at the group level, the variable level, or both.
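To make the idea concrete, the following base-R sketch builds a small dataset by hand from a Gaussian linear model with sparsity at both levels. It is an illustration only and is not the internal implementation of gen_toy_data(); the sizes, the number of active groups, and the names active_groups, candidates and active_vars are assumptions chosen for the example.

```r
# Illustrative sketch only: NOT the internals of gen_toy_data().
set.seed(2)
n <- 50                          # observations (assumed size)
p <- 12                          # input variables (assumed size)
groups <- rep(1:4, each = 3)     # 4 groups of 3 variables each

# Input matrix with independent Gaussian entries (mean 0, sd 1).
X <- matrix(rnorm(n * p, mean = 0, sd = 1), nrow = n, ncol = p)

# Group-level sparsity: choose which groups carry any signal.
active_groups <- sample(unique(groups), size = 2)

# Variable-level sparsity: keep half of the variables in the active groups.
beta <- numeric(p)
candidates <- which(groups %in% active_groups)
active_vars <- sample(candidates, size = ceiling(0.5 * length(candidates)))
beta[active_vars] <- rnorm(length(active_vars), mean = 0, sd = sqrt(10))

# Gaussian linear model: response = X %*% beta + Gaussian noise.
y <- as.numeric(X %*% beta + rnorm(n, mean = 0, sd = 1))
```

In this sketch every non-zero coefficient belongs to an active group, while some variables inside active groups remain zero, which is the combined group and variable sparsity pattern described above.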
Value
A list containing:
y |
The response vector. |
X |
The input matrix. |
true_beta |
The true values of |
true_grp_id |
Indices of which groups are non-zero in |
Examples
# specify a grouping structure
groups = c(rep(1:20, each = 3),
           rep(21:40, each = 4),
           rep(41:60, each = 5),
           rep(61:80, each = 6),
           rep(81:100, each = 7))
# generate data
data = gen_toy_data(p = 500, n = 400, groups = groups, seed_id = 3)
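As a possible follow-up, the returned list can be inspected through the components named in the Value section (y, X, true_beta, true_grp_id). This is a minimal sketch that assumes the sgs package is installed; it repeats the call above so that it runs on its own, and it makes no claim about the exact values produced.

```r
library(sgs)

# Same grouping structure as in the example: 100 groups covering 500 variables.
groups = c(rep(1:20, each = 3),
           rep(21:40, each = 4),
           rep(41:60, each = 5),
           rep(61:80, each = 6),
           rep(81:100, each = 7))
data = gen_toy_data(p = 500, n = 400, groups = groups, seed_id = 3)

# Top-level structure of the returned list.
str(data, max.level = 1)

# Size of the input matrix and the response vector.
dim(data$X)
length(data$y)

# Number of non-zero true coefficients, and the groups flagged as non-zero.
sum(data$true_beta != 0)
data$true_grp_id
```

Comparing true_grp_id and the non-zero entries of true_beta against the estimates from a fitted model gives a simple check of how well the group and variable support were recovered.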