GenSyntheticHighCorr {L0Learn} | R Documentation |
Generate Exponentially Correlated Synthetic Data
Description
Generates a synthetic dataset as follows: 1) Generate a correlation matrix, SIG, where SIG[i, j] = A^|i-j| and A is base_cor. 2) Draw X from a multivariate normal distribution with mean mu and covariance SIG. 3) Generate a coefficient vector B with roughly every (p/k)-th entry set to 1 and the rest set to 0. 4) Sample every element of the noise vector e from N(0,1). 5) Set y = XB + b0 + e.
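The steps above can be sketched in base R as follows. This is an illustrative sketch only, not the package source: the helper name sketch_high_corr is hypothetical, the multivariate normal draw uses a Cholesky factor as one possible method, and the rho and snr arguments are left out to keep the example short. It assumes |base_cor| < 1 so that SIG is positive definite.

```r
# Hypothetical helper illustrating the documented steps; not the L0Learn source.
sketch_high_corr <- function(n, p, k, seed, b0 = 0, mu = 0, base_cor = 0.9) {
  set.seed(seed)

  # 1) Correlation matrix with SIG[i, j] = base_cor^|i - j|
  idx <- seq_len(p)
  SIG <- base_cor^abs(outer(idx, idx, "-"))

  # 2) Draw X ~ N(mu, SIG). chol() returns upper-triangular R with
  #    t(R) %*% R = SIG, so the rows of Z %*% R have covariance SIG.
  R <- chol(SIG)
  Z <- matrix(rnorm(n * p), nrow = n, ncol = p)
  X <- sweep(Z %*% R, 2, rep(mu, length.out = p), "+")

  # 3) Coefficients: a 1 roughly every p/k entries, zeros elsewhere
  B <- rep(0, p)
  B[seq(from = 1, to = p, by = max(1L, as.integer(p / k)))] <- 1

  # 4) Standard normal noise
  e <- rnorm(n)

  # 5) Response
  y <- as.vector(X %*% B) + b0 + e

  list(X = X, y = y, B = B, e = e, b0 = b0, SIG = SIG)
}

d <- sketch_high_corr(n = 50, p = 20, k = 5, seed = 1)
which(d$B != 0)  # positions of the non-zero coefficients
```

Because the step size is the integer part of p/k, the number of non-zero entries can differ slightly from k when k does not divide p, which matches the "roughly" in step 3.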
Usage
GenSyntheticHighCorr(
n,
p,
k,
seed,
rho = 0,
b0 = 0,
snr = 1,
mu = 0,
base_cor = 0.9
)
Arguments
n |
Number of samples |
p |
Number of features |
k |
Number of non-zeros in true vector of coefficients |
seed |
The seed used for randomly generating the data |
rho |
The threshold for setting values to 0: if |X(i, j)| < rho, then X(i, j) <- 0 |
b0 |
Intercept value added to y. |
snr |
Desired signal-to-noise ratio, defined as SNR = Var(XB)/Var(e). This sets the magnitude of the error term e. |
mu |
The mean used when drawing from the multivariate normal distribution. A scalar or a vector of length p. |
base_cor |
The base correlation A, where SIG[i, j] = A^|i-j|. |
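To make the snr definition concrete, the sketch below shows one way noise could be scaled so that Var(XB)/Var(e) equals a target value. The helper name noise_for_snr is hypothetical and the rescaling by the sample standard deviation is an assumption made for illustration; the package may scale its noise differently.

```r
# Hypothetical helper: draw noise e so that var(signal) / var(e) equals snr.
noise_for_snr <- function(signal, snr) {
  # An infinite SNR means no noise at all
  if (is.infinite(snr)) {
    return(rep(0, length(signal)))
  }
  e <- rnorm(length(signal))
  # Rescale so the sample variance of e is exactly var(signal) / snr
  e * sqrt(var(signal) / snr) / sd(e)
}

set.seed(42)
signal <- rnorm(200, sd = 3)          # stands in for XB
e <- noise_for_snr(signal, snr = 4)
var(signal) / var(e)                  # recovers the requested SNR
```

A larger snr therefore shrinks the noise relative to the signal, and snr = 1 gives noise with the same variance as XB.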
Value
A list containing: the data matrix X, the response vector y, the coefficients B, the error vector e, the intercept term b0.
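Examples
A short usage sketch, assuming the L0Learn package is installed and that the returned list elements are named X, y, B, e and b0 as listed under Value. The argument values are arbitrary choices for illustration.

```r
# Runs only when the L0Learn package is available.
if (requireNamespace("L0Learn", quietly = TRUE)) {
  data <- L0Learn::GenSyntheticHighCorr(
    n = 100, p = 50, k = 5, seed = 1, base_cor = 0.9
  )

  print(dim(data$X))         # n rows, p columns
  print(length(data$y))      # one response per sample
  print(which(data$B != 0))  # positions of the non-zero coefficients
}
```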