gen.data {bestridge} | R Documentation |

Generate data for simulations under the generalized linear model and Cox model.

    gen.data(
      n,
      p,
      k = NULL,
      rho = 0,
      family = c("gaussian", "binomial", "poisson", "cox"),
      beta = NULL,
      cortype = 1,
      snr = 10,
      censoring = TRUE,
      c = 1,
      scal,
      sigma = 1,
      seed = 1
    )

`n` |
The number of observations. |

`p` |
The number of predictors of interest. |

`k` |
The number of nonzero coefficients in the underlying regression
model. Can be omitted if `beta` is supplied. |

`rho` |
A parameter used to characterize the pairwise correlation in
predictors. Default is `rho = 0`. |

`family` |
The distribution of the simulated data. |

`beta` |
The coefficient values in the underlying regression model. |

`cortype` |
The correlation structure. |

`snr` |
A numerical value controlling the signal-to-noise ratio (SNR). The SNR is defined
as the variance of |

`censoring` |
Whether the data are censored or not. Valid only for `family = "cox"`. Default is `TRUE`. |

`c` |
The censoring rate. Default is `c = 1`. |

`scal` |
A parameter used in generating the survival time based on the Weibull distribution. Only used for the "cox" family. |

`sigma` |
A parameter used to control the signal-to-noise ratio. For linear regression,
it is the error variance |

`seed` |
Seed to be used in generating the random numbers. |

We generate an *n \times p* design matrix *X* under one of three correlation
structures. For the exponential and constant structures, *X* is a random
Gaussian matrix with mean 0. For the exponential structure, the covariance matrix
has *(i,j)* entry equal to *ρ^{|i-j|}*. For the constant structure,
the *(i,j)* entry of the covariance matrix is *ρ* for every *i \neq j*
and 1 elsewhere. For the moving average structure,
we first generate an *n \times p* random Gaussian matrix *\bar{X}*
whose entries are i.i.d. *N(0,1)* and then normalize its columns
to length *\sqrt{n}*. The design matrix *X* is then generated with
*X_j = \bar{X}_j + ρ(\bar{X}_{j+1}+\bar{X}_{j-1})* for *j=2,…,p-1*. Here *ρ* is the `rho` argument.
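The three structures described above can be sketched in base R as follows. This is only an illustration of the formulas, not the package's internal code: the helper name `make_sigma`, the use of a Cholesky factor to induce the covariance, and leaving the first and last columns unchanged in the moving average case are all assumptions.

```r
# Hypothetical helper: covariance matrix for the exponential or constant structure.
make_sigma <- function(p, rho, structure = c("exponential", "constant")) {
  structure <- match.arg(structure)
  if (structure == "exponential") {
    # (i, j) entry is rho^|i - j|
    rho^abs(outer(seq_len(p), seq_len(p), "-"))
  } else {
    # rho off the diagonal, 1 on the diagonal
    Sigma <- matrix(rho, p, p)
    diag(Sigma) <- 1
    Sigma
  }
}

set.seed(1)
n <- 200
p <- 5
rho <- 0.4

Sigma <- make_sigma(p, rho, "exponential")
# Rows of X are N(0, Sigma): multiply i.i.d. N(0, 1) draws by the
# upper-triangular Cholesky factor R, where t(R) %*% R equals Sigma.
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Moving average structure: normalize columns to length sqrt(n), then mix neighbours.
Xbar <- matrix(rnorm(n * p), n, p)
Xbar <- sweep(Xbar, 2, sqrt(colSums(Xbar^2) / n), "/")
X_ma <- Xbar  # assumption: columns 1 and p are left as they are
j <- 2:(p - 1)
X_ma[, j] <- Xbar[, j] + rho * (Xbar[, j + 1] + Xbar[, j - 1])
```

The Cholesky approach avoids any dependency beyond base R; a multivariate normal sampler from another package would serve equally well.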

For `family = "gaussian"`, the data model is

*Y = X β + ε.*

The underlying regression coefficient *β* is drawn from the uniform distribution on *[m, 100m]*, where *m = 5 \sqrt{2 \log(p)/n}*.
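A minimal sketch of the Gaussian model, under stated assumptions: the design matrix is plain i.i.d. Gaussian, the positions of the nonzero coefficients are sampled uniformly at random, and the noise standard deviation `sigma` is fixed by hand rather than derived from `snr`. None of these choices is confirmed by this page as what `gen.data` does internally.

```r
set.seed(1)
n <- 200
p <- 20
k <- 5
sigma <- 1  # assumed noise standard deviation, set by hand for illustration

X <- matrix(rnorm(n * p), n, p)
m <- 5 * sqrt(2 * log(p) / n)

# k nonzero coefficients drawn uniformly from [m, 100m]
beta <- numeric(p)
nonzero <- sample(p, k)  # assumption: support chosen uniformly at random
beta[nonzero] <- runif(k, m, 100 * m)

# Y = X beta + epsilon, with epsilon ~ N(0, sigma^2)
y <- as.numeric(X %*% beta + rnorm(n, sd = sigma))
```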

For `family = "binomial"`, the data model is

*Prob(Y = 1) = \exp(X β + ε)/(1 + \exp(X β + ε)).*

The underlying regression coefficient *β* is drawn from the uniform distribution on *[2m, 10m]*, where *m = 5σ \sqrt{2 \log(p)/n}*.
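The logistic model can be sketched the same way. As before, the i.i.d. Gaussian design, the randomly chosen support, and the hand-set `sigma` are assumptions made for illustration only.

```r
set.seed(1)
n <- 200
p <- 20
k <- 5
sigma <- 1  # assumed value; it enters both m and the noise term

X <- matrix(rnorm(n * p), n, p)
m <- 5 * sigma * sqrt(2 * log(p) / n)

beta <- numeric(p)
nonzero <- sample(p, k)  # assumption: support chosen uniformly at random
beta[nonzero] <- runif(k, 2 * m, 10 * m)

# Linear predictor with Gaussian noise, then the logistic link:
# plogis(eta) computes exp(eta) / (1 + exp(eta)) in a numerically stable way.
eta <- as.numeric(X %*% beta + rnorm(n, sd = sigma))
prob <- plogis(eta)
y <- rbinom(n, size = 1, prob = prob)
```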

For `family = "poisson"`, the data is modeled to have an exponential distribution:

*Y = Exp(\exp(X β + ε)).*

For `family = "cox"`, the data model is

*T = (-\log(S(t))/\exp(X β))^{1/scal}.*

The censoring time *C* is generated from the uniform distribution on *[0, c]*,
and we define the censoring status as *δ = I\{T ≤ C\}* and the observed time as *R = \min\{T, C\}*.
The underlying regression coefficient *β* is drawn from the uniform distribution on *[2m, 10m]*, where *m = 5σ \sqrt{2 \log(p)/n}*.
In the above models, *ε \sim N(0, σ^2)*, where *σ^2* is determined by `snr`.
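The survival-time and censoring steps of the Cox model can be sketched as below. This is a hedged illustration: replacing *S(t)* by a Uniform(0, 1) draw (inverse-transform sampling) is an assumption, as are the i.i.d. Gaussian design, the random support, and the example values of `scal`, `c_max` (standing in for the `c` argument), and `sigma`.

```r
set.seed(1)
n <- 200
p <- 20
k <- 5
scal <- 10   # assumed example value of the Weibull parameter
c_max <- 1   # upper bound of the censoring time, the role played by `c`
sigma <- 1   # assumed value used in m

X <- matrix(rnorm(n * p), n, p)
m <- 5 * sigma * sqrt(2 * log(p) / n)

beta <- numeric(p)
beta[sample(p, k)] <- runif(k, 2 * m, 10 * m)
eta <- as.numeric(X %*% beta)

# T = (-log(S) / exp(X beta))^(1 / scal), with S drawn from Uniform(0, 1)
u <- runif(n)  # assumption: stands in for S(t)
time <- (-log(u) / exp(eta))^(1 / scal)

# Censoring time C ~ Uniform(0, c); status = I(T <= C); observed time = min(T, C)
cens <- runif(n, 0, c_max)
status <- as.integer(time <= cens)
observed <- pmin(time, cens)
```

A status of 1 marks an event observed before censoring, so `observed` equals `time` exactly on those rows.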

`x` |
Design matrix of predictors. |

`y` |
Response variable. |

`Tbeta` |
The coefficients used in the underlying regression model. |

Liyuan Hu, Kangkang Jiang, Yanhang Zhang, Jin Zhu, Canhong Wen and Xueqin Wang.

    library(bestridge)

    # Generate simulated data
    n <- 200
    p <- 20
    k <- 5
    rho <- 0.4
    SNR <- 10
    cortype <- 1
    seed <- 10
    Data <- gen.data(n, p, k, rho, family = "gaussian", cortype = cortype, snr = SNR, seed = seed)

    # Split into training and test sets
    x <- Data$x[1:140, ]
    y <- Data$y[1:140]
    x_new <- Data$x[141:200, ]
    y_new <- Data$y[141:200]

    lambda.list <- exp(seq(log(5), log(0.1), length.out = 10))
    lm.bsrr <- bsrr(x, y, method = "pgsection")

[Package *bestridge* version 1.0.5 Index]