glmnetr.simdata {glmnetr} | R Documentation |
Generate example data
Description
Generate an example data set with specified number of observations, and predictors. The first column in the design matrix is identically equal to 1 for an intercept. Columns 2 to 5 are for the 4 levels of a character variable, 6 to 11 for the 6 levels of another character variable. Columns 12 to 17 are for 3 binomial predictors, again over parameterized. Such over parameterization can cause difficulties with the glmnet() of the 'glmnet' package.
Usage
glmnetr.simdata(nrows = 1000, ncols = 100, beta = NULL, intr = NULL)
Arguments
nrows |
Sample size (>=100) for simulated data, default=1000. |
ncols |
Number of columns (>=17) in design matrix, i.e. predictors, default=100. |
beta |
Vector of length <= ncols for "left most" coefficients. If beta has length < ncols, then the values at length(beta)+1 to ncols are set to 0. Default=NULL, where a beta of length 25 is assigned standard normal values. |
intr |
either NULL for no interactions or a vector of length 3 to impose a product effect as decribed by intr[1]*xs[,3]*xs[,8] + intr[2]*xs[,4]*xs[,16] + intr[3]*xs[,18]*xs[,19] + intr[4]*xs[,21]*xs[,22] |
Value
A list with elements xs for desing matrix, y_ for a quantitative outcome, yt for a survival time, event for an indicator of event (1) or censoring (0), in the Cox proportional hazards survival model setting, yb for yes/no (binomial) outcome data, and beta the beta used in random number generation.
See Also
Examples
sim.data=glmnetr.simdata(nrows=1000, ncols=100, beta=NULL)
# for Cox PH survial model data
xs=sim.data$xs
y_=sim.data$yt
event=sim.data$event
# for linear regression model data
xs=sim.data$xs
y_=sim.data$y_
# for logistic regression model data
xs=sim.data$xs
y_=sim.data$yb