cv.stepreg {glmnetr}    R Documentation

Cross validation informed stepwise regression model fit.

Description

Cross validation informed stepwise regression model fit.

Usage

cv.stepreg(
  xs_cv,
  start_cv = NULL,
  y_cv,
  event_cv,
  family = "cox",
  steps_n = 0,
  folds_n = 10,
  method = "loglik",
  seed = NULL,
  foldid = NULL,
  stratified = 1,
  track = 0
)

Arguments

xs_cv

predictor input - an n by p matrix, where n (rows) is the sample size and p (columns) is the number of predictors. Must be a matrix of complete data (no NA's, no Inf's, etc.), not a data frame.
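As a rough illustration of the matrix requirement, the sketch below turns a small data frame into a numeric matrix with no NA or Inf values, using base R only. The data frame `df` and its columns are hypothetical, and dropping the intercept column from `model.matrix()` is an assumption rather than something this help page specifies.

```r
# Hypothetical data frame standing in for real predictors
df <- data.frame(
  age  = c(50, 61, NA, 47),
  dose = c(1.0, 2.5, 0.5, Inf),
  sex  = factor(c("f", "m", "f", "m"))
)

# Flag rows whose numeric columns are all finite and that have no NA anywhere
num_cols <- vapply(df, is.numeric, logical(1))
finite_rows <- rowSums(!is.finite(as.matrix(df[, num_cols, drop = FALSE]))) == 0
ok <- finite_rows & complete.cases(df)
df_ok <- df[ok, , drop = FALSE]

# Expand factors to indicator columns; dropping the intercept is an assumption
xs <- model.matrix(~ ., data = df_ok)[, -1, drop = FALSE]

# Checks matching the stated requirements: a finite numeric matrix
stopifnot(is.matrix(xs), is.numeric(xs), all(is.finite(xs)))
dim(xs)
```

If rows are dropped this way, the same rows would also have to be dropped from y_cv, event_cv and start_cv so that all inputs keep the same length.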

start_cv

start time, Cox model only - a numeric vector of length equal to the number of patients (n)

y_cv

outcome vector: time, or stop time, for the Cox model; 0 or 1 for binomial (logistic); numeric for gaussian. Must be a vector of length equal to the sample size.

event_cv

event indicator, 1 for event, 0 for censoring, Cox model only. Must be a numeric vector of length equal to the sample size.

family

model family, "cox", "binomial" or "gaussian"

steps_n

Maximum number of steps done in stepwise regression fitting. If 0, then takes the value rank(xs_cv).
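Assuming "rank(xs_cv)" here means the matrix rank of the predictor matrix (base R's `rank()` returns the ranks of values, not a matrix rank), one way to see what the default would be is to compute the rank from a QR decomposition. This is a sketch on simulated data, not necessarily how the package computes it.

```r
set.seed(1)
# Four independent columns plus a fifth that is a linear combination of two others
xs <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
xs <- cbind(xs, xs[, 1] + xs[, 2])

# Numerical matrix rank from the QR decomposition
xs_rank <- qr(xs)$rank

ncol(xs)   # number of predictor columns
xs_rank    # number of linearly independent columns
```

When the rank is lower than the number of columns, the design matrix is not of full rank, which is the situation the track argument is meant to help monitor.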

folds_n

number of folds for cross validation

method

method for choosing model in stepwise procedure, "loglik" or "concordance". Other procedures use the "loglik".

seed

a seed for set.seed() so that the same results can be obtained again. If NULL the program will generate a random seed. Whether specified or NULL, the seed is stored in the output object for future reference.

foldid

a vector of integers to associate each record to a fold. The integers should be between 1 and folds_n.
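A minimal sketch of building such a vector in base R follows. The values of `n` and `folds_n` are made up for illustration, and the commented call at the end only uses arguments listed in Usage.

```r
set.seed(101)
n <- 25        # hypothetical sample size
folds_n <- 3   # number of folds

# Repeat 1..folds_n to length n, then shuffle so fold sizes differ by at most one
foldid <- sample(rep(seq_len(folds_n), length.out = n))

table(foldid)  # fold sizes

# The vector could then be passed along, for example:
# fit <- cv.stepreg(xs, NULL, y_, event, folds_n = folds_n, foldid = foldid)
```

Supplying foldid explicitly can be useful when several models should be compared on identical folds.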

stratified

whether folds are to be constructed stratified on an indicator outcome: 1 (default) for yes, 0 for no. Pertains to the event variable for the "cox" family and to y_cv for the "binomial" family.
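To illustrate what stratifying on an indicator outcome means, the sketch below assigns folds separately within the event and non-event groups so that each fold receives a similar share of events. It is a generic base R illustration on simulated data and is not taken from the package's internal fold construction.

```r
set.seed(202)
event <- rbinom(60, 1, 0.3)   # simulated 0/1 indicator
folds_n <- 3

foldid <- integer(length(event))
for (lev in unique(event)) {
  idx <- which(event == lev)
  # Balanced fold labels within this stratum, in random order
  foldid[idx] <- sample(rep(seq_len(folds_n), length.out = length(idx)))
}

# Rows are strata, columns are folds; counts within a row differ by at most one
table(event, foldid)
```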

track

indicates whether or not to update progress in the console. The default of 0 suppresses these updates; 1 provides them. In fitting clinical data with a design matrix that is not of full rank, we have found some R packages to take a very long time. Therefore we allow the user to track the program's progress and judge whether things are moving forward or if the process should be stopped.

Value

cross validation informed stepwise regression model fit, tuned by number of model terms or p-value for inclusion.

See Also

predict.cv.stepreg, summary.cv.stepreg, stepreg, aicreg, nested.glmnetr

Examples

set.seed(955702213)
sim.data=glmnetr.simdata(nrows=1000, ncols=100, beta=c(0,1,1))
# this gives a more interesting case but takes longer to run
xs=sim.data$xs           
# this will work numerically as an example 
xs=sim.data$xs[,c(2,3,50:55)] 
dim(xs)
y_=sim.data$yt 
event=sim.data$event
# for this example we use small numbers for steps_n and folds_n to shorten run time 
cv.stepreg.fit = cv.stepreg(xs, NULL, y_, event, steps_n=10, folds_n=3, track=0)
summary(cv.stepreg.fit)


[Package glmnetr version 0.5-2 Index]