model.formulas.update {CICI}    R Documentation

Update model formulas based on variable screening

Description

Wrapper function that applies variable screening to all models generated by make.model.formulas and returns the updated formulas in the format required by gformula.

Usage

model.formulas.update(formulas, X, screening = screen.glmnet.cramer,
                      with.s = FALSE, by = NA, ...)

Arguments

formulas

A named list of length 4 containing model formulas for all Y-, L-, A- and C-nodes. These are typically the formulas returned by make.model.formulas.

X

A data frame on which the model formulas are to be evaluated.

screening

A screening function. Default is screen.glmnet.cramer, see Details below.

with.s

Logical. If TRUE, a spline, i.e. s(), will be added to all continuous variables.

by

A character vector specifying the variables with which to multiply the smooth (if with.s=TRUE).

...

Optional arguments to be passed to the screening algorithm.

Details

The default screening algorithm, screen.glmnet.cramer, uses the LASSO for variable screening and falls back on Cramer's V, computed on the categorized version of all variables, if the LASSO fails.

It is possible to provide user-defined screening algorithms. A user-defined algorithm should take the data as its first argument and one model formula (i.e. one entry of the list in model.formulas) as its second argument, and return a vector of strings containing the names of the variables that remain after screening.

Another screening algorithm available in the package is screen.cramersv, which categorizes all variables, calculates their association with the outcome based on Cramer's V and selects the 4 variables with the strongest associations (this number can be changed with the option nscreen). The manual provides more information.
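
The interface for a user-defined screening algorithm described above can be sketched as follows. This is a minimal illustration in base R, not one of the package's own algorithms: the function name screen.abscor, the argument names dat, form and nkeep, and the correlation-based rule are all assumptions made for this example. It also assumes that the formula may arrive either as a string or as a formula object.

```r
# Hypothetical screening function (not part of CICI): keeps the `nkeep`
# numeric covariates with the largest absolute Pearson correlation with
# the outcome. First argument: data; second argument: one model formula.
screen.abscor <- function(dat, form, nkeep = 2, ...) {
  form    <- stats::as.formula(form)            # accept a string or a formula
  vars    <- all.vars(form)
  outcome <- vars[1]                            # left-hand side of the formula
  covars  <- setdiff(vars[-1], outcome)
  covars  <- covars[covars %in% names(dat)]     # drop symbols such as "."
  is.num  <- vapply(dat[covars], is.numeric, logical(1))
  covars  <- covars[is.num]                     # this sketch ignores factors
  if (length(covars) == 0) return(character(0))
  score <- vapply(covars, function(v) {
    abs(stats::cor(as.numeric(dat[[outcome]]), dat[[v]],
                   use = "complete.obs"))
  }, numeric(1))
  # sort() drops NA scores; return the names of the strongest covariates
  head(names(sort(score, decreasing = TRUE)), nkeep)
}

# Toy check on simulated data in which only x1 is related to y
set.seed(1)
toy   <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
toy$y <- 2 * toy$x1 + rnorm(100)
kept  <- screen.abscor(toy, "y ~ x1 + x2 + x3", nkeep = 1)
kept
```

Such a function would then be supplied through the screening argument, e.g. model.formulas.update(formulas, X, screening = screen.abscor). Whether the wrapper expects anything beyond the documented contract (data first, formula second, character vector returned) is not stated here, so a custom function should be checked against the manual before use.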

The fit of the updated models can be evaluated with fit.updated.formulas.

Value

A list of length 4 containing the updated model formulas:

Lnames

A vector of strings containing updated model formulas for all L nodes.

Ynames

A vector of strings containing updated model formulas for all Y nodes.

Anames

A vector of strings containing updated model formulas for all A nodes.

Cnames

A vector of strings containing updated model formulas for all C nodes.

See Also

make.model.formulas, model.update, fit.updated.formulas

Examples


data(EFV)

# first: generate generic model formulas
m <- make.model.formulas(X=EFV,
                         Lnodes  = c("adherence.1","weight.1",
                                     "adherence.2","weight.2",
                                     "adherence.3","weight.3",
                                     "adherence.4","weight.4"
                                    ),
                         Ynodes  = c("VL.0","VL.1","VL.2","VL.3","VL.4"),
                         Anodes  = c("efv.0","efv.1","efv.2","efv.3","efv.4"),
                         evaluate=FALSE) 
                         
# second: update these model formulas based on variable screening with LASSO
glmnet.formulas <-  model.formulas.update(m$model.names, EFV)
glmnet.formulas 


# third: use these models for estimation
est <- gformula(X=EFV,
                Lnodes  = c("adherence.1","weight.1",
                            "adherence.2","weight.2",
                            "adherence.3","weight.3",
                            "adherence.4","weight.4"
                ),
                Ynodes  = c("VL.0","VL.1","VL.2","VL.3","VL.4"),
                Anodes  = c("efv.0","efv.1","efv.2","efv.3","efv.4"),
                Yform=glmnet.formulas$Ynames, Lform=glmnet.formulas$Lnames,
                abar=seq(0,2,1)
)
est


[Package CICI version 0.9.1 Index]