| model.formulas.update {CICI} | R Documentation |
Update model formulas based on variable screening
Description
Wrapper function that applies variable screening to all models generated by make.model.formulas and returns the updated formulas in the format required by gformula.
Usage
model.formulas.update(formulas, X, screening = screen.glmnet.cramer,
                      with.s = FALSE, by = NA, ...)
Arguments
formulas |
A named list of length 4 containing model formulas for all Y, L, A and C nodes. These are likely formulas returned from |
X |
A data frame on which the model formulas are to be evaluated. |
screening |
A screening function. Default is |
with.s |
Logical. If TRUE, a spline, i.e. s(), will be added to all continuous variables. |
by |
A character vector specifying the variables with which to multiply the smooth (if |
... |
Optional arguments to be passed to the screening algorithm. |
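To illustrate what the with.s option describes, the following base R sketch wraps continuous variables of a formula string in s(), the smooth-term syntax used by mgcv. It is not the package's implementation: the helper name add.s, the argument min.unique, and the rule used to decide that a variable is continuous (numeric with at least 10 distinct values) are all assumptions made for this example.

```r
# Hypothetical helper (not part of CICI): wrap continuous variables in s().
# formula:    a model formula given as a string or formula object
# dat:        data frame used to decide which variables are continuous
# min.unique: assumed threshold of distinct values for "continuous"
add.s <- function(formula, dat, min.unique = 10) {
  f       <- stats::as.formula(formula)
  outcome <- all.vars(f)[1]
  rhs     <- attr(stats::terms(f), "term.labels")
  if (length(rhs) == 0) return(paste(outcome, "~ 1"))
  is.cont <- vapply(rhs, function(v) {
    v %in% names(dat) && is.numeric(dat[[v]]) &&
      length(unique(dat[[v]])) >= min.unique
  }, logical(1))
  # only continuous variables receive a smooth term
  rhs[is.cont] <- paste0("s(", rhs[is.cont], ")")
  paste(outcome, "~", paste(rhs, collapse = " + "))
}

# Toy data (simulated, for illustration only)
set.seed(1)
toy.s <- data.frame(y   = rnorm(50),
                    age = rnorm(50, mean = 40, sd = 10),
                    sex = rbinom(50, 1, 0.5))
res.s <- add.s("y ~ age + sex", toy.s)
res.s
```

In this toy example only age is wrapped, because sex takes just two distinct values.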
Details
The default screening algorithm uses LASSO for variable screening (and Cramer's V for the categorized version of all variables if LASSO fails). It is possible to provide user-specific screening algorithms.
User-specific algorithms should take the data as the first argument and one model formula (i.e. one entry of the list of model formulas) as the second argument, and return a vector of strings containing the names of the variables that remain after screening.
Another screening algorithm available in the package is screen.cramersv, which categorizes all variables, calculates their association with the outcome based on Cramer's V, and selects the 4 variables with the strongest associations (this number can be changed with the option nscreen).
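The interface for user-specific screening algorithms described above (data first, one model formula second, a character vector of retained variable names as the return value) can be sketched as follows. The function screen.topcor is hypothetical and not part of CICI. It assumes that the formula arrives as a string or formula object and that the outcome is numeric, it ignores non-numeric predictors, and it uses the argument name nscreen only by analogy with screen.cramersv.

```r
# Hypothetical user-defined screening function (not part of CICI).
# dat:     data frame (first argument)
# formula: one model formula, as string or formula object (assumption)
# nscreen: number of variables to keep (assumed argument name)
screen.topcor <- function(dat, formula, nscreen = 2, ...) {
  vars       <- all.vars(stats::as.formula(formula))
  outcome    <- vars[1]
  candidates <- intersect(setdiff(vars[-1], outcome), names(dat))
  # this simple sketch only ranks numeric predictors
  is.num     <- vapply(candidates, function(v) is.numeric(dat[[v]]), logical(1))
  candidates <- candidates[is.num]
  if (length(candidates) == 0) return(character(0))
  y <- as.numeric(dat[[outcome]])
  # absolute Spearman correlation of each candidate with the outcome
  score <- vapply(candidates, function(v) {
    abs(suppressWarnings(
      stats::cor(dat[[v]], y, method = "spearman",
                 use = "pairwise.complete.obs")))
  }, numeric(1))
  score[is.na(score)] <- 0
  names(sort(score, decreasing = TRUE))[seq_len(min(nscreen, length(score)))]
}

# Toy data (simulated): y depends on x1 and x3 but not on x2
set.seed(1)
toy.c <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
toy.c$y <- 2 * toy.c$x1 - 1.5 * toy.c$x3 + rnorm(200, sd = 0.5)
kept <- screen.topcor(toy.c, "y ~ x1 + x2 + x3", nscreen = 2)
kept
```

Given the documented screening argument, such a function would presumably be supplied as model.formulas.update(formulas, X, screening = screen.topcor), with extra options such as nscreen forwarded through the ... argument.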
The manual provides more information.
The fits of the updated models can be evaluated with fit.updated.formulas.
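The Cramer's V screening described for screen.cramersv (categorize all variables, measure their association with the outcome, keep the strongest ones) can be illustrated with base R. This is a sketch of the general technique, not the package's code: the helper names cramers.v, categorize and rank.by.cramersv are hypothetical, and binning numeric variables into quartile groups is an assumption.

```r
# Cramer's V between two categorical vectors: sqrt(chi2 / (n * (min(r, c) - 1)))
cramers.v <- function(x, y) {
  tab <- table(droplevels(factor(x)), droplevels(factor(y)))
  if (min(dim(tab)) < 2) return(0)
  chi2 <- suppressWarnings(stats::chisq.test(tab, correct = FALSE)$statistic)
  as.numeric(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

# Categorize a variable; numeric variables are cut at quantiles (assumption)
categorize <- function(v, bins = 4) {
  if (is.numeric(v) && length(unique(v)) > bins) {
    br <- unique(stats::quantile(v, probs = seq(0, 1, length.out = bins + 1),
                                 na.rm = TRUE))
    cut(v, breaks = br, include.lowest = TRUE)
  } else {
    factor(v)
  }
}

# Rank all other columns by Cramer's V with the outcome and keep the top nscreen
rank.by.cramersv <- function(dat, outcome, nscreen = 4) {
  preds <- setdiff(names(dat), outcome)
  y     <- categorize(dat[[outcome]])
  v     <- vapply(preds, function(p) cramers.v(categorize(dat[[p]]), y),
                  numeric(1))
  names(sort(v, decreasing = TRUE))[seq_len(min(nscreen, length(v)))]
}

# Toy data (simulated): y is driven by x1 only
set.seed(2)
toy.v <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
toy.v$y <- toy.v$x1 + rnorm(300, sd = 0.3)
top1 <- rank.by.cramersv(toy.v, "y", nscreen = 1)
top1
```

Cramer's V ranges from 0 (no association) to 1 (perfect association), which makes it usable for ranking variables of mixed type once they have been categorized.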
Value
A list of length 4 containing the updated model formulas:
Lnames |
A vector of strings containing updated model formulas for all L nodes. |
Ynames |
A vector of strings containing updated model formulas for all Y nodes. |
Anames |
A vector of strings containing updated model formulas for all A nodes. |
Cnames |
A vector of strings containing updated model formulas for all C nodes. |
See Also
make.model.formulas, model.update, fit.updated.formulas
Examples
data(EFV)
# first: generate generic model formulas
m <- make.model.formulas(X=EFV,
                         Lnodes = c("adherence.1","weight.1",
                                    "adherence.2","weight.2",
                                    "adherence.3","weight.3",
                                    "adherence.4","weight.4"
                         ),
                         Ynodes = c("VL.0","VL.1","VL.2","VL.3","VL.4"),
                         Anodes = c("efv.0","efv.1","efv.2","efv.3","efv.4"),
                         evaluate=FALSE)
# second: update these model formulas based on variable screening with LASSO
glmnet.formulas <- model.formulas.update(m$model.names, EFV)
glmnet.formulas
# third: use these models for estimation
est <- gformula(X=EFV,
                Lnodes = c("adherence.1","weight.1",
                           "adherence.2","weight.2",
                           "adherence.3","weight.3",
                           "adherence.4","weight.4"
                ),
                Ynodes = c("VL.0","VL.1","VL.2","VL.3","VL.4"),
                Anodes = c("efv.0","efv.1","efv.2","efv.3","efv.4"),
                Yform=glmnet.formulas$Ynames, Lform=glmnet.formulas$Lnames,
                abar=seq(0,2,1)
)
est