create_formula {sgboost} | R Documentation
Create a sparse-group boosting formula
Description
Creates an mboost formula for fitting a sparse-group boosting model based on
boosted ridge regression with mixing parameter alpha. The formula consists of a
group baselearner part with degrees of freedom 1 - alpha and individual
baselearners with degrees of freedom alpha.
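To make the split of the degrees of freedom concrete, the following base-R sketch assembles a formula string by hand. It is only an illustration under stated assumptions: `sketch_formula` is a hypothetical helper, not part of sgboost, and the exact text that create_formula() produces may differ. The only external call it names is mboost's `bols()` with its `df` and `intercept` arguments.

```r
# Hypothetical helper, not part of sgboost: assembles the formula text by hand.
sketch_formula <- function(alpha, group_df, outcome_name = "y") {
  vars <- as.character(group_df$var_name)
  terms <- character(0)

  # Individual part: one baselearner per variable, each with df = alpha
  if (alpha > 0) {
    terms <- c(
      terms,
      paste0("bols(", vars, ", df = ", alpha, ", intercept = FALSE)")
    )
  }

  # Group part: one baselearner per group, each with df = 1 - alpha
  if (alpha < 1) {
    vars_by_group <- split(vars, group_df$group_name)
    group_terms <- vapply(
      vars_by_group,
      function(v) {
        paste0(
          "bols(", paste(v, collapse = ", "),
          ", df = ", 1 - alpha, ", intercept = FALSE)"
        )
      },
      character(1)
    )
    terms <- c(terms, unname(group_terms))
  }

  paste(outcome_name, "~", paste(terms, collapse = " + "))
}

group_df <- data.frame(
  group_name = c(1, 1, 2),
  var_name = c("x1", "x2", "x3")
)

# Three individual terms plus two group terms
cat(sketch_formula(alpha = 0.25, group_df = group_df), "\n")
```

With alpha = 0 the sketch emits only the group terms, and with alpha = 1 only the individual terms, mirroring the behaviour described for the alpha argument.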
Groups should be defined through group_df. The corresponding modeling data
should not contain categorical variables with more than two categories,
as such variables are then treated as a group only.
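One possible way to prepare data that contains a factor with more than two levels is to dummy-code it before building group_df. This workaround is an assumption on our part and is not prescribed by the package; the column names (`x1`, `color`) and group labels are hypothetical. The sketch uses only base R.

```r
# Toy data with a three-level factor (hypothetical column names)
df <- data.frame(
  x1 = c(0.5, -1.2, 0.3, 1.1),
  color = factor(c("red", "green", "blue", "green"))
)

# model.matrix() expands the factor into 0/1 columns;
# the first column is the intercept and is dropped here
dummies <- model.matrix(~ color, data = df)[, -1, drop = FALSE]

# Numeric-only modeling data: x1 plus one column per non-reference level
df_numeric <- cbind(df["x1"], as.data.frame(dummies))

# List each dummy column as its own variable, all sharing one group
group_df <- data.frame(
  group_name = c("g1", rep("color", ncol(dummies))),
  var_name = c("x1", colnames(dummies))
)
group_df
```

Each dummy column can then enter as an individual baselearner, while the shared group label keeps the levels together in the group baselearner.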
Usage
create_formula(
alpha = 0.3,
group_df = NULL,
blearner = "bols",
outcome_name = "y",
group_name = "group_name",
var_name = "var_name",
intercept = FALSE
)
Arguments
alpha | Numeric mixing parameter. For alpha = 0 only group baselearners are defined, and for alpha = 1 only individual baselearners are defined.
group_df | Input data.frame containing the variable names and their group structure.
blearner | Type of baselearner. Default is "bols".
outcome_name | String indicating the name of the dependent variable. Default is "y".
group_name | Name of the column in group_df indicating the group structure of the variables. Default is "group_name".
var_name | Name of the column in group_df containing the variable names to be used as predictors. Default is "var_name".
intercept | Logical, should an intercept be used? Default is FALSE.
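The naming arguments matter when group_df or the modeling data do not use the default column names. The sketch below assumes sgboost is installed and uses only the arguments listed in Usage; the data frame `my_groups` and its column names `block` and `feature`, as well as the outcome name `response`, are hypothetical.

```r
library(sgboost)

# Hypothetical grouping table whose columns do not use the default names
my_groups <- data.frame(
  block = c("demo", "demo", "lab", "lab"),
  feature = c("age", "bmi", "glucose", "insulin")
)

f <- create_formula(
  alpha = 0.5,
  group_df = my_groups,
  outcome_name = "response", # name of the outcome column in the modeling data
  group_name = "block",      # column of my_groups holding the group labels
  var_name = "feature"       # column of my_groups holding the predictor names
)

# Inspect the result before passing it to mboost::mboost()
f
```

The modeling data passed to mboost::mboost() would then need columns named response, age, bmi, glucose and insulin.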
Value
Character containing the formula to be passed to mboost::mboost(),
yielding the sparse-group boosting model for a given value of the mixing parameter alpha.
Examples
library(mboost)
library(sgboost)
library(dplyr)

set.seed(1)

# Simulate five predictors
df <- data.frame(
  x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100),
  x4 = rnorm(100), x5 = runif(100)
)

# Standardize every predictor
df <- df %>%
  mutate_all(function(x) {
    as.numeric(scale(x))
  })

# The outcome depends on x1, x4 and x5 only
df$y <- df$x1 + df$x4 + df$x5

# Assign x1 to x3 to group 1 and x4, x5 to group 2
group_df <- data.frame(
  group_name = c(1, 1, 1, 2, 2),
  var_name = c("x1", "x2", "x3", "x4", "x5")
)

# Build the formula and fit the model
sgb_formula <- create_formula(alpha = 0.3, group_df = group_df)
sgb_model <- mboost(formula = sgb_formula, data = df)
summary(sgb_model)