deepregression {deepregression}

R Documentation

Fitting Semi-Structured Deep Distributional Regression

Description

Fitting Semi-Structured Deep Distributional Regression

Usage

deepregression(
  y,
  list_of_formulas,
  list_of_deep_models = NULL,
  family = "normal",
  data,
  tf_seed = as.integer(1991 - 5 - 4),
  return_prepoc = FALSE,
  subnetwork_builder = subnetwork_init,
  model_builder = keras_dr,
  fitting_function = utils::getFromNamespace("fit.keras.engine.training.Model",
    "keras"),
  additional_processors = list(),
  penalty_options = penalty_control(),
  orthog_options = orthog_control(),
  weight_options = weight_control(),
  formula_options = form_control(),
  output_dim = 1L,
  verbose = FALSE,
  ...
)

Arguments

`y`	response variable
`list_of_formulas`	a named list of right-hand-side formulas, one for each parameter of the distribution specified in `family`; set to `~ 1` if the parameter should be treated as constant. Use the `s()`-notation from `mgcv` to specify non-linear structured effects and `d(...)` for deep learning predictors (predictors in brackets are separated by commas), where `d` can be replaced by any of the names in `list_of_deep_models`, e.g., `~ 1 + s(x) + my_deep_mod(a,b,c)`, where `my_deep_mod` is the name of the neural net specified in `list_of_deep_models` and `a,b,c` are features modeled via this network.
`list_of_deep_models`	a named list of functions, each specifying a keras model. See the examples for more details.
`family`	a character specifying the distribution. For information on possible distributions and their parameters, see `make_tfd_dist`. Can also be a custom distribution.
`data`	data.frame or named list with input features
`tf_seed`	a seed for TensorFlow (only works with R version >= 2.2.0)
`return_prepoc`	logical; if TRUE only the pre-processed data and layers are returned (default FALSE).
`subnetwork_builder`	function to build each subnetwork (network for each distribution parameter; per default `subnetwork_init`). Can also be a list of the same size as `list_of_formulas`.
`model_builder`	function to build the model based on additive predictors (per default `keras_dr`). In order to work with the methods defined for the class `deepregression`, the model should behave like a keras model.
`fitting_function`	function to fit the instantiated model when calling `fit`. Per default the keras `fit` function.
`additional_processors`	a named list with additional processors to convert the formula(s). Can have an attribute `"controls"` to pass additional controls.
`penalty_options`	options for smoothing and penalty terms defined by `penalty_control`
`orthog_options`	options for the orthogonalization defined by `orthog_control`
`weight_options`	options for layer weights defined by `weight_control`
`formula_options`	options for formula parsing (mainly used to make calculations more efficient)
`output_dim`	dimension of the output, per default 1L
`verbose`	logical; whether to print progress of model initialization to console
`...`	further arguments passed to the `model_builder` function
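
The relationship between `list_of_formulas`, `list_of_deep_models` and `data` described above can be checked before any model is built. The following is a minimal base R sketch, not part of the package: it assumes the `loc`/`scale` parameter names and the column names used in the Examples section, and the helper variables `loc_terms`, `needed` and `absent` are hypothetical names introduced here for illustration. It does not call `deepregression()` and needs neither TensorFlow nor keras.

```r
# Sketch only: builds and inspects the inputs, does not fit anything.
# Toy data with the same column names as in the Examples section.
set.seed(1)
n <- 100
data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), xa = rnorm(n))

# One right-hand-side formula per parameter of family = "normal".
# `deep_model` is only a label at this point; it has to match a name in
# list_of_deep_models once the model is actually built.
list_of_formulas <- list(
  loc   = ~ 1 + s(xa) + x1 + deep_model(x1, x2, x3),
  scale = ~ 1
)

# Term labels as R parses them (no term is evaluated here, so neither
# mgcv nor the deep model function needs to be available).
loc_terms <- attr(terms(list_of_formulas$loc), "term.labels")
print(loc_terms)

# Features referenced by any formula, and those absent from `data`.
needed <- unique(unlist(lapply(list_of_formulas, all.vars)))
absent <- setdiff(needed, names(data))
print(absent)
```

An empty `absent` vector indicates that every feature named in a formula is available in `data`; the same check works when `data` is a named list rather than a data.frame.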

References

Ruegamer, D. et al. (2023): deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression. doi:10.18637/jss.v105.i02.

Examples

library(deepregression)

n <- 1000
data <- data.frame(matrix(rnorm(4*n), nrow = n, ncol = 4))
colnames(data) <- c("x1","x2","x3","xa")
formula <- ~ 1 + deep_model(x1,x2,x3) + s(xa) + x1

deep_model <- function(x) x %>%
  layer_dense(units = 32, activation = "relu", use_bias = FALSE) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 1, activation = "linear")

y <- rnorm(n) + data$xa^2 + data$x1

mod <- deepregression(
  list_of_formulas = list(loc = formula, scale = ~ 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model)
)

if(!is.null(mod)){

  # train for more than 10 epochs to get a better model
  mod %>% fit(epochs = 10, early_stopping = TRUE)
  mod %>% fitted() %>% head()
  cvres <- mod %>% cv()
  mod %>% get_partial_effect(name = "s(xa)")
  mod %>% coef()
  mod %>% plot()

}

mod <- deepregression(
  list_of_formulas = list(loc = ~ 1 + s(xa) + x1, scale = ~ 1,
                          dummy = ~ -1 + deep_model(x1,x2,x3) %OZ% 1),
  data = data, y = y,
  list_of_deep_models = list(deep_model = deep_model),
  mapping = list(1,2,1:2)
)

[Package deepregression version 1.0.0]