dnn {cito}	R Documentation

DNN

Description

Fits a custom deep neural network. dnn() supports the formula syntax and allows the neural network to be customized to a large degree. So far, only Multilayer Perceptrons are supported.

Usage

dnn(
  formula,
  data = NULL,
  loss = c("mae", "mse", "softmax", "cross-entropy", "gaussian", "binomial", "poisson"),
  hidden = c(10L, 10L, 10L),
  activation = c("relu", "leaky_relu", "tanh", "elu", "rrelu", "prelu", "softplus",
    "celu", "selu", "gelu", "relu6", "sigmoid", "softsign", "hardtanh", "tanhshrink",
    "softshrink", "hardshrink", "log_sigmoid"),
  validation = 0,
  bias = TRUE,
  lambda = 0,
  alpha = 0.5,
  dropout = 0,
  optimizer = c("adam", "adadelta", "adagrad", "rmsprop", "rprop", "sgd"),
  lr = 0.01,
  batchsize = 32L,
  shuffle = FALSE,
  epochs = 32,
  plot = TRUE,
  verbose = TRUE,
  lr_scheduler = NULL,
  device = c("cpu", "cuda"),
  early_stopping = FALSE
)

Arguments

formula

an object of class "formula": a description of the model that should be fitted

data

matrix or data.frame

loss

loss function for which the network is optimized. Can also be a distribution from the stats package or a custom loss function

hidden

hidden units in each hidden layer; the length of hidden corresponds to the number of hidden layers

activation

activation functions; can be of length one, or a vector of different activation functions, one for each layer

validation

percentage of data set that should be taken as validation set (chosen randomly)

bias

whether to use biases in the layers; can be of length one, or a vector of logicals of length number of hidden layers + 1 (for the output layer), one for each layer.

lambda

strength of regularization: lambda penalty, \lambda * (L1 + L2) (see alpha)

alpha

adds L1/L2 regularization to training: (1 - \alpha) * |weights| + \alpha * ||weights||^2 is added to the loss for each layer. Can be a single value between 0 and 1 or a vector of alpha values if layers should be regularized differently.

dropout

dropout rate, probability of a node getting left out during training (see nn_dropout)

optimizer

optimizer used for training the network; for further adjustments to the optimizer see config_optimizer

lr

learning rate given to optimizer

batchsize

number of samples used to calculate one optimization step

shuffle

if TRUE, data in each batch gets reshuffled every epoch

epochs

number of epochs to train for

plot

plot training loss

verbose

print training and validation loss for each epoch

lr_scheduler

learning rate scheduler created with config_lr_scheduler

device

device on which the network should be trained.

early_stopping

if set to an integer, training stops early when the validation loss is worse than it was that many epochs earlier (see the sketch below).
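
The following sketch (not run) illustrates how several of the arguments above can be combined in a single call. The variable name and all hyperparameter values are purely illustrative assumptions, not recommendations:

fit <- dnn(Sepal.Length ~ .,
           data = datasets::iris,
           hidden = c(20L, 20L),        # two hidden layers with 20 units each
           activation = "relu",         # same activation for all layers
           validation = 0.2,            # hold out 20% of the rows for validation
           early_stopping = 5,          # stop if validation loss worsens over 5 epochs
           lambda = 0.01, alpha = 0.5,  # elastic-net penalty, equal L1/L2 mix
           lr = 0.01, epochs = 100)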

Details

In a Multilayer Perceptron (MLP) network every neuron is connected to all neurons of the previous layer and to all neurons of the following layer. The value of each neuron is calculated as:

a(\sum_j w_j * a_j)

where w_j is the weight of the connection from neuron j to the current neuron, a_j is the value of neuron j, and a() is the activation function, e.g. relu(x) = max(0, x). Dropout and elastic net regularization are available as regularization methods; they help to avoid overfitting.
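
As a purely illustrative sketch of the formula above (the helper functions below are hypothetical and not part of cito), the value of a single neuron with a ReLU activation can be computed as:

# Hypothetical helpers: the value of one neuron is the activation function
# applied to the weighted sum of the values of the previous layer
relu <- function(x) pmax(0, x)
neuron_value <- function(a_prev, w) relu(sum(w * a_prev))

a_prev <- c(0.2, -1.5, 0.7)   # values of the neurons in the previous layer
w      <- c(0.5,  0.1, -0.3)  # weights of the incoming connections
neuron_value(a_prev, w)       # a(sum_j w_j * a_j)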

Training on graphics cards: If you want to train on a CUDA device, you have to install the NVIDIA CUDA toolkit version 11.3 and cuDNN 8.4 beforehand. Make sure that exactly these versions are installed, since it does not work with other versions. For more information see mlverse: 'torch'
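
Once CUDA and cuDNN are set up, training on the GPU only requires switching the device argument. A minimal sketch, assuming a working CUDA installation:

# Only attempt GPU training if torch reports that CUDA is available
if (torch::cuda_is_available()) {
  nn.fit <- dnn(Sepal.Length ~ ., data = datasets::iris, device = "cuda")
}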

Value

an S3 object of class "cito.dnn" is returned. It is a list containing everything there is to know about the model and its training process. The list consists of the following attributes:

net

An object of classes "nn_sequential" and "nn_module" from the torch package; it represents the core object of this workflow.

call

The original function call

loss

A list which contains relevant information for the target variable and the used loss function

data

Contains data used for training the model

weights

List of weights for each training epoch

use_model_epoch

Integer defining which training epoch's model is used for prediction.

loaded_model_epoch

Integer indicating which epoch's model is currently loaded into model$net.

model_properties

A list of properties of the neural network, contains number of input nodes, number of output nodes, size of hidden layers, activation functions, whether bias is included and if dropout layers are included.

training_properties

A list of all training parameters that were used the last time the model was trained: learning rate, information about a learning rate scheduler, information about the optimizer, number of epochs, whether early stopping was used, whether plotting was active, lambda and alpha for L1/L2 regularization, batch size, shuffle, whether the data set was split into validation and training sets, the formula used for training, and the epoch at which training stopped.

losses

A data.frame containing training and validation losses of each epoch
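
A short sketch of how the returned list can be inspected after fitting (assuming nn.fit is a fitted model, as in the Examples below):

# Training and validation losses per epoch are stored in the $losses data.frame
head(nn.fit$losses)

# Properties of the fitted network and of the last training run
nn.fit$model_properties
nn.fit$training_properties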

See Also

predict.citodnn, plot.citodnn, coef.citodnn, print.citodnn, summary.citodnn, continue_training, analyze_training, PDP, ALE

Examples


if(torch::torch_is_installed()){
library(cito)

set.seed(222)
validation_set <- sample(seq_len(nrow(datasets::iris)), 25)

# Build and train the network
nn.fit <- dnn(Sepal.Length~., data = datasets::iris[-validation_set,])

# Structure of the neural network
print(nn.fit)

# Use model on validation set
predictions <- predict(nn.fit, iris[validation_set,])

# Scatterplot
plot(iris[validation_set,]$Sepal.Length, predictions)
# MAE
mean(abs(predictions - iris[validation_set,]$Sepal.Length))

# Get variable importances
summary(nn.fit)

# Partial dependencies
PDP(nn.fit, variable = "Petal.Length")

# Accumulated local effect plots
ALE(nn.fit, variable = "Petal.Length")

}


[Package cito version 1.0.0]