dnn {cito} R Documentation

## DNN

### Description

fits a custom deep neural network. dnn() supports the formula syntax and allows extensive customization of the neural network. So far, only Multilayer Perceptrons are supported.

### Usage

dnn(
formula,
data = NULL,
loss = c("mae", "mse", "softmax", "cross-entropy", "gaussian", "binomial", "poisson"),
hidden = c(10L, 10L, 10L),
activation = c("relu", "leaky_relu", "tanh", "elu", "rrelu", "prelu", "softplus",
"celu", "selu", "gelu", "relu6", "sigmoid", "softsign", "hardtanh", "tanhshrink",
"softshrink", "hardshrink", "log_sigmoid"),
validation = 0,
bias = TRUE,
lambda = 0,
alpha = 0.5,
dropout = 0,
lr = 0.01,
batchsize = 32L,
shuffle = FALSE,
epochs = 32,
plot = TRUE,
verbose = TRUE,
lr_scheduler = NULL,
device = c("cpu", "cuda"),
early_stopping = FALSE
)


### Arguments

- `formula`: an object of class `"formula"`: a description of the model to be fitted
- `data`: matrix or data.frame
- `loss`: loss function the network should be optimized for. Can also be a distribution from the stats package or a custom function
- `hidden`: hidden units per layer; the length of `hidden` corresponds to the number of layers
- `activation`: activation functions; can be of length one, or a vector with a different activation function for each layer
- `validation`: fraction of the data set that should be used as validation set (chosen randomly)
- `bias`: whether to use biases in the layers; can be of length one, or a vector of logicals of length number of hidden layers + 1 (for the last layer)
- `lambda`: strength of regularization: lambda penalty, \lambda * (L1 + L2) (see `alpha`)
- `alpha`: adds L1/L2 regularization to training: (1 - \alpha) * |weights| + \alpha * ||weights||^2 is added for each layer. Can be a single value between 0 and 1, or a vector of alpha values if layers should be regularized differently
- `dropout`: dropout rate, the probability of a node being dropped during training (see `nn_dropout`)
- `optimizer`: optimizer used for training the network; for further adjustments to the optimizer see `config_optimizer`
- `lr`: learning rate passed to the optimizer
- `batchsize`: number of samples used to calculate one learning rate step
- `shuffle`: if TRUE, the data is reshuffled into new batches every epoch
- `epochs`: number of epochs to train for
- `plot`: plot training loss
- `verbose`: print training and validation loss for each epoch
- `lr_scheduler`: learning rate scheduler created with `config_lr_scheduler`
- `device`: device on which the network should be trained
- `early_stopping`: if set to an integer, training stops if the validation loss has not improved within that many past epochs
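Several of these arguments are commonly combined. A minimal sketch of a customized call (the argument values here are illustrative, not recommendations, and the torch backend must be installed):

```r
# Sketch: customizing architecture and training arguments
if (torch::torch_is_installed()) {
  library(cito)
  fit <- dnn(
    Sepal.Length ~ .,
    data = datasets::iris,
    hidden = c(20L, 20L),           # two hidden layers with 20 units each
    activation = c("relu", "tanh"), # one activation function per layer
    validation = 0.2,               # 20% of the data held out for validation
    lr = 0.005,
    epochs = 50L,
    early_stopping = 5L             # stop if validation loss stalls for 5 epochs
  )
}
```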

### Details

In a Multilayer Perceptron (MLP) network, every neuron is connected to all neurons of the previous layer and to all neurons of the following layer. The value of each neuron is calculated as:

a(\sum_j{w_j * a_j})

where w_j is the weight from neuron j to the current neuron, a_j is the value of neuron j, and a() is the activation function, e.g. relu(x) = max(0, x). Dropout and elastic net regularization are available as regularization methods; they help you avoid overfitting.
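The formula above can be worked through by hand. This sketch computes the value of one neuron from made-up weights and inputs, using relu as the activation function:

```r
# Sketch: the value of a single neuron, computed by hand
relu <- function(x) pmax(0, x)        # relu(x) = max(0, x), vectorized

w      <- c(0.5, -0.2, 0.1)           # weights w_j from the previous layer
a_prev <- c(1, 2, 3)                  # values a_j of the previous layer's neurons

neuron_value <- relu(sum(w * a_prev)) # a(sum_j w_j * a_j) = relu(0.4) = 0.4
```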

Training on graphics cards: If you want to train on a CUDA device, you have to install the NVIDIA CUDA toolkit version 11.3 and cuDNN 8.4 beforehand. Make sure that you have exactly these versions installed, since it does not work with other versions. For more information see mlverse: 'torch'
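A hedged sketch of selecting the device at runtime, falling back to the CPU when no CUDA device is available (assumes the torch backend is installed):

```r
# Sketch: pick "cuda" when available, otherwise fall back to "cpu"
if (torch::torch_is_installed()) {
  library(cito)
  dev <- if (torch::cuda_is_available()) "cuda" else "cpu"
  fit <- dnn(Sepal.Length ~ ., data = datasets::iris, device = dev)
}
```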

### Value

an S3 object of class "cito.dnn" is returned. It is a list containing everything there is to know about the model and its training process. The list consists of the following attributes:

- `net`: an object of class "nn_sequential" "nn_module"; it originates from the torch package and represents the core object of this workflow
- `call`: the original function call
- `loss`: a list containing relevant information about the target variable and the loss function used
- `data`: the data used for training the model
- `weights`: list of weights for each training epoch
- `use_model_epoch`: integer defining which model from which training epoch should be used for prediction
- `loaded_model_epoch`: integer showing which epoch's model is currently loaded into model$net
- `model_properties`: a list of properties of the neural network: number of input nodes, number of output nodes, size of the hidden layers, activation functions, whether bias is included, and whether dropout layers are included
- `training_properties`: a list of all training parameters used the last time the model was trained: learning rate, learning rate scheduler, optimizer, number of epochs, whether early stopping was used, whether plotting was active, lambda and alpha for L1/L2 regularization, batchsize, shuffle, whether the data set was split into validation and training, the formula used for training, and the epoch at which training stopped
- `losses`: a data.frame containing training and validation losses of each epoch

### See Also

predict.citodnn, plot.citodnn, coef.citodnn, print.citodnn, summary.citodnn, continue_training, analyze_training, PDP, ALE

### Examples

if(torch::torch_is_installed()){
library(cito)

set.seed(222)
validation_set <- sample(c(1:nrow(datasets::iris)), 25)

# Build and train network
nn.fit <- dnn(Sepal.Length ~ ., data = datasets::iris[-validation_set, ])

# Structure of the neural network
print(nn.fit)

# Use model on validation set
predictions <- predict(nn.fit, iris[validation_set, ])

# Scatterplot
plot(iris[validation_set, ]$Sepal.Length, predictions)
# MAE
mean(abs(predictions - iris[validation_set, ]$Sepal.Length))

# Get variable importances
summary(nn.fit)

# Partial dependencies
PDP(nn.fit, variable = "Petal.Length")

# Accumulated local effect plots
ALE(nn.fit, variable = "Petal.Length")

}



[Package cito version 1.0.0 Index]