nn_bce_with_logits_loss {torch} | R Documentation
BCE with logits loss
Description
This loss combines a Sigmoid layer and the BCELoss in a single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss because, by combining the two operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.
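As a small illustration of that stability (a sketch, not part of the package's own examples; the extreme logit -200 is an arbitrary choice): with a plain Sigmoid followed by a BCELoss the probability underflows in float32 and the loss saturates, while the fused loss recovers the exact value.

if (torch_is_installed()) {
  x <- torch_tensor(-200) # an extreme negative logit
  y <- torch_tensor(1)
  # Plain pipeline: sigmoid(-200) underflows to 0 in float32; BCELoss clamps
  # the log term, so the loss saturates instead of reporting the true value.
  nn_bce_loss()(torch_sigmoid(x), y)
  # Fused version: computed via the log-sum-exp trick, returns the exact 200.
  nn_bce_with_logits_loss()(x, y)
}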
Usage
nn_bce_with_logits_loss(weight = NULL, reduction = "mean", pos_weight = NULL)
Arguments
weight
(Tensor, optional): a manual rescaling weight given to the loss of each batch element. If given, has to be a Tensor of size nbatch.
reduction
(string, optional): Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed.
pos_weight
(Tensor, optional): a weight of positive examples. Must be a vector with length equal to the number of classes.
Details
The unreduced (i.e. with reduction set to 'none') loss can be described as:

\ell(x, y) = L = \{l_1, \ldots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \cdot \log \sigma(x_n) + (1 - y_n) \cdot \log(1 - \sigma(x_n)) \right]

where N is the batch size. If reduction is not 'none' (default 'mean'), then

\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{'mean'} \\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'} \end{cases}
This is used for measuring the error of a reconstruction in, for example, an auto-encoder. Note that the targets y_n should be numbers between 0 and 1.
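A quick sketch checking the unreduced formula against the module directly (with w_n = 1 and the elementwise terms written out by hand):

if (torch_is_installed()) {
  x <- torch_randn(4)              # logits
  y <- torch_tensor(c(1, 0, 1, 0)) # binary targets
  # Elementwise loss written out from the definition above (w_n = 1)
  manual <- -(y * torch_log(torch_sigmoid(x)) +
    (1 - y) * torch_log(1 - torch_sigmoid(x)))
  unreduced <- nn_bce_with_logits_loss(reduction = "none")
  unreduced(x, y) # matches `manual` up to floating-point error
}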
It's possible to trade off recall and precision by adding weights to positive examples.
In the case of multi-label classification the loss can be described as:

\ell_c(x, y) = L_c = \{l_{1,c}, \ldots, l_{N,c}\}^\top, \quad l_{n,c} = -w_{n,c} \left[ p_c y_{n,c} \cdot \log \sigma(x_{n,c}) + (1 - y_{n,c}) \cdot \log(1 - \sigma(x_{n,c})) \right]

where c is the class number (c > 1 for multi-label binary classification, c = 1 for single-label binary classification), n is the index of the sample in the batch, and p_c is the weight of the positive answer for class c.

p_c > 1 increases the recall, p_c < 1 increases the precision.
For example, if a dataset contains 100 positive and 300 negative examples of a single class, then pos_weight for the class should be equal to 300/100 = 3. The loss would then act as if the dataset contains 3 × 100 = 300 positive examples.
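In code, that recipe is just the negative-to-positive ratio (a sketch; the counts n_pos and n_neg are made-up values matching the example above):

if (torch_is_installed()) {
  n_pos <- 100 # positive examples of the single class
  n_neg <- 300 # negative examples of the single class
  pos_weight <- torch_tensor(n_neg / n_pos) # 300 / 100 = 3
  criterion <- nn_bce_with_logits_loss(pos_weight = pos_weight)
}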
Shape
Input: (N, *) where * means any number of additional dimensions.
Target: (N, *), same shape as the input.
Output: scalar. If reduction is 'none', then (N, *), same shape as the input.
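For instance (a sketch with an arbitrary shape (N, *) = (8, 4, 5)):

if (torch_is_installed()) {
  unreduced <- nn_bce_with_logits_loss(reduction = "none")
  input <- torch_randn(8, 4, 5) # (N, *) = (8, 4, 5)
  target <- torch_rand(8, 4, 5) # same shape, values in [0, 1]
  unreduced(input, target)$shape # 8 4 5: same shape as the input
}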
Examples
if (torch_is_installed()) {
  # Basic usage: logits in, binary targets, backpropagate through the loss
  loss <- nn_bce_with_logits_loss()
  input <- torch_randn(3, requires_grad = TRUE)
  target <- torch_empty(3)$random_(1, 2)
  output <- loss(input, target)
  output$backward()

  # Weighting positive examples via pos_weight
  target <- torch_ones(10, 64, dtype = torch_float32()) # 64 classes, batch size = 10
  output <- torch_full(c(10, 64), 1.5) # A prediction (logit)
  pos_weight <- torch_ones(64) # All weights are equal to 1
  criterion <- nn_bce_with_logits_loss(pos_weight = pos_weight)
  criterion(output, target) # -log(sigmoid(1.5))
}