nn_rnn {torch} | R Documentation
RNN module
Description
Applies a multi-layer Elman RNN with tanh or ReLU
non-linearity to an input sequence.
Usage
nn_rnn(
input_size,
hidden_size,
num_layers = 1,
nonlinearity = NULL,
bias = TRUE,
batch_first = FALSE,
dropout = 0,
bidirectional = FALSE,
...
)
Arguments
input_size: The number of expected features in the input x.
hidden_size: The number of features in the hidden state h.
num_layers: Number of recurrent layers. E.g., setting num_layers = 2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1.
nonlinearity: The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'.
bias: If FALSE, then the layer does not use bias weights b_ih and b_hh. Default: TRUE.
batch_first: If TRUE, then the input and output tensors are provided as (batch, seq, feature). Default: FALSE.
dropout: If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0.
bidirectional: If TRUE, becomes a bidirectional RNN. Default: FALSE.
...: other arguments that can be passed to the super class.
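The arguments above can be combined freely; a minimal sketch (the sizes and values here are arbitrary illustrations, not defaults):

```r
library(torch)

if (torch_is_installed()) {
  # a 2-layer bidirectional ReLU RNN taking batch-first input,
  # with dropout applied between the two layers
  rnn <- nn_rnn(
    input_size = 8,
    hidden_size = 16,
    num_layers = 2,
    nonlinearity = "relu",
    batch_first = TRUE,
    dropout = 0.2,
    bidirectional = TRUE
  )

  x <- torch_randn(4, 7, 8)  # (batch, seq, feature) because batch_first = TRUE
  out <- rnn(x)              # h_0 defaults to zeros when omitted
  output <- out[[1]]         # shape (4, 7, 2 * 16): num_directions * hidden_size
  h_n <- out[[2]]            # shape (2 * 2, 4, 16): num_layers * num_directions
}
```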
Details
For each element in the input sequence, each layer computes the following function:

h_t = tanh(W_ih x_t + b_ih + W_hh h_(t-1) + b_hh)

where h_t is the hidden state at time t, x_t is the input at time t, and h_(t-1) is the hidden state of the previous layer at time t-1, or the initial hidden state at time 0.
If nonlinearity is 'relu', then ReLU is used instead of tanh.
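The recurrence can be checked by hand: the sketch below replays it with the module's own weights (assuming a single-layer, unidirectional module; parameters are looked up by name prefix rather than hard-coded, since the layer index base in the names can differ between versions):

```r
library(torch)

if (torch_is_installed()) {
  torch_manual_seed(1)

  rnn <- nn_rnn(input_size = 4, hidden_size = 3, num_layers = 1)
  x  <- torch_randn(5, 2, 4)  # (seq_len, batch, input_size)
  h0 <- torch_zeros(1, 2, 3)  # (num_layers, batch, hidden_size)

  out <- rnn(x, h0)
  output <- out[[1]]

  # fetch the single layer's parameters by name prefix
  p <- rnn$parameters
  w_ih <- p[[grep("^weight_ih", names(p))[1]]]
  w_hh <- p[[grep("^weight_hh", names(p))[1]]]
  b_ih <- p[[grep("^bias_ih", names(p))[1]]]
  b_hh <- p[[grep("^bias_hh", names(p))[1]]]

  # replay h_t = tanh(W_ih x_t + b_ih + W_hh h_(t-1) + b_hh)
  h <- h0[1, , ]
  for (t in 1:5) {
    h <- torch_tanh(x[t, , ]$matmul(w_ih$t()) + b_ih + h$matmul(w_hh$t()) + b_hh)
  }

  # the final manual step agrees with the last time step of output
  max_diff <- as.numeric((h - output[5, , ])$abs()$max())
}
```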
Inputs
- input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence.
- h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
Outputs
- output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If a packed sequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output$view(c(seq_len, batch, num_directions, hidden_size)), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.
- h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len. Like output, the layers can be separated using h_n$view(c(num_layers, num_directions, batch, hidden_size)).
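Separating the directions of an unpacked bidirectional output, as described above, might look like the following sketch (sizes are illustrative; note that R indexing is 1-based, so "direction 0" is index 1):

```r
library(torch)

if (torch_is_installed()) {
  seq_len <- 5; batch <- 3; hidden_size <- 20
  rnn <- nn_rnn(10, hidden_size, bidirectional = TRUE)
  x <- torch_randn(seq_len, batch, 10)

  out <- rnn(x)
  output <- out[[1]]  # shape (5, 3, 2 * 20)

  # reshape the last dimension into (num_directions, hidden_size)
  by_dir <- output$view(c(seq_len, batch, 2, hidden_size))
  forward  <- by_dir[, , 1, ]  # direction 0 above (R indexing is 1-based)
  backward <- by_dir[, , 2, ]  # direction 1
}
```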
Shape
Input1: (L, N, H_in) tensor containing input features, where H_in = input_size, N is the batch size, and L represents a sequence length.
Input2: (S, N, H_out) tensor containing the initial hidden state for each element in the batch, where S = num_layers * num_directions and H_out = hidden_size. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
Output1: (L, N, num_directions * H_out) tensor containing output features from the last layer.
Output2: (S, N, H_out) tensor containing the next hidden state for each element in the batch.
Attributes
- weight_ih_l[k]: the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, num_directions * hidden_size).
- weight_hh_l[k]: the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size).
- bias_ih_l[k]: the learnable input-hidden bias of the k-th layer, of shape (hidden_size).
- bias_hh_l[k]: the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size).
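These shapes can be inspected through the module's parameter list; a sketch (parameters are matched by name prefix rather than hard-coded, since the layer index base in the names can differ between versions):

```r
library(torch)

if (torch_is_installed()) {
  rnn <- nn_rnn(input_size = 10, hidden_size = 20, num_layers = 2)

  p <- rnn$parameters
  # match by prefix; the first match is the first layer's weight
  w_ih_first <- p[[grep("^weight_ih", names(p))[1]]]
  w_hh_first <- p[[grep("^weight_hh", names(p))[1]]]

  w_ih_first$shape  # (hidden_size, input_size)  = (20, 10)
  w_hh_first$shape  # (hidden_size, hidden_size) = (20, 20)
}
```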
Note
All the weights and biases are initialized from U(-sqrt(k), sqrt(k)), where k = 1 / hidden_size.
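This initialization can be verified empirically; a sketch:

```r
library(torch)

if (torch_is_installed()) {
  hidden_size <- 20
  rnn <- nn_rnn(input_size = 10, hidden_size = hidden_size)

  # every weight and bias should lie within [-sqrt(k), sqrt(k)], k = 1 / hidden_size
  bound <- sqrt(1 / hidden_size)
  within_bound <- vapply(
    rnn$parameters,
    function(p) as.numeric(p$abs()$max()) <= bound,
    logical(1)
  )
}
```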
Examples
if (torch_is_installed()) {
rnn <- nn_rnn(10, 20, 2)
input <- torch_randn(5, 3, 10)
h0 <- torch_randn(2, 3, 20)
rnn(input, h0)
}