nn_rnn {torch}    R Documentation
RNN module
Description
Applies a multi-layer Elman RNN with \tanh or \mbox{ReLU} non-linearity to an input sequence.
Usage
nn_rnn(
input_size,
hidden_size,
num_layers = 1,
nonlinearity = NULL,
bias = TRUE,
batch_first = FALSE,
dropout = 0,
bidirectional = FALSE,
...
)
Arguments
input_size
The number of expected features in the input x.

hidden_size
The number of features in the hidden state h.

num_layers
Number of recurrent layers. E.g., setting num_layers = 2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1.

nonlinearity
The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'.

bias
If FALSE, then the layer does not use bias weights b_ih and b_hh. Default: TRUE.

batch_first
If TRUE, then the input and output tensors are provided as (batch, seq, feature). Default: FALSE. (See the sketch after this list.)

dropout
If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0.

bidirectional
If TRUE, becomes a bidirectional RNN. Default: FALSE.

...
other arguments that can be passed to the super class.
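To make batch_first concrete, here is a minimal sketch with arbitrary, hypothetical dimensions; it assumes the module's forward call returns a list whose first element is the output tensor:

library(torch)

# batch_first = TRUE: tensors are (batch, seq, feature) instead of (seq, batch, feature)
rnn_bf <- nn_rnn(input_size = 10, hidden_size = 20, batch_first = TRUE)
input <- torch_randn(3, 5, 10) # (batch = 3, seq_len = 5, input_size = 10)
res <- rnn_bf(input)
res[[1]]$shape # output: (3, 5, 20), i.e. (batch, seq_len, hidden_size)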
Details
For each element in the input sequence, each layer computes the following function:
h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})
where h_t is the hidden state at time t, x_t is the input at time t, and h_{(t-1)} is the hidden state of the previous layer at time t-1 or the initial hidden state at time 0. If nonlinearity is 'relu', then \mbox{ReLU} is used instead of \tanh.
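As a minimal sketch of this recurrence, a single time step can be computed directly with tensor operations. Dimensions are chosen arbitrarily, and the weight tensors below are randomly initialized stand-ins for W_ih, W_hh, b_ih, and b_hh, not parameters taken from an actual module:

library(torch)

input_size <- 4
hidden_size <- 3

# Hypothetical stand-ins for the module's learnable parameters
W_ih <- torch_randn(hidden_size, input_size)
W_hh <- torch_randn(hidden_size, hidden_size)
b_ih <- torch_randn(hidden_size)
b_hh <- torch_randn(hidden_size)

x_t <- torch_randn(input_size)     # input at time t
h_prev <- torch_zeros(hidden_size) # initial hidden state at time 0

# One step of the Elman recurrence shown above
h_t <- torch_tanh(torch_matmul(W_ih, x_t) + b_ih + torch_matmul(W_hh, h_prev) + b_hh)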
Inputs
- input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence.
- h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
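A quick sketch verifying that omitting h_0 behaves like passing an explicit zero hidden state; it assumes the forward call returns a list of (output, h_n):

library(torch)

rnn <- nn_rnn(10, 20, 2)
input <- torch_randn(5, 3, 10)

# Omitting h_0 should match passing zeros explicitly
res_default <- rnn(input)
res_zeros <- rnn(input, torch_zeros(2, 3, 20))
torch_allclose(res_default[[1]], res_zeros[[1]]) # TRUE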
Outputs
- output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If an nn_packed_sequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output$view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case (see the sketch after this list).
- h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len. Like output, the layers can be separated using h_n$view(num_layers, num_directions, batch, hidden_size).
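To illustrate separating directions in the unpacked case, a sketch with a bidirectional RNN. Note that R tensors use 1-based indexing, so the forward and backward directions land in slices 1 and 2; dimensions are arbitrary:

library(torch)

seq_len <- 5; batch <- 3; hidden_size <- 20

rnn <- nn_rnn(10, hidden_size, bidirectional = TRUE)
res <- rnn(torch_randn(seq_len, batch, 10))
output <- res[[1]] # (seq_len, batch, 2 * hidden_size)

# Split the last dimension into (num_directions, hidden_size)
dirs <- output$view(c(seq_len, batch, 2, hidden_size))
forward <- dirs[, , 1, ]  # forward direction
backward <- dirs[, , 2, ] # backward direction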
Shape
- Input1: (L, N, H_{in}) tensor containing input features, where H_{in} = \mbox{input\_size} and L represents a sequence length.
- Input2: (S, N, H_{out}) tensor containing the initial hidden state for each element in the batch, where H_{out} = \mbox{hidden\_size} and S = \mbox{num\_layers} * \mbox{num\_directions}. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
- Output1: (L, N, H_{all}) where H_{all} = \mbox{num\_directions} * \mbox{hidden\_size}.
- Output2: (S, N, H_{out}) tensor containing the next hidden state for each element in the batch.
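A short check of these shapes under assumed dimensions (unidirectional, so num_directions is 1):

library(torch)

L <- 7; N <- 2; input_size <- 6; hidden_size <- 4; num_layers <- 3

rnn <- nn_rnn(input_size, hidden_size, num_layers = num_layers)
res <- rnn(torch_randn(L, N, input_size))
res[[1]]$shape # (L, N, num_directions * hidden_size) = (7, 2, 4)
res[[2]]$shape # (num_layers * num_directions, N, hidden_size) = (3, 2, 4)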
Attributes
- weight_ih_l[k]: the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, num_directions * hidden_size).
- weight_hh_l[k]: the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size).
- bias_ih_l[k]: the learnable input-hidden bias of the k-th layer, of shape (hidden_size).
- bias_hh_l[k]: the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size).
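The learnable parameters and their shapes can be inspected through the module's parameters field (a named list on nn_module objects; exact parameter names may vary between versions):

library(torch)

rnn <- nn_rnn(10, 20, 2)
# Named list of parameters; shapes follow the layout described above
str(lapply(rnn$parameters, function(p) p$shape))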
Note
All the weights and biases are initialized from \mathcal{U}(-\sqrt{k}, \sqrt{k}) where k = \frac{1}{\mbox{hidden\_size}}.
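A rough empirical sketch of this initialization range, assuming parameters are exposed via rnn$parameters and that scalar tensors convert with as.numeric:

library(torch)

hidden_size <- 20
rnn <- nn_rnn(10, hidden_size)
bound <- sqrt(1 / hidden_size)

# Every parameter should fall within (-bound, bound)
all(sapply(rnn$parameters, function(p) {
  as.numeric(p$min()) >= -bound && as.numeric(p$max()) <= bound
}))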
Examples
if (torch_is_installed()) {
# RNN with input_size = 10, hidden_size = 20, num_layers = 2
rnn <- nn_rnn(10, 20, 2)
# input: (seq_len = 5, batch = 3, input_size = 10)
input <- torch_randn(5, 3, 10)
# h0: (num_layers * num_directions = 2, batch = 3, hidden_size = 20)
h0 <- torch_randn(2, 3, 20)
rnn(input, h0)
}