nn_rnn {torch} R Documentation
RNN module
Description
Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.
Usage
nn_rnn(
input_size,
hidden_size,
num_layers = 1,
nonlinearity = NULL,
bias = TRUE,
batch_first = FALSE,
dropout = 0,
bidirectional = FALSE,
...
)
Arguments
input_size
The number of expected features in the input x.

hidden_size
The number of features in the hidden state h.

num_layers
Number of recurrent layers. E.g., setting num_layers = 2 would mean stacking two RNNs together to form a stacked RNN, with the second RNN taking in outputs of the first RNN and computing the final results. Default: 1.

nonlinearity
The non-linearity to use. Can be either "tanh" or "relu". The default (NULL) falls back to "tanh".

bias
If FALSE, then the layer does not use bias weights b_ih and b_hh. Default: TRUE.

batch_first
If TRUE, then the input and output tensors are provided as (batch, seq, feature). Default: FALSE.

dropout
If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0.

bidirectional
If TRUE, becomes a bidirectional RNN. Default: FALSE.

...
Other arguments that can be passed to the super class.
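As a quick sketch of how these arguments combine (the values below are illustrative, not defaults):

library(torch)
# Two stacked ReLU layers with inter-layer dropout; all values illustrative.
rnn <- nn_rnn(
  input_size = 10, hidden_size = 20, num_layers = 2,
  nonlinearity = "relu", dropout = 0.2
)
rnn  # printing the module summarizes its configuration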
Details
For each element in the input sequence, each layer computes the following function:
h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})
where h_t is the hidden state at time t, x_t is
the input at time t, and h_{(t-1)} is the hidden state of the
previous layer at time t-1 or the initial hidden state at time 0.
If nonlinearity is 'relu', then ReLU is used instead of tanh.
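As a minimal sketch of this recurrence for a single layer and a single time step, using hand-made weights rather than a module's own parameters (all tensor sizes below are illustrative):

library(torch)
input_size <- 4
hidden_size <- 3
w_ih <- torch_randn(hidden_size, input_size)
w_hh <- torch_randn(hidden_size, hidden_size)
b_ih <- torch_zeros(hidden_size)
b_hh <- torch_zeros(hidden_size)
x_t <- torch_randn(input_size)      # input at time t
h_prev <- torch_zeros(hidden_size)  # initial hidden state at time 0
# h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)
h_t <- torch_tanh(torch_mv(w_ih, x_t) + b_ih + torch_mv(w_hh, h_prev) + b_hh)
h_t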
Inputs
- input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable-length sequence.
- h_0 of shape (num_layers * num_directions, batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
Outputs
- output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the RNN, for each t. If an nn_packed_sequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output$view(c(seq_len, batch, num_directions, hidden_size)), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.
- h_n of shape (num_layers * num_directions, batch, hidden_size): tensor containing the hidden state for t = seq_len. Like output, the layers can be separated using h_n$view(c(num_layers, num_directions, batch, hidden_size)).
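A small shape check consistent with the descriptions above (this assumes, as in the Examples section, that calling the module returns a list holding output and h_n):

library(torch)
rnn <- nn_rnn(input_size = 10, hidden_size = 20, num_layers = 2)
input <- torch_randn(5, 3, 10)  # (seq_len, batch, input_size)
out <- rnn(input)               # h_0 defaults to zeros when omitted
out[[1]]$shape                  # output: 5 3 20
out[[2]]$shape                  # h_n: 2 3 20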
Shape
Input 1: (L, N, H_in) tensor containing input features, where H_in = input_size and L represents a sequence length.
Input 2: (S, N, H_out) tensor containing the initial hidden state for each element in the batch, where H_out = hidden_size and S = num_layers * num_directions. Defaults to zero if not provided. If the RNN is bidirectional, num_directions should be 2, else it should be 1.
Output 1: (L, N, H_all), where H_all = num_directions * hidden_size.
Output 2: (S, N, H_out) tensor containing the next hidden state for each element in the batch.
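The bidirectional case can be checked the same way; a sketch (bidirectional = TRUE makes num_directions equal to 2, per the description above):

library(torch)
birnn <- nn_rnn(10, 20, num_layers = 1, bidirectional = TRUE)
x <- torch_randn(7, 3, 10)  # (L, N, H_in)
res <- birnn(x)
res[[1]]$shape  # 7 3 40, since H_all = num_directions * hidden_size
res[[2]]$shape  # 2 3 20, since S = num_layers * num_directions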
Attributes
- weight_ih_l[k]: the learnable input-hidden weights of the k-th layer, of shape (hidden_size, input_size) for k = 0. Otherwise, the shape is (hidden_size, num_directions * hidden_size).
- weight_hh_l[k]: the learnable hidden-hidden weights of the k-th layer, of shape (hidden_size, hidden_size).
- bias_ih_l[k]: the learnable input-hidden bias of the k-th layer, of shape (hidden_size).
- bias_hh_l[k]: the learnable hidden-hidden bias of the k-th layer, of shape (hidden_size).
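Because the exact layer-name suffixes may vary between versions, it is safer to inspect the parameters than to hard-code their names; a minimal sketch:

library(torch)
rnn <- nn_rnn(10, 20, num_layers = 2)
# Named list of learnable tensors; print each parameter's shape.
str(lapply(rnn$parameters, function(p) p$shape))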
Note
All the weights and biases are initialized from \mathcal{U}(-\sqrt{k}, \sqrt{k}), where k = 1 / hidden_size.
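A quick empirical check of this bound (a sketch; it only verifies that no freshly initialized parameter exceeds sqrt(k) in magnitude):

library(torch)
rnn <- nn_rnn(10, 20)  # hidden_size = 20, so k = 1/20
bound <- sqrt(1 / 20)
# Largest absolute value across all weights and biases; expected <= bound.
max(sapply(rnn$parameters, function(p) as.numeric(p$abs()$max())))
bound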
Examples
if (torch_is_installed()) {
  rnn <- nn_rnn(10, 20, 2)        # input_size = 10, hidden_size = 20, num_layers = 2
  input <- torch_randn(5, 3, 10)  # (seq_len, batch, input_size)
  h0 <- torch_randn(2, 3, 20)     # (num_layers * num_directions, batch, hidden_size)
  rnn(input, h0)                  # returns a list holding output and h_n
}
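A hedged variant with batch-first layout (per the batch_first argument, the input and output tensors then use (batch, seq, feature) ordering):

if (torch_is_installed()) {
  rnn_bf <- nn_rnn(10, 20, 2, batch_first = TRUE)
  input_bf <- torch_randn(3, 5, 10)  # (batch, seq_len, input_size)
  rnn_bf(input_bf)                   # output is (3, 5, 20); h_n stays (2, 3, 20)
}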