layer_attention {keras}    R Documentation
Dot-product attention layer, a.k.a. Luong-style attention
Description
Dot-product attention layer, a.k.a. Luong-style attention
Usage
layer_attention(
inputs,
use_scale = FALSE,
score_mode = "dot",
...,
dropout = NULL
)
Arguments
inputs
List of the following tensors: a query tensor of shape [batch_size, Tq, dim], a value tensor of shape [batch_size, Tv, dim], and an optional key tensor of shape [batch_size, Tv, dim]. If no key tensor is given, value is used as both key and value, which is the most common case.
use_scale
If TRUE, will create a scalar variable to scale the attention scores.
score_mode
Function to use to compute attention scores, one of "dot" or "concat". "dot" refers to the dot product between the query and key vectors; "concat" refers to the hyperbolic tangent of the concatenation of the query and key vectors.
...
Standard layer arguments (e.g., batch_size, dtype, name, trainable, weights).
dropout
Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to 0.0.
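A minimal usage sketch, assuming the keras R package is attached; the sequence lengths (Tq = 8, Tv = 12), the feature dimension (64), and the layer names are illustrative choices rather than values taken from this documentation:

library(keras)

# Query sequence: [batch_size, Tq, dim]; value sequence: [batch_size, Tv, dim]
# (Tq = 8, Tv = 12, dim = 64 are assumptions made for this sketch).
query_input <- layer_input(shape = c(8, 64), name = "query")
value_input <- layer_input(shape = c(12, 64), name = "value")

# Luong-style attention over the two sequences; with only two inputs,
# the value tensor also serves as the key.
attended <- layer_attention(
  inputs = list(query_input, value_input),
  use_scale = TRUE,   # learn a scalar that scales the attention scores
  score_mode = "dot",
  dropout = 0.1       # fraction of attention scores to drop during training
)

model <- keras_model(
  inputs = list(query_input, value_input),
  outputs = attended  # shape [batch_size, Tq, dim]
)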
Details
inputs are a query tensor of shape [batch_size, Tq, dim], a value tensor
of shape [batch_size, Tv, dim], and a key tensor of shape
[batch_size, Tv, dim]. The calculation follows these steps:
1. Calculate scores with shape [batch_size, Tq, Tv] as a query-key dot product: scores = tf$matmul(query, key, transpose_b = TRUE).
2. Use scores to calculate a distribution with shape [batch_size, Tq, Tv]: distribution = tf$nn$softmax(scores).
3. Use distribution to create a linear combination of value with shape [batch_size, Tq, dim]: return tf$matmul(distribution, value).
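The three steps above can be reproduced directly with the tensorflow R package. This is a sketch under assumed toy shapes (batch_size = 2, Tq = 3, Tv = 4, dim = 5), not the layer's internal implementation:

library(tensorflow)

# Toy tensors; the shapes are assumptions chosen for the sketch.
query <- tf$random$normal(shape(2, 3, 5))  # [batch_size, Tq, dim]
value <- tf$random$normal(shape(2, 4, 5))  # [batch_size, Tv, dim]
key   <- value                             # common case: key defaults to value

# Step 1: query-key dot product -> scores with shape [batch_size, Tq, Tv]
scores <- tf$matmul(query, key, transpose_b = TRUE)

# Step 2: softmax over the last axis -> distribution with shape [batch_size, Tq, Tv]
distribution <- tf$nn$softmax(scores)

# Step 3: linear combination of value -> output with shape [batch_size, Tq, dim]
output <- tf$matmul(distribution, value)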
See Also
Other core layers:
layer_activation(),
layer_activity_regularization(),
layer_dense(),
layer_dense_features(),
layer_dropout(),
layer_flatten(),
layer_input(),
layer_lambda(),
layer_masking(),
layer_permute(),
layer_repeat_vector(),
layer_reshape()