atseq2feature_seq2seq {ProcData} | R Documentation |
Feature Extraction by action and time sequence autoencoder
Description
atseq2feature_seq2seq
extracts features from action and timestamp sequences using a
sequence autoencoder.
Usage
atseq2feature_seq2seq(atseqs, K, weights = c(1, 0.5),
cumulative = FALSE, log = TRUE, rnn_type = "lstm", n_epoch = 50,
method = "last", step_size = 1e-04, optimizer_name = "rmsprop",
samples_train, samples_valid, samples_test = NULL, pca = TRUE,
verbose = TRUE, return_theta = TRUE)
Arguments
atseqs |
a list of two elements, first element is the list of |
K |
the number of features to be extracted. |
weights |
a vector of 2 elements for the weight of the loss of action sequences (categorical_crossentropy) and time sequences (mean squared error), respectively. The total loss is calculated as the weighted sum of the two losses. |
cumulative |
logical. If TRUE, the sequence of cumulative time up to each event is used as input to the neural network. If FALSE, the sequence of inter-arrival time (gap time between an event and the previous event) will be used as input to the neural network. Default is FALSE. |
log |
logical. If TRUE, for the timestamp sequences, input of the neural net is the base-10 log of the original sequence of times plus 1 (i.e., log10(t+1)). If FALSE, the original sequence of times is used. |
rnn_type |
the type of recurrent unit to be used for modeling
response processes. |
n_epoch |
the number of training epochs for the autoencoder. |
method |
the method for computing features from the output of a
recurrent neural network in the encoder. Available options are
"last" and "avg" (see Details). |
step_size |
the learning rate of optimizer. |
optimizer_name |
a character string specifying the optimizer to be used
for training. Available options are |
samples_train |
vectors of indices specifying the training, validation and test sets for training the autoencoder. |
samples_valid |
vectors of indices specifying the training, validation and test sets for training the autoencoder. |
samples_test |
vectors of indices specifying the training, validation and test sets for training the autoencoder. |
pca |
logical. If TRUE, the principal components of features are returned. Default is TRUE. |
verbose |
logical. If TRUE, training progress is printed. |
return_theta |
logical. If TRUE, extracted features are returned. |
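The effect of the cumulative and log arguments on a single timestamp sequence can be illustrated with a small base R sketch. The helper transform_times is hypothetical and not part of ProcData; it assumes the stored timestamps are cumulative times since the start of the process, and the package's internal preprocessing may differ in detail.

```r
# Hypothetical helper (not part of ProcData): mimics the documented
# `cumulative` and `log` options for one timestamp sequence.
# Assumption: `times` holds cumulative timestamps.
transform_times <- function(times, cumulative = FALSE, log = TRUE) {
  if (!cumulative) {
    # Inter-arrival times: gap between each event and the previous event.
    times <- c(times[1], diff(times))
  }
  if (log) {
    # Base-10 log of the times plus 1, as described for `log = TRUE`.
    times <- log10(times + 1)
  }
  times
}

ts <- c(0, 9, 99, 108)
transform_times(ts, cumulative = TRUE,  log = FALSE)  # 0 9 99 108
transform_times(ts, cumulative = FALSE, log = FALSE)  # 0 9 90 9
transform_times(ts, cumulative = FALSE, log = TRUE)   # log10 of the gaps plus 1
```

With the defaults (cumulative = FALSE, log = TRUE), long pauses are compressed, which keeps a few very large gaps from dominating the mean squared error on times.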
Details
This function trains a sequence-to-sequence autoencoder using keras. The encoder of the autoencoder consists of a recurrent neural network. The decoder consists of another recurrent neural network followed by a fully connected layer with softmax activation for actions and another fully connected layer with ReLU activation for times. The outputs of the encoder are the extracted features.
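As a rough illustration of how the weights argument combines the two decoder losses, the following base R sketch computes a weighted sum of categorical cross-entropy and mean squared error for one sequence. The helper weighted_seq_loss and its argument names are hypothetical; the actual keras computation (batching, padding, reduction) is not shown in this page and may differ.

```r
# Hypothetical helper (not ProcData or keras code): weighted sum of the
# action loss and the time loss for a single sequence.
weighted_seq_loss <- function(action_true, action_prob, time_true, time_pred,
                              weights = c(1, 0.5)) {
  # action_true: integer indices of the observed actions, one per step.
  # action_prob: steps x actions matrix of predicted probabilities
  #              (rows play the role of the softmax output).
  p_true <- action_prob[cbind(seq_along(action_true), action_true)]
  # Categorical cross-entropy: mean negative log-probability of the true action.
  ce <- -mean(log(p_true))
  # Mean squared error between observed and predicted times.
  mse <- mean((time_true - time_pred)^2)
  weights[1] * ce + weights[2] * mse
}

probs <- rbind(c(0.5, 0.5), c(0.25, 0.75))
weighted_seq_loss(c(1, 2), probs, time_true = c(1, 2), time_pred = c(1, 4))
```

Raising the second weight puts more emphasis on reconstructing times relative to actions.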
The output of the encoder is a function of the encoder recurrent neural network.
It is the last latent state of the encoder recurrent neural network if method="last"
and the average of the encoder recurrent neural network latent states if method="avg".
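The difference between the two options can be sketched in base R, assuming the encoder latent states for one sequence are available as a matrix with one row per time step and K columns. The helper summarize_states is hypothetical and only illustrates the two summaries described above.

```r
# Hypothetical helper (not part of ProcData): reduce a steps x K matrix of
# encoder latent states to a single feature vector of length K.
summarize_states <- function(states, method = c("last", "avg")) {
  method <- match.arg(method)
  if (method == "last") {
    # Latent state at the final time step.
    states[nrow(states), ]
  } else {
    # Average of the latent states over all time steps.
    colMeans(states)
  }
}

states <- rbind(c(0, 2), c(1, 4), c(5, 0))  # 3 steps, K = 2
summarize_states(states, "last")  # 5 0
summarize_states(states, "avg")   # 2 2
```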
Value
atseq2feature_seq2seq
returns a list containing
theta |
a matrix containing |
train_loss |
a vector of length |
valid_loss |
a vector of length |
test_loss |
a vector of length |
See Also
chooseK_seq2seq for choosing K through cross-validation.
Other feature extraction methods: aseq2feature_seq2seq, seq2feature_mds_large, seq2feature_mds, seq2feature_ngram, seq2feature_seq2seq, tseq2feature_seq2seq
Examples
# Run only if Python with TensorFlow is available (system() returns 0 on success).
if (!system("python -c 'import tensorflow as tf'", ignore.stdout = TRUE, ignore.stderr = TRUE)) {
  n <- 50
  data(cc_data)
  # Draw a random subsample of n response processes.
  samples <- sample(seq_along(cc_data$seqs$time_seqs), n)
  atseqs <- sub_seqs(cc_data$seqs, samples)
  # Extract K = 5 features: sequences 1-40 for training, 41-50 for validation.
  action_and_time_seq2seq_res <- atseq2feature_seq2seq(atseqs, 5, rnn_type = "lstm", n_epoch = 5,
                                                       samples_train = 1:40, samples_valid = 41:50)
  features <- action_and_time_seq2seq_res$theta
  # Plot training loss (blue) and validation loss (red) by epoch.
  plot(action_and_time_seq2seq_res$train_loss, col = "blue", type = "l",
       ylim = range(c(action_and_time_seq2seq_res$train_loss,
                      action_and_time_seq2seq_res$valid_loss)))
  lines(action_and_time_seq2seq_res$valid_loss, col = "red", type = "l")
}