tseq2feature_seq2seq {ProcData} | R Documentation |
Feature Extraction by time sequence autoencoder
Description
tseq2feature_seq2seq extracts features from the timestamps of action sequences using a sequence autoencoder.
Usage
tseq2feature_seq2seq(tseqs, K, cumulative = FALSE, log = TRUE,
rnn_type = "lstm", n_epoch = 50, method = "last",
step_size = 1e-04, optimizer_name = "rmsprop", samples_train,
samples_valid, samples_test = NULL, pca = TRUE, verbose = TRUE,
return_theta = TRUE)
Arguments
tseqs |
a list of timestamp sequences. |
K |
the number of features to be extracted. |
cumulative |
logical. If TRUE, the sequence of cumulative time up to each event is used as input to the neural network. If FALSE, the sequence of inter-arrival time (gap time between an event and the previous event) will be used as input to the neural network. Default is FALSE. |
log |
logical. If TRUE, the input of the neural network is the base-10 log of the original sequence of times plus 1 (i.e., log10(t+1)). If FALSE, the original sequence of times is used. Default is TRUE. |
rnn_type |
the type of recurrent unit to be used for modeling
response processes. |
n_epoch |
the number of training epochs for the autoencoder. |
method |
the method for computing features from the output of the recurrent neural network in the encoder. Available options are "last" and "avg". |
step_size |
the learning rate of the optimizer. |
optimizer_name |
a character string specifying the optimizer to be used
for training. Available options are |
samples_train |
a vector of indices specifying the training set used for training the autoencoder. |
samples_valid |
a vector of indices specifying the validation set used for training the autoencoder. |
samples_test |
a vector of indices specifying the test set used for training the autoencoder. Default is NULL. |
pca |
logical. If TRUE, the principal components of features are returned. Default is TRUE. |
verbose |
logical. If TRUE, training progress is printed. |
return_theta |
logical. If TRUE, extracted features are returned. |
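The cumulative and log arguments describe simple transformations of each timestamp sequence before it enters the network. The following base R sketch illustrates them on a made-up sequence; the vector tseq is hypothetical, and treating the first event's gap as 0 is an assumption, not necessarily what the package does internally.

```r
# Hypothetical timestamp sequence; the first event occurs at time 0
tseq <- c(0, 1.2, 3.5, 9.0)

# cumulative = TRUE: the cumulative times themselves are the input
cum_times <- tseq

# cumulative = FALSE: inter-arrival times (gap to the previous event);
# assumption: the first event is given a gap of 0
gaps <- c(0, diff(tseq))

# log = TRUE: base-10 log of (time + 1), applied elementwise
log_gaps <- log10(gaps + 1)
log_cum <- log10(cum_times + 1)

print(gaps)
print(log_gaps)
```

Adding 1 before taking the log keeps a time of 0 mapped to 0 and avoids log10(0).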
Details
This function trains a sequence-to-sequence autoencoder using keras. The encoder of the autoencoder consists of a recurrent neural network. The decoder consists of another recurrent neural network and a fully connected layer with ReLU activation. The outputs of the encoder are the extracted features.
The output of the encoder is a function of the encoder recurrent neural network. It is the last latent state of the encoder recurrent neural network if method="last" and the average of the encoder recurrent neural network latent states if method="avg".
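The difference between the two method options can be illustrated without keras. In the sketch below, H is a hypothetical matrix of encoder latent states for one sequence (one row per time step, one column per latent dimension); it is random here and only stands in for what a trained encoder would produce.

```r
set.seed(1)
n_steps <- 4   # number of events in the (hypothetical) sequence
K <- 3         # number of latent dimensions, i.e. features

# Hypothetical latent states: row t is the encoder state after event t
H <- matrix(rnorm(n_steps * K), nrow = n_steps, ncol = K)

# method = "last": the feature vector is the final latent state
feat_last <- H[n_steps, ]

# method = "avg": the feature vector is the average over all time steps
feat_avg <- colMeans(H)

print(feat_last)
print(feat_avg)
```

Both options yield a vector of length K per sequence; "avg" pools information from every time step, while "last" relies on the recurrent unit to carry it forward.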
Value
tseq2feature_seq2seq
returns a list containing
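theta |
a matrix containing the extracted features. |
train_loss |
a vector of length n_epoch recording the training loss at each epoch. |
valid_loss |
a vector of length n_epoch recording the validation loss at each epoch. |
test_loss |
a vector of length n_epoch recording the test loss at each epoch. |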
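When pca = TRUE, the returned features are principal components rather than raw encoder outputs. A rough sketch of that kind of rotation, using prcomp from base R on a hypothetical feature matrix, is shown below; whether the package uses prcomp or the same centering and scaling choices is an assumption.

```r
set.seed(1)
n <- 50   # number of sequences
K <- 5    # number of features

# Hypothetical raw features: one row per sequence, one column per feature
theta_raw <- matrix(rnorm(n * K), nrow = n, ncol = K)

# Rotate to principal components (centered, not rescaled)
pc <- prcomp(theta_raw, center = TRUE, scale. = FALSE)
theta_pc <- pc$x   # principal component scores, same dimensions as theta_raw

# The rotated columns are uncorrelated and ordered by decreasing variance
print(dim(theta_pc))
print(round(apply(theta_pc, 2, var), 3))
```

The rotation does not change how much information the K features carry; it only makes the columns uncorrelated and sorts them by variance.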
See Also
chooseK_seq2seq for choosing K through cross-validation.
Other feature extraction methods: aseq2feature_seq2seq, atseq2feature_seq2seq, seq2feature_mds_large, seq2feature_mds, seq2feature_ngram, seq2feature_seq2seq
Examples
# Run only if Python with TensorFlow is available (system() returns 0 on success)
if (!system("python -c 'import tensorflow as tf'", ignore.stdout = TRUE, ignore.stderr = TRUE)) {
  n <- 50
  data(cc_data)
  # draw n timestamp sequences at random
  samples <- sample(seq_along(cc_data$seqs$time_seqs), n)
  tseqs <- cc_data$seqs$time_seqs[samples]
  # extract K = 5 features; sequences 1-40 for training, 41-50 for validation
  time_seq2seq_res <- tseq2feature_seq2seq(tseqs, 5, rnn_type = "lstm", n_epoch = 5,
                                           samples_train = 1:40, samples_valid = 41:50)
  features <- time_seq2seq_res$theta
  # plot the training (blue) and validation (red) loss traces
  plot(time_seq2seq_res$train_loss, col = "blue", type = "l",
       ylim = range(c(time_seq2seq_res$train_loss, time_seq2seq_res$valid_loss)))
  lines(time_seq2seq_res$valid_loss, col = "red", type = "l")
}
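The example above uses fixed index ranges for the training and validation sets and no test set. One possible way to build three disjoint index vectors at random is sketched below; the 30/10/10 split sizes are arbitrary illustrative choices, not a recommendation from the package.

```r
set.seed(123)
n <- 50   # number of sequences in tseqs

# Shuffle the indices 1..n once, then cut the permutation into three parts
idx <- sample(seq_len(n))
samples_train <- idx[1:30]
samples_valid <- idx[31:40]
samples_test <- idx[41:50]

# The three sets are disjoint and together cover every sequence
print(length(unique(c(samples_train, samples_valid, samples_test))))
```

These vectors can then be supplied as the samples_train, samples_valid and samples_test arguments of tseq2feature_seq2seq.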