| tseq2feature_seq2seq {ProcData} | R Documentation | 
Feature Extraction by time sequence autoencoder
Description
tseq2feature_seq2seq extracts features from the timestamps of action sequences using a
sequence autoencoder.
Usage
tseq2feature_seq2seq(tseqs, K, cumulative = FALSE, log = TRUE,
  rnn_type = "lstm", n_epoch = 50, method = "last",
  step_size = 1e-04, optimizer_name = "rmsprop", samples_train,
  samples_valid, samples_test = NULL, pca = TRUE, verbose = TRUE,
  return_theta = TRUE)
Arguments
| tseqs | a list of timestamp sequences. | 
| K | the number of features to be extracted. | 
| cumulative | logical. If TRUE, the sequence of cumulative time up to each event is used as input to the neural network. If FALSE, the sequence of inter-arrival time (gap time between an event and the previous event) will be used as input to the neural network. Default is FALSE. | 
| log | logical. If TRUE, the input to the neural network is the base-10 logarithm of the original sequence of times plus 1 (i.e., log10(t+1)). If FALSE, the original sequence of times is used. Default is TRUE. | 
| rnn_type | the type of recurrent unit to be used for modeling response processes. Default is "lstm". | 
| n_epoch | the number of training epochs for the autoencoder. | 
| method | the method for computing features from the output of a recurrent neural network in the encoder. Available options are "last" and "avg". | 
| step_size | the learning rate of the optimizer. | 
| optimizer_name | a character string specifying the optimizer to be used for training. Default is "rmsprop". | 
| samples_train | a vector of indices specifying the training set used for training the autoencoder. | 
| samples_valid | a vector of indices specifying the validation set used for training the autoencoder. | 
| samples_test | a vector of indices specifying the test set used for training the autoencoder. Default is NULL. | 
| pca | logical. If TRUE, the principal components of features are returned. Default is TRUE. | 
| verbose | logical. If TRUE, training progress is printed. | 
| return_theta | logical. If TRUE, extracted features are returned. | 
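The effect of the cumulative and log arguments on the network input can be illustrated with a small base R sketch. It is not the package's internal code: the timestamp vector is made up, and treating the first inter-arrival time as 0 is an assumption made here for illustration.

```r
# Hypothetical timestamp sequence (time of each event since the start).
tseq <- c(0, 9, 99, 999)

# cumulative = TRUE: cumulative time up to each event,
# i.e., the timestamps themselves.
cum_times <- tseq

# cumulative = FALSE: gap between each event and the previous one.
# Setting the first gap to 0 is an assumption of this sketch.
gap_times <- c(0, diff(tseq))

# log = TRUE: base-10 log of the times plus 1, i.e., log10(t + 1).
log_cum <- log10(cum_times + 1)
log_gap <- log10(gap_times + 1)

print(gap_times)
print(log_cum)
```

Adding 1 before taking the logarithm keeps a time of 0 mapped to 0 rather than to negative infinity.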
Details
This function trains a sequence-to-sequence autoencoder using keras. The encoder of the autoencoder consists of a recurrent neural network. The decoder consists of another recurrent neural network and a fully connected layer with ReLU activation. The outputs of the encoder are the extracted features.
The output of the encoder is a function of the encoder recurrent neural network.
It is the last latent state of the encoder recurrent neural network if method="last"
and the average of the encoder recurrent neural network latent states if method="avg".
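The two options for method can be sketched in base R as follows. The matrix H of latent states is hypothetical (one row per time step, one column per latent dimension); in the actual function these states come from the keras encoder, not from a hand-written matrix.

```r
# Hypothetical encoder latent states for one sequence:
# 3 time steps (rows) and 2 latent dimensions (columns).
H <- matrix(c(0.1, 0.2,
              0.3, 0.4,
              0.5, 0.9), ncol = 2, byrow = TRUE)

# method = "last": the latent state at the final time step.
theta_last <- H[nrow(H), ]

# method = "avg": the average of the latent states over time steps.
theta_avg <- colMeans(H)

print(theta_last)
print(theta_avg)
```

Either way, each sequence is summarized by a single vector whose length equals the number of latent dimensions.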
Value
tseq2feature_seq2seq returns a list containing
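| theta | a matrix containing the K extracted features, or their principal components if pca = TRUE. | 
| train_loss | a vector of length n_epoch recording the training loss at each epoch. | 
| valid_loss | a vector of length n_epoch recording the validation loss at each epoch. | 
| test_loss | a vector of length n_epoch recording the test loss at each epoch. | 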
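When pca = TRUE, theta holds principal components rather than raw encoder outputs. The sketch below shows one way such a rotation could be computed with stats::prcomp on a simulated feature matrix. The data are random, and whether the package centers or scales the features in this way is an assumption, not something stated in this page.

```r
set.seed(1)

# Hypothetical raw features: 20 sequences (rows), K = 3 features (columns).
raw_theta <- matrix(rnorm(60), nrow = 20, ncol = 3)

# Rotate the features onto their principal axes.
# Centering without scaling is an assumption of this sketch.
pca_fit <- prcomp(raw_theta, center = TRUE, scale. = FALSE)

# Principal component scores: same dimensions as raw_theta,
# with uncorrelated columns ordered by decreasing variance.
pc_theta <- pca_fit$x

print(dim(pc_theta))
print(round(cov(pc_theta), 6))
```

The rotated features carry the same information as the raw ones, but their columns are uncorrelated, which can make them easier to use in downstream regression models.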
See Also
chooseK_seq2seq for choosing K through cross-validation.
Other feature extraction methods: aseq2feature_seq2seq,
atseq2feature_seq2seq,
seq2feature_mds_large,
seq2feature_mds,
seq2feature_ngram,
seq2feature_seq2seq
Examples
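# Run only if Python with TensorFlow is available (system() returns 0 on success).
if (!system("python -c 'import tensorflow as tf'", ignore.stdout = TRUE, ignore.stderr = TRUE)) {
  n <- 50
  data(cc_data)
  # Draw a random subset of n timestamp sequences.
  samples <- sample(seq_along(cc_data$seqs$time_seqs), n)
  tseqs <- cc_data$seqs$time_seqs[samples]
  # Extract K = 5 features, training on 40 sequences and validating on 10.
  time_seq2seq_res <- tseq2feature_seq2seq(tseqs, 5, rnn_type = "lstm", n_epoch = 5,
                                           samples_train = 1:40, samples_valid = 41:50)
  features <- time_seq2seq_res$theta
  # Plot the training (blue) and validation (red) loss trajectories.
  plot(time_seq2seq_res$train_loss, col = "blue", type = "l",
       ylim = range(c(time_seq2seq_res$train_loss, time_seq2seq_res$valid_loss)))
  lines(time_seq2seq_res$valid_loss, col = "red", type = "l")
}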