chooseK_mds {ProcData} R Documentation

Choose the number of multidimensional scaling features

Description

chooseK_mds chooses the number of multidimensional scaling features to extract by cross-validation.

Usage

chooseK_mds(seqs = NULL, K_cand, dist_type = "oss_action",
  n_fold = 5, max_epoch = 100, step_size = 0.01, tot = 1e-06,
  return_dist = FALSE, L_set = 1:3)

Arguments

seqs

a "proc" object or a square matrix. If a square matrix is provided, it is treated as the dissimilarity matrix of a group of response processes.

K_cand

a vector of candidate values for the number of features.

dist_type

a character string specifying the dissimilarity measure for two response processes. See 'Details'.

n_fold

the number of folds for cross-validation.

max_epoch

the maximum number of epochs for stochastic gradient descent.

step_size

the step size of stochastic gradient descent.

tot

the accuracy tolerance for determining convergence.

return_dist

logical. If TRUE, the dissimilarity matrix will be returned. Default is FALSE.

L_set

the lengths of the n-grams considered.
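
As a rough sketch of the square-matrix input described for seqs, the following builds a small dissimilarity matrix in base R and passes it in place of a "proc" object. The toy data, the Euclidean distance, and the candidate set 1:3 are assumptions chosen only for illustration; the call is guarded so that the snippet still runs when ProcData is not installed.

```r
# Toy data: 20 points in 3 dimensions (illustrative, not real response processes).
set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20)

# Square, symmetric dissimilarity matrix with a zero diagonal.
dist_mat <- as.matrix(dist(x))

# Pass the matrix as `seqs`; it is then treated as the dissimilarity matrix.
if (requireNamespace("ProcData", quietly = TRUE)) {
  K_res <- ProcData::chooseK_mds(seqs = dist_mat, K_cand = 1:3)
  print(K_res$K)
}
```

Supplying a precomputed matrix this way avoids recomputing dissimilarities when several calls share the same set of processes.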

Value

chooseK_mds returns a list containing

K

the value in K_cand producing the smallest cross-validation loss.

K_cand

the candidate values for the number of features.

cv_loss

the cross-validation loss for each candidate in K_cand.

dist_mat

the dissimilarity matrix. This element exists only if return_dist=TRUE.
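
The relationship among K, K_cand, and cv_loss can be sketched in base R. The list below is a hand-made stand-in for the returned list, with invented loss values; it is not output from chooseK_mds.

```r
# Hypothetical stand-in for a chooseK_mds result; the loss values are made up.
res <- list(K_cand  = 5:10,
            cv_loss = c(0.31, 0.27, 0.24, 0.25, 0.26, 0.29))

# K is the candidate whose cross-validation loss is smallest.
res$K <- res$K_cand[which.min(res$cv_loss)]
res$K

# Tabulate loss against candidate to see how flat the minimum is.
data.frame(K = res$K_cand, cv_loss = res$cv_loss)
```

Looking at the whole cv_loss vector, rather than K alone, can help when several candidates give nearly the same loss.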

References

Gomez-Alonso, C. and Valls, A. (2008). A similarity measure for sequences of categorical data based on the ordering of common elements. In V. Torra & Y. Narukawa (Eds.) Modeling Decisions for Artificial Intelligence, (pp. 134-145). Springer Berlin Heidelberg.

See Also

seq2feature_mds for feature extraction after choosing the number of features.

Examples

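library(ProcData)

n <- 50
set.seed(12345)
seqs <- seq_gen(n)  # simulate n response processes
# choose K among 5:10 and keep the dissimilarity matrix for reuse
K_res <- chooseK_mds(seqs, 5:10, return_dist=TRUE)
# extract K_res$K features from the stored dissimilarity matrix
theta <- seq2feature_mds(K_res$dist_mat, K_res$K)$theta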


[Package ProcData version 0.3.2 Index]