click.EM {ClickClust} | R Documentation
EM algorithm for mixtures of Markov models
Description
Runs the EM algorithm for finite mixture models with Markov model components.
Usage
click.EM(X, y = NULL, K, eps = 1e-10, r = 100, iter = 5, min.beta = 1e-3,
         min.gamma = 1e-3, scale.const = 1)
Arguments
X | dataset array (p x p x n)
y | vector of initial states (length n)
K | number of mixture components
eps | tolerance level
r | number of restarts for initialization
iter | number of iterations for each short EM run
min.beta | lower bound for initial state probabilities
min.gamma | lower bound for transition probabilities
scale.const | scaling constant for avoiding numerical issues
Details
Runs the EM algorithm for finite mixture models with first-order Markov model components. The function returns the estimated mixing proportions 'alpha' and transition probability matrices 'gamma'. If the initial states 'y' are not provided, the initial state probabilities 'beta' are not estimated and are assumed to be equal to 1 / p; in this case, the total number of estimated parameters is M = K - 1 + K * p * (p - 1). Otherwise, 'beta' is also estimated and the total number of parameters is M = K - 1 + K * (p - 1) + K * p * (p - 1).
Notation: p - number of states, n - sample size, K - number of mixture components, d - number of equivalence blocks.
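The two parameter counts above can be checked with a few lines of base R. This is only a sketch of the arithmetic: count_params is a hypothetical helper written for illustration and is not part of ClickClust.

```r
# Number of free parameters M, following the formulas in Details.
# count_params is a hypothetical helper, not a ClickClust function.
count_params <- function(K, p, with.beta = FALSE) {
  # K - 1 mixing proportions plus K transition matrices,
  # each with p rows of p - 1 free probabilities
  M <- (K - 1) + K * p * (p - 1)
  # K vectors of initial state probabilities, p - 1 free entries each
  if (with.beta) M <- M + K * (p - 1)
  M
}

count_params(K = 2, p = 5)                    # 41 (y not supplied)
count_params(K = 2, p = 5, with.beta = TRUE)  # 49 (y supplied)
```

With p = 5 and K = 2, as in the Examples section, supplying 'y' adds K * (p - 1) = 8 parameters.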
Value
z | matrix of posterior probabilities (n x K)
id | classification vector (length n)
alpha | vector of mixing proportions (length K)
beta | matrix of initial state probabilities (K x p)
gamma | array of transition probabilities (p x p x K)
logl | log likelihood value
BIC | Bayesian Information Criterion
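To illustrate how 'z' and 'id' relate to 'alpha', 'beta' and 'gamma', the sketch below computes posterior probabilities for a mixture of first-order Markov chains in base R. It assumes that X[j, l, i] holds the number of transitions from state j to state l in sequence i, and that 'id' is the component with the largest posterior probability. posterior_sketch is a hypothetical helper, not the package's implementation, and it does not use 'scale.const'.

```r
# Posterior probabilities for a mixture of first-order Markov chains.
# Hypothetical helper for illustration; not the code used by click.EM.
posterior_sketch <- function(X, alpha, gamma, beta = NULL, y = NULL) {
  n <- dim(X)[3]
  K <- length(alpha)
  logw <- matrix(0, n, K)
  for (i in seq_len(n)) {
    Xi <- X[, , i]
    pos <- Xi > 0  # skip zero counts so 0 * log(0) is never evaluated
    for (k in seq_len(K)) {
      # log(alpha_k) + sum over (j, l) of count * log(transition probability)
      logw[i, k] <- log(alpha[k]) + sum(Xi[pos] * log(gamma[, , k][pos]))
      # add the initial state term only when initial states are supplied
      if (!is.null(y)) logw[i, k] <- logw[i, k] + log(beta[k, y[i]])
    }
  }
  # subtract row maxima before exponentiating to avoid underflow
  w <- exp(logw - apply(logw, 1, max))
  z <- w / rowSums(w)
  list(z = z, id = apply(z, 1, which.max))
}

# Toy data: 2 states, 2 sequences, 2 components
gamma <- array(c(0.9, 0.1, 0.1, 0.9,    # component 1: tends to stay
                 0.1, 0.9, 0.9, 0.1),   # component 2: tends to switch
               c(2, 2, 2))
X <- array(c(5, 1, 1, 5,                # sequence 1: mostly stays
             1, 5, 5, 1),               # sequence 2: mostly switches
           c(2, 2, 2))
res <- posterior_sketch(X, alpha = c(0.5, 0.5), gamma = gamma)
res$id
```

Each row of res$z sums to 1, and each sequence is assigned to the component whose transition matrix best matches its transition counts.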
References
Melnykov, V. (2016) Model-Based Biclustering of Clickstream Data, Computational Statistics and Data Analysis, 93, 31-45.
Melnykov, V. (2016) ClickClust: An R Package for Model-Based Clustering of Categorical Sequences, Journal of Statistical Software, 74, 1-34.
See Also
click.plot, click.forward, click.backward
Examples
library(ClickClust)
set.seed(123)
n.seq <- 50
p <- 5
K <- 2
mix.prop <- c(0.3, 0.7)
TP1 <- matrix(c(0.20, 0.10, 0.15, 0.15, 0.40,
                0.20, 0.20, 0.20, 0.20, 0.20,
                0.15, 0.10, 0.20, 0.20, 0.35,
                0.15, 0.10, 0.20, 0.20, 0.35,
                0.30, 0.30, 0.10, 0.10, 0.20), byrow = TRUE, ncol = p)
TP2 <- matrix(c(0.15, 0.15, 0.20, 0.20, 0.30,
                0.20, 0.10, 0.30, 0.30, 0.10,
                0.25, 0.20, 0.15, 0.15, 0.25,
                0.25, 0.20, 0.15, 0.15, 0.25,
                0.10, 0.30, 0.20, 0.20, 0.20), byrow = TRUE, ncol = p)
TP <- array(rep(NA, p * p * K), c(p, p, K))
TP[,,1] <- TP1
TP[,,2] <- TP2
# DATA SIMULATION
A <- click.sim(n = n.seq, int = c(10, 50), alpha = mix.prop, gamma = TP)
C <- click.read(A$S)
# EM ALGORITHM (without initial state probabilities)
N2 <- click.EM(X = C$X, K = 2)
N2$BIC
# EM ALGORITHM (with initial state probabilities)
M2 <- click.EM(X = C$X, y = C$y, K = 2)
M2$BIC