click.EM {ClickClust}    R Documentation

EM algorithm for mixtures of Markov models

Description

Runs the EM algorithm for finite mixture models with Markov model components.

Usage

click.EM(X, y = NULL, K, eps = 1e-10, r = 100, iter = 5, min.beta = 1e-3,
  min.gamma = 1e-3, scale.const = 1)

Arguments

X

dataset array (p x p x n)

y

vector of initial states (length n)

K

number of mixture components

eps

tolerance level for the convergence criterion of the EM algorithm

r

number of restarts for initialization

iter

number of iterations for each short EM run

min.beta

lower bound for initial state probabilities

min.gamma

lower bound for transition probabilities

scale.const

scaling constant for avoiding numerical issues

Details

Runs the EM algorithm for finite mixture models with first-order Markov model components. The function returns the estimated mixing proportions 'alpha' and transition probability matrices 'gamma'.

If the initial states 'y' are not provided, the initial state probabilities 'beta' are not estimated and are assumed to be equal to 1 / p. In this case, the total number of estimated parameters is M = K - 1 + K * p * (p - 1). Otherwise, the initial state probabilities 'beta' are also estimated and the total number of parameters is M = K - 1 + K * (p - 1) + K * p * (p - 1).

Notation: p - number of states, n - sample size, K - number of mixture components, d - number of equivalence blocks.
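
The two parameter counts given above can be checked with a few lines of R. This is a minimal sketch: 'count_params' and its 'estimate.beta' argument are hypothetical names introduced for illustration and are not part of ClickClust.

```r
# Number of free parameters in a K-component mixture of first-order
# Markov models with p states, following the formulas in Details.
# count_params() is a hypothetical helper, not a ClickClust function.
count_params <- function(K, p, estimate.beta = FALSE) {
  # (K - 1) mixing proportions plus K transition matrices,
  # each with p rows of (p - 1) free probabilities
  M <- (K - 1) + K * p * (p - 1)
  if (estimate.beta) {
    # K vectors of initial state probabilities, (p - 1) free each
    M <- M + K * (p - 1)
  }
  M
}

M.without <- count_params(K = 2, p = 5)                        # y = NULL
M.with    <- count_params(K = 2, p = 5, estimate.beta = TRUE)  # y supplied
c(without.beta = M.without, with.beta = M.with)
```

With K = 2 and p = 5, the settings used in the Examples section, the counts are 41 without 'beta' and 49 with 'beta'.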

Value

z

matrix of posterior probabilities (n x K)

id

classification vector (length n)

alpha

vector of mixing proportions (length K)

beta

matrix of initial state probabilities (K x p)

gamma

array of transition probabilities (p x p x K)

logl

log-likelihood value

BIC

Bayesian Information Criterion
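
The relationship between 'z' and 'id' can be illustrated on a toy posterior matrix. This is a sketch only: the values below are made up, and it assumes that 'id' is the usual maximum a posteriori assignment (the component with the largest posterior probability in each row of 'z'), which this page does not state explicitly.

```r
# Toy posterior matrix for n = 4 sequences and K = 2 components
# (made-up values, not output of click.EM).
z <- matrix(c(0.90, 0.10,
              0.20, 0.80,
              0.55, 0.45,
              0.05, 0.95), byrow = TRUE, ncol = 2)

# Each row of z should be a probability distribution over components.
stopifnot(all(abs(rowSums(z) - 1) < 1e-8))

# Assumed rule: assign each sequence to its most probable component.
id <- apply(z, 1, which.max)

# Largest posterior probability per sequence; values near 1 / K flag
# sequences whose assignment is uncertain.
max.post <- apply(z, 1, max)

data.frame(id = id, max.post = max.post)
```

In this toy case the third sequence is assigned to component 1 with a posterior probability of only 0.55, so its classification is far less certain than the others.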

References

Melnykov, V. (2016) Model-Based Biclustering of Clickstream Data, Computational Statistics and Data Analysis, 93, 31-45.

Melnykov, V. (2016) ClickClust: An R Package for Model-Based Clustering of Categorical Sequences, Journal of Statistical Software, 74, 1-34.

See Also

click.plot, click.forward, click.backward

Examples



set.seed(123)

n.seq <- 50

p <- 5
K <- 2
mix.prop <- c(0.3, 0.7)


TP1 <- matrix(c(0.20, 0.10, 0.15, 0.15, 0.40,
                0.20, 0.20, 0.20, 0.20, 0.20,
                0.15, 0.10, 0.20, 0.20, 0.35,
                0.15, 0.10, 0.20, 0.20, 0.35,
                0.30, 0.30, 0.10, 0.10, 0.20), byrow = TRUE, ncol = p)

TP2 <- matrix(c(0.15, 0.15, 0.20, 0.20, 0.30,
                0.20, 0.10, 0.30, 0.30, 0.10,
                0.25, 0.20, 0.15, 0.15, 0.25,
                0.25, 0.20, 0.15, 0.15, 0.25,
                0.10, 0.30, 0.20, 0.20, 0.20), byrow = TRUE, ncol = p)


TP <- array(rep(NA, p * p * K), c(p, p, K))
TP[,,1] <- TP1
TP[,,2] <- TP2


# DATA SIMULATION

A <- click.sim(n = n.seq, int = c(10, 50), alpha = mix.prop, gamma = TP)
C <- click.read(A$S)


# EM ALGORITHM (without initial state probabilities)

N2 <- click.EM(X = C$X, K = 2)
N2$BIC


# EM ALGORITHM (with initial state probabilities)

M2 <- click.EM(X = C$X, y = C$y, K = 2)
M2$BIC


[Package ClickClust version 1.1.6 Index]