
init2.jk.j {poisson.glm.mix}

R Documentation

Initialization 2 for the `\beta_{jk}` (`m=1`) or `\beta_{j}` (`m=2`) parameterization.

Description

This function applies a random-splitting small-EM initialization scheme (Initialization 2) for parameterizations `m=1` or `m=2`. It can be used only when a previous run of the EM algorithm is available for the same parameterization. The scheme proposes random splits of the existing clusters, increasing the number of mixture components by one. The EM algorithm is then run for `msplit` iterations, and the whole procedure is repeated `tsplit` times. The values with the highest observed log-likelihood are chosen to initialize the main EM algorithm (`bjkmodel` or `bjmodel`).
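The splitting step described above can be illustrated with a minimal sketch. The helper `split_random_cluster` is hypothetical and not part of poisson.glm.mix; it assumes a split means moving a random non-empty subset of one randomly chosen cluster into a new cluster labelled `K`. The package's internal splitting rule may differ.

```r
# Hypothetical helper (not a poisson.glm.mix function): turn a partition with
# K - 1 clusters into one with K clusters by splitting one cluster at random.
split_random_cluster <- function(clust, K) {
  stopifnot(all(clust %in% seq_len(K - 1)))
  # Only clusters with at least two members can be split into two non-empty parts.
  sizes <- tabulate(clust, nbins = K - 1)
  candidates <- which(sizes >= 2)
  stopifnot(length(candidates) > 0)
  # Index explicitly: sample(x, 1) on a length-one x would sample from 1:x.
  target <- candidates[sample.int(length(candidates), 1)]
  members <- which(clust == target)
  # Move a random, non-empty, proper subset of the members to the new cluster K.
  n_move <- sample.int(length(members) - 1, 1)
  moved <- members[sample.int(length(members), n_move)]
  clust[moved] <- K
  clust
}

set.seed(1)
old_clust <- rep(1:2, times = c(6, 4))  # toy MAP labels from a 2-component fit
new_clust <- split_random_cluster(old_clust, K = 3)
table(new_clust)                         # cluster sizes after the split
```

In the scheme described above, each such proposal would be followed by `msplit` EM iterations, and the proposal with the highest log-likelihood over the `tsplit` repetitions would be kept.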

Usage

init2.jk.j(reference, response, L, K, tsplit, model, msplit, 
           previousz, previousclust, previous.alpha, previous.beta, mnr)

Arguments

`reference`	a numeric array of dimension `n\times V` containing the `V` covariates for each of the `n` observations.
`response`	a numeric array of count data with dimension `n\times d` containing the `d` response variables for each of the `n` observations.
`L`	numeric vector of positive integers containing the partition of the `d` response variables into `J\leq d` blocks, with `\sum_{j=1}^{J}L_j=d`.
`K`	positive integer denoting the number of mixture components.
`tsplit`	positive integer denoting the number of different runs.
`model`	binary variable denoting the parameterization of the model: 1 for `\beta_{jk}` and 2 for `\beta_{j}` parameterization.
`msplit`	positive integer denoting the number of iterations for each run.
`previousz`	numeric array of dimension `n\times(K-1)` containing the estimates of the posterior probabilities according to the previous run of EM.
`previousclust`	numeric vector of length `n` containing the estimated clusters according to the MAP rule obtained by the previous run of EM.
`previous.alpha`	numeric array of dimension `J\times (K-1)` containing the matrix of the ML estimates of the regression constants `\alpha_{jk}`, `j=1,\ldots,J`, `k=1,\ldots,K-1`, based on the previous run of EM algorithm.
`previous.beta`	numeric array of dimension `J\times (K-1)\times T` (if `model = 1`) or `J\times T` (if `model = 2`) containing the matrix of the ML estimates of the regression coefficients `\beta_{jk\tau}` or `\beta_{j\tau}`, `j=1,\ldots,J`, `k=1,\ldots,K-1`, `\tau=1,\ldots,T`, based on the previous run of EM algorithm.
`mnr`	positive integer denoting the maximum number of Newton-Raphson iterations.
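Because the inputs from the previous run refer to `K-1` components while `K` is the new number of components, a dimension check before the call can catch mismatches early. The following is a sketch only: `check_init2_inputs` and `has_dim` are hypothetical helpers, not package functions, and they test only the dimensions stated in the argument list above.

```r
# Hypothetical helper: TRUE if x is an array whose dimensions equal d.
has_dim <- function(x, d) {
  length(dim(x)) == length(d) && all(dim(x) == d)
}

# Hypothetical pre-flight check of the documented argument dimensions.
check_init2_inputs <- function(reference, response, L, K, model,
                               previousz, previousclust,
                               previous.alpha, previous.beta) {
  n <- nrow(response)   # number of observations
  d <- ncol(response)   # number of response variables
  J <- length(L)        # number of blocks
  stopifnot(
    nrow(reference) == n,
    sum(L) == d,                           # blocks must partition the responses
    K >= 2,
    model %in% c(1, 2),
    has_dim(previousz, c(n, K - 1)),
    length(previousclust) == n,
    has_dim(previous.alpha, c(J, K - 1))
  )
  bdim <- dim(previous.beta)
  if (model == 1) {
    # J x (K-1) x T array
    stopifnot(length(bdim) == 3, all(bdim[1:2] == c(J, K - 1)))
  } else {
    # J x T array
    stopifnot(length(bdim) == 2, bdim[1] == J)
  }
  invisible(TRUE)
}

# Toy inputs with the right shapes for K = 3 and model = 1.
set.seed(2)
n <- 10; L <- c(3, 2, 1); K <- 3
x <- array(rnorm(n), dim = c(n, 1))
y <- matrix(rpois(n * sum(L), lambda = 5), nrow = n)
ok <- check_init2_inputs(
  reference = x, response = y, L = L, K = K, model = 1,
  previousz = matrix(0.5, n, K - 1),
  previousclust = rep(1:2, length.out = n),
  previous.alpha = array(0, dim = c(length(L), K - 1)),
  previous.beta = array(0, dim = c(length(L), K - 1, 1))
)
```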

Value

`alpha`	numeric array of dimension `J \times K` containing the selected values `\alpha_{jk}^{(0)}`, `j=1,\ldots,J`, `k=1,\ldots,K`, that will be used to initialize the main EM (`bjkmodel` or `bjmodel`).
`beta`	numeric array of dimension `J \times K \times T` (if `model = 1`) or `J \times T` (if `model = 2`) containing the selected values of `\beta_{jk\tau}^{(0)}` (or `\beta_{j\tau}^{(0)}`), `j=1,\ldots,J`, `k=1,\ldots,K`, `\tau=1,\ldots,T`, that will be used to initialize the main EM.
`psim`	numeric vector of length `K` containing the weights that will initialize the main EM.
`ll`	numeric, the value of the loglikelihood, computed according to the `mylogLikePoisMix` function.
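For intuition about the quantity reported in `ll`, the observed log-likelihood of a generic Poisson mixture can be sketched as below. This is an illustration under stated assumptions, not the `mylogLikePoisMix` function, whose signature is not shown here: the hypothetical `pois_mix_loglik` takes the Poisson means `mu` as a ready-made `n x d x K` array and assumes the `d` responses are conditionally independent within a component.

```r
# Hypothetical sketch (not mylogLikePoisMix): observed log-likelihood of a
# K-component Poisson mixture.
#   y       : n x d matrix of counts
#   mu      : n x d x K array of Poisson means (assumed to be given)
#   weights : mixing proportions of length K
pois_mix_loglik <- function(y, mu, weights) {
  K <- length(weights)
  # n x K matrix: log(weight_k) + sum over responses of the log Poisson pmf
  log_terms <- sapply(seq_len(K), function(k) {
    lp <- matrix(dpois(y, mu[, , k], log = TRUE), nrow = nrow(y))
    log(weights[k]) + rowSums(lp)
  })
  # log-sum-exp over components, row by row, for numerical stability
  m <- apply(log_terms, 1, max)
  sum(m + log(rowSums(exp(log_terms - m))))
}

# Toy check: two identical components with weights 0.5 each must give the
# same value as a single Poisson model.
set.seed(3)
n <- 8; d <- 3
y <- matrix(rpois(n * d, lambda = 4), nrow = n)
mu1 <- matrix(4, nrow = n, ncol = d)
mu <- array(mu1, dim = c(n, d, 2))
ll_mix <- pois_mix_loglik(y, mu, weights = c(0.5, 0.5))
ll_single <- sum(dpois(y, mu1, log = TRUE))
```

Comparing values of this kind across the `tsplit` short runs is what "best in terms of observed log-likelihood" amounts to.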

Note

If an exhaustive search is desired instead of a random selection of the split components, use `tsplit = -1`.

Author(s)

Panagiotis Papastamoulis

Examples



data("simulated_data_15_components_bjk")
x <- sim.data[,1]
x <- array(x,dim=c(length(x),1))
y <- sim.data[,-1]

# First, a 2-component mixture is fitted using parameterization m = 1.
run.previous<-bjkmodel(reference=x, response=y, L=c(3,2,1), m=100, K=2, 
                       nr=-10*log(10), maxnr=5, m1=2, m2=2, t1=1, t2=2, 
                       msplit, tsplit, prev.z, prev.clust, start.type=1, 
                       prev.alpha, prev.beta)
## Then the estimated clusters and parameters are used to initialize a 
##   3-component mixture using Initialization 2. The number of different
##   runs is set to tsplit = 2, each of them using msplit = 2 
##   EM iterations. 
q <- 3
tau <- 1
nc <- 3
z <- run.previous$z
ml <- length(run.previous$psim)/(nc - 1)
alpha <- array(run.previous$alpha[ml, , ], dim = c(q, nc - 1))
beta <- array(run.previous$beta[ml, , , ], dim = c(q, nc - 1, tau))
clust <- run.previous$clust
run<-init2.jk.j(reference=x, response=y, L=c(3,2,1), K=nc, tsplit=2, 
                model=1, msplit=2, previousz=z, previousclust=clust,
                previous.alpha=alpha, previous.beta=beta,mnr = 5)
# Note: the user should specify larger values for msplit and tsplit for a complete analysis.

[Package poisson.glm.mix version 1.4 Index]
