turboem {turboEM} | R Documentation |
A suite of acceleration schemes for fixed-point iterations
Description
Globally-convergent, partially monotone, acceleration schemes for accelerating the convergence of any smooth, monotone, slowly-converging contraction mapping. It can be used to accelerate the convergence of a wide variety of iterations including the expectation-maximization (EM) algorithms and its variants, majorization-minimization (MM) algorithm, power method for dominant eigenvalue-eigenvector, Google's page-rank algorithm, and multi-dimensional scaling.
Usage
turboem(par, fixptfn, objfn, method = c("em","squarem","pem","decme","qn"),
boundary, pconstr = NULL, project = NULL, parallel = FALSE, ...,
control.method = replicate(length(method),list()), control.run = list())
Arguments
par |
A vector of parameters denoting the initial guess for the fixed point. |
fixptfn |
A vector function, |
objfn |
This is a scalar function, |
method |
Specifies which algorithm(s) will be applied. Must be a vector containing one or more of |
boundary |
Argument required for Dynamic ECME ( |
pconstr |
Optional function for defining boundary constraints on parameter values. Function maps a vector of parameter values to TRUE if constraints are satisfied. Note that this argument is only used for the Squarem ( |
project |
Optional function for defining a projection that maps an out-of-bound parameter value into the constrained parameter space. Requires the |
parallel |
Logical indicating whether the acceleration schemes will be run in parallel. Note that the parallel implementation is based on the |
control.method |
If |
control.run |
List of control parameters for convergence and stopping the algorithms. See *Details*. |
... |
Arguments passed to |
Details
The function turboem
is a general-purpose algorithm for accelerating the convergence of any slowly-convergent (smooth) fixed-point iteration.
The component lists of the control.method
are used to specify any changes to default values of algorithm control parameters. Full names of control list elements must be specified, otherwise, user specifications are ignored. Default control parameters for method="squarem"
are K=1
, square=TRUE
, version=3
, step.min0=1
, step.max0=1
, mstep=4
, kr=1
, objfn.inc=1
. Default control parameters for method="pem"
are l=10
, h=0.1
, a=1.5
, and version="geometric"
. Default control parameters for method="decme"
are version="v2"
and tol_op=0.01
. Default control parameters for method="qn"
are qn=5
.
Default values of control.run
are: convtype = "parameter"
, tol = 1.0e-07
, stoptype = "maxiter"
, maxiter = 1500
, maxtime = 60
, convfn.user = NULL
, stopfn.user = NULL
, trace = FALSE
, keep.objfval = FALSE
, keep.paramval = FALSE
.
There are two ways the algorithm will terminate. Either the algorithm will terminate if convergence has been achieved, or the algorithm will terminate if convergence has not been achieved within a pre-specified maximum number of iterations or maximum running time. The arguments convtype
, tol
, and convfn.user
control the convergence criterion. The arguments stoptype
, maxiter
, maxtime
, and stopfn.user
control the alternative stopping criterion.
Two types of convergence criteria have been implemented, with an option for the user to define his/her own convergence criterion. If convtype = "parameter"
, then the default convergence criterion is to terminate if sqrt(crossprod(new - old)) < tol
, where new
denotes the current value of the fixed point and old
denotes the previous fixed-point value. If convtype = "objfn"
, then the default convergence criterion is to terminate if abs(new - old) < tol
, where new
denotes the current value of the objective function and old
denotes the previous value of the objective function.
If the user desires alternate convergence criteria, convfn.user
may be specified as a function with inputs new
and old
that maps to a logical taking the value TRUE if convergence is achieved and the value FALSE if convergence is not achieved.
Two types of alternative stopping criteria have been implemented, with the option for the user to define his/her own stopping criterion. If stoptype = "maxiter"
, then the algorithm will terminate if convergence has not been achieved within maxiter
iterations of the acceleration scheme. If stoptype = "maxtime"
, then the algorithm will terminate if convergence has not been achieved within maxtime
seconds of running the acceleration scheme. Note: the running time of the acceleration scheme is calculated once every iteration. If the user desires different alternate stopping criteria than those implemented, stopfn.user
may be specified as a function with no inputs that maps to a logical taking the value TRUE which leads to the algorithm being terminated or the value FALSE which leads to the algorithm proceeding as usual.
convtype
A character string equal to
"parameter"
or"objfn"
."parameter"
indicates that the convergence criterion is a function of the current and previous value of the fixed point.objfn
indicates that the convergence criterion is a function of the current and previous value of the objective function.tol
A small, positive scalar that determines when convergence is achieved. See details above for convergence criteria currently implemented. Default is
1.e-07
.stoptype
A character string equal to
"maxiter"
or"maxtime"
that determines an alternative stopping rule for the algorithm. See details above for stopping rules currently implemented. Default is"maxiter"
.maxiter
If
stoptype = "maxiter"
, specifies the number of iterations after which the algorithm will be terminated if convergence has not been achieved. Default is 1500.maxtime
If
stoptype = "maxtime"
, specifies the running time (in seconds) after which the algorithm will be terminated if convergence has not been achieved. Default is 60.convfn.user
Optional, user-specified function for determining whether convergence has been achieved. Function should take as inputs
new
andold
, wherenew
is the current value (of the fixed point ifconvtype = "parameter"
and of the objective function value ifconvtype = "objfn"
) andold
is the previous value. Function should map to a logical taking the valueTRUE
if convergence is achieved (and hence the algorithm is terminated) and the valueFALSE
if convergence is not achieved. Default isNULL
.stopfn.user
Optional, user-specified function for determining whether to terminate the algorithm if convergence has not been achieved. See details above for how to specify. Default is
NULL
.trace
A logical variable denoting whether some of the intermediate results of iterations should be displayed to the user. Default is
FALSE
.keep.objfval
A logical variable denoting whether the objective function value at each iteration should be stored. Default is
FALSE
.keep.paramval
A logical variable denoting whether the parameter estimates at each iteration should be stored. Default is
FALSE
.
Value
turboem
returns an object of class turbo
. An object of class turbo
is a list containing at least the following components:
fail |
Vector of logical values whose |
value.objfn |
Vector of the value of the objective function |
itr |
Vector of the number of iterations completed for each algorithm. |
fpeval |
Vector of the number of fixed-point evaluations completed for each algorithm. |
objfeval |
Vector of the number of objective function evaluations completed for each algorithm. |
convergence |
Vector of logical values whose |
runtime |
Matrix whose |
errors |
Vector whose |
pars |
Matrix whose |
trace.objfval |
If |
trace.paramval |
If |
References
R Varadhan and C Roland (2008). Simple and globally convergent numerical schemes for accelerating the convergence of any EM algorithm. Scandinavian Journal of Statistics, 35:335-353.
A Berlinet and C Roland (2009). Parabolic acceleration of the EM algorithm. Stat Comput. 19 (1) 35-47.
Y He and C Liu (2010) The Dynamic ECME Algorithm. Technical Report. arXiv:1004.0524v1.
H Zhou, DH Alexander, and KL Lange (2011). A quasi-Newton acceleration for high-dimensional optimization algorithms. Stat Comput. 21 (2) 261-273.
See Also
Examples
###########################################################################
# Also see the vignette by typing:
# vignette("turboEM")
#
# EM algorithm for Poisson mixture estimation
fixptfn <- function(p,y) {
# The fixed point mapping giving a single E and M step of the EM algorithm
#
pnew <- rep(NA,3)
i <- 0:(length(y)-1)
zi <- p[1]*exp(-p[2])*p[2]^i / (p[1]*exp(-p[2])*p[2]^i + (1 - p[1])*exp(-p[3])*p[3]^i)
pnew[1] <- sum(y*zi)/sum(y)
pnew[2] <- sum(y*i*zi)/sum(y*zi)
pnew[3] <- sum(y*i*(1-zi))/sum(y*(1-zi))
p <- pnew
return(pnew)
}
objfn <- function(p,y) {
# Objective function whose local minimum is a fixed point
# negative log-likelihood of binary poisson mixture
i <- 0:(length(y)-1)
loglik <- y*log(p[1]*exp(-p[2])*p[2]^i/exp(lgamma(i+1)) +
(1 - p[1])*exp(-p[3])*p[3]^i/exp(lgamma(i+1)))
return ( -sum(loglik) )
}
# Real data from Hasselblad (JASA 1969)
poissmix.dat <- data.frame(death = 0:9,
freq = c(162,267,271,185,111,61,27,8,3,1))
y <- poissmix.dat$freq
# Use a preset seed so the example is reproducable.
require("setRNG")
old.seed <- setRNG(list(kind = "Mersenne-Twister", normal.kind = "Inversion",
seed = 54321))
p0 <- c(runif(1),runif(2,0,4)) # random starting value
# Basic EM algorithm, SQUAREM, and parabolic EM, with default settings
res1 <- turboem(par = p0, y = y, fixptfn = fixptfn, objfn = objfn,
method = c("EM", "squarem", "pem"))
# To apply the dynamic ECME (decme) acceleration scheme,
# we need to include a boundary function
boundary <- function(par, dr) {
lower <- c(0, 0, 0)
upper <- c(1, 10000, 10000)
low1 <- max(pmin((lower-par)/dr, (upper-par)/dr))
upp1 <- min(pmax((lower-par)/dr, (upper-par)/dr))
return(c(low1, upp1))
}
res2 <- turboem(par = p0, y = y, fixptfn = fixptfn, objfn = objfn,
boundary = boundary, method = c("EM", "squarem", "pem", "decme"))
# change some of the algorithm-specific default specifications (control.method),
# as well as the global control parameters (control.run)
res3 <- turboem(par = p0, y = y, fixptfn = fixptfn, objfn = objfn,
boundary = boundary, method = c("em", "squarem", "squarem", "decme", "qn", "qn"),
control.method = list(list(), list(K = 2), list(K = 3),
list(version = "v3"), list(qn = 1), list(qn = 2)),
control.run = list(tol = 1e-12, stoptype = "maxtime", maxtime = 1))
# Only the standard EM algorithm and SQUAREM *do not* require
# providing the objective function.
res4 <- turboem(par = p0, y = y, fixptfn = fixptfn,
method = c("em", "squarem", "squarem"),
control.method = list(list(), list(K = 1), list(K = 2)))
# If no objective function is provided, the "squarem" method defaults to Squarem-2
# Or, if control parameter K > 1, it defaults to Cyclem-2.
# Compare Squarem with and without objective function provided:
res5 <- turboem(par = p0, y = y, fixptfn = fixptfn, method = "squarem")
res5
res6 <- turboem(par = p0, y = y, fixptfn = fixptfn, objfn = objfn, method = "squarem")
res6