pickle {aster}  R Documentation 
Evaluates an approximation to minus the log likelihood for an aster model with random effects. Uses Laplace approximation to integrate out the random effects analytically. The “quasi” in the title is a misnomer in the context of aster models but the acronym PQL for this procedure is wellestablished in the generalized linear mixed models literature.
pickle(sigma, parm, fixed, random, obj, y, origin, cache, ...) makezwz(sigma, parm, fixed, random, obj, y, origin) pickle1(sigma, parm, fixed, random, obj, y, origin, cache, zwz, deriv = 0, ...) pickle2(alphasigma, parm, fixed, random, obj, y, origin, cache, zwz, deriv = 0, ...) pickle3(alphaceesigma, fixed, random, obj, y, origin, zwz, deriv = 0)
sigma 
vector of square roots of variance components,
one component for each group of random effects. Negative values are
allowed; the vector of variance components is 
parm 
starting value for inner optimization. Ignored if

alphasigma 
the concatenation of the vector of fixed effects and the vector of square roots of variance components. 
alphaceesigma 
the concatenation of the vector of fixed effects, the vector of rescaled random effects, and the vector of square roots of variance components. 
fixed 
the model matrix for fixed effects. The number of rows
is 
random 
the model matrix or matrices for random effects.
The number of rows is 
obj 
aster model object, the result of a call to 
y 
response vector. May be omitted, in which case 
origin 
origin of aster model. May be omitted, in which case
default origin (see 
cache 
If not missing, an environment in which to cache the value
of 
zwz 
A possible value of t(Z) W Z, where Z is the model matrix for all random effects and W is the variance matrix of the response. 
deriv 
Number of derivatives wanted. For 
... 
additional arguments passed to 
Define
p(alpha, c, sigma) = m(a + M alpha + Z A c) + t(c) c / 2 + log det[A t(Z) What Z A + I] / 2
where m is minus the log likelihood function of a saturated aster model,
a is a known vector (the offset vector in the terminology
of glm
but the origin in the terminology
of aster
), M is a known matrix, the model matrix for
fixed effects (the argument fixed
of these functions),
Z is a known matrix, the model matrix for random effects
(either the argument random
of these functions if it is a matrix or
Reduce(cbind, random)
if random
is a list of matrices),
A is a diagonal matrix whose diagonal is the vector
rep(sigma, times = nrand)
where nrand
is sapply(random, ncol)
when random
is a list of
matrices and ncol(random)
when random
is a matrix,
What is any symmetric positive semidefinite matrix
(more on this below),
and I is the identity matrix.
Note that A is a function of sigma although the
notation does not explicitly indicate this.
Let cstar denote the minimizer of p(α, c, σ) considered as a function of c for fixed alpha and sigma, and let alphatwiddle and ctwiddle denote the (joint) minimizers of p(α, c, σ) considered as a function of alpha and c for fixed sigma. Note that cstar is a function of alpha and sigma although the notation does not explicitly indicate this. Note that alphatwiddle and ctwiddle are functions of sigma (only) although the notation does not explicitly indicate this. Now define
q(alpha, sigma) = p(alpha, cstar, sigma)
and
r(sigma) = p(alphatwiddle, cstar, sigma)
Then pickle1
evaluates r(sigma),
pickle2
evaluates q(alpha, sigma), and
pickle3
evaluates p(alpha, c, sigma),
where t(Z) What Z in the formulas above is
specified by the argument zwz
of these functions.
All of these functions supply derivative (gradient) vectors if
deriv = 1
is specified, and pickle3
supplies the
second derivative (Hessian) matrix if deriv = 2
is specified.
Let W denote the second derivative function of m, that is,
W(phi) is the second derivative matrix of the function
m evaluated at the point phi. The idea is that
What should be approximately the value of
W(a + M alphahat + Z Ahat chat),
where alphahat, chat,
and sigmahat are the (joint) minimizers of p(alpha, c, sigma)
and Ahat = A(sigmahat).
In aid of this, the function makezwz
evaluates
t(Z) W(a + M alpha + Z A c) Z
for any alpha, c, and sigma.
pickle
evaluates the function
s(sigma) = m(a + M alphatwiddle + Z A ctwiddle) + t(ctwiddle) ctwiddle / 2 + log det[A t(Z) W(a + M alphatwiddle + Z A ctwiddle) Z A + I]
no derivatives can be computed because no derivatives of the function W are computed for aster models.
The general idea is the one uses pickle
with a noderivative
optimizer, such as the "NelderMead"
method of the optim
function to get a crude estimate of sigmahat.
Then one uses trust
with objective
function penmlogl
to estimate the corresponding
alphahat and chat (example below).
Then one use makezwz
to produce the corresponding zwz
(example below).
These estimates can be improved using trust
with objective
function pickle3
using this zwz
(example below), and this
step may be iterated until convergence.
Finally, optim
is used with objective function pickle2
to estimate the Hessian matrix of q(alpha, sigma),
which is approximate observed information because
q(alpha, sigma) is approximate minus log likelihood.
For pickle
, a scalar, minus the
(PQL approximation of) the log likelihood.
For pickle1
and pickle2
, a list having components value
and gradient
(present only when deriv = 1
).
For pickle3
, a list having components value
,
gradient
(present only when deriv >= 1
),
and hessian
(present only when deriv = 2
).
Not intended for use by naive users. Use reaster
,
which calls them.
data(radish) library(trust) pred < c(0,1,2) fam < c(1,3,2) ### need object of type aster to supply to penmlogl and pickle aout < aster(resp ~ varb + fit : (Site * Region + Block + Pop), pred, fam, varb, id, root, data = radish) ### model matrices for fixed and random effects modmat.fix < model.matrix(resp ~ varb + fit : (Site * Region), data = radish) modmat.blk < model.matrix(resp ~ 0 + fit:Block, data = radish) modmat.pop < model.matrix(resp ~ 0 + fit:Pop, data = radish) rownames(modmat.fix) < NULL rownames(modmat.blk) < NULL rownames(modmat.pop) < NULL idrop < match(aout$dropped, colnames(modmat.fix)) idrop < idrop[! is.na(idrop)] modmat.fix < modmat.fix[ ,  idrop] nfix < ncol(modmat.fix) nblk < ncol(modmat.blk) npop < ncol(modmat.pop) ### try penmlogl sigma.start < c(1, 1) alpha.start < aout$coefficients[match(colnames(modmat.fix), names(aout$coefficients))] parm.start < c(alpha.start, rep(0, nblk + npop)) tout < trust(objfun = penmlogl, parm.start, rinit = 1, rmax = 10, sigma = sigma.start, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout) tout$converged ### crude estimate of variance components eff.blk < tout$argument[seq(nfix + 1, nfix + nblk)] eff.pop < tout$argument[seq(nfix + nblk + 1, nfix + nblk + npop)] sigma.crude < sqrt(c(var(eff.blk), var(eff.pop))) ### try optim and pickle cache < new.env(parent = emptyenv()) oout < optim(sigma.crude, pickle, parm = tout$argument, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout, cache = cache) oout$convergence == 0 ### estimated variance components oout$par^2 ### get estimates of fixed and random effects tout < trust(objfun = penmlogl, tout$argument, rinit = 1, rmax = 10, sigma = oout$par, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout, fterm = 0) tout$converged sigma.better < oout$par alpha.better < tout$argument[1:nfix] c.better < tout$argument[ (1:nfix)] zwz.better < makezwz(sigma.better, parm = c(alpha.better, c.better), fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout) ### get better estimates objfun < function(alphaceesigma, zwz) pickle3(alphaceesigma, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout, zwz = zwz, deriv = 2) tout < trust(objfun, c(alpha.better, c.better, sigma.better), rinit = 1, rmax = 10, zwz = zwz.better) tout$converged alpha.mle < tout$argument[1:nfix] c.mle < tout$argument[nfix + 1:(nblk + npop)] sigma.mle < tout$argument[nfix + nblk + npop + 1:2] zwz.mle < makezwz(sigma.mle, parm = c(alpha.mle, c.mle), fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout) ### estimated variance components sigma.mle^2 ### preceding step can be iterated "until convergence" ### get (approximate) Fisher information objfun < function(alphasigma) pickle2(alphasigma, parm = c.mle, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout, zwz = zwz.mle)$value gradfun < function(alphasigma) pickle2(alphasigma, parm = c.mle, fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout, zwz = zwz.mle, deriv = 1)$gradient oout < optim(c(alpha.mle, sigma.mle), objfun, gradfun, method = "BFGS", hessian = TRUE) oout$convergence == 0 fish < oout$hessian