aster {aster} | R Documentation |
Aster Models
Description
Fits Aster Models.
Usage
aster(x, ...)
## Default S3 method:
aster(x, root, pred, fam, modmat, parm,
type = c("unconditional", "conditional"), famlist = fam.default(),
origin, origin.type = c("model.type", "unconditional", "conditional"),
method = c("trust", "nlm", "CG", "L-BFGS-B"), fscale, maxiter = 1000,
nowarn = TRUE, newton = TRUE, optout = FALSE, coef.names, ...)
## S3 method for class 'formula'
aster(formula, pred, fam, varvar, idvar, root,
data, parm, type = c("unconditional", "conditional"),
famlist = fam.default(),
origin, origin.type = c("model.type", "unconditional", "conditional"),
method = c("trust", "nlm", "CG", "L-BFGS-B"), fscale, maxiter = 1000,
nowarn = TRUE, newton = TRUE, optout = FALSE, ...)
Arguments
x |
an
|
root |
an object of the same shape as |
pred |
an integer vector of length |
fam |
an integer vector of length |
modmat |
an
|
parm |
usually missing. Otherwise a vector of length |
type |
type of model. The value of this argument can be abbreviated. |
famlist |
a list of family specifications (see |
origin |
Distinguished point in parameter space. May be missing,
in which case an unspecified default is provided. See details below
for further explanation. This is what |
origin.type |
Parameter space in which specified distinguished point
is located. If |
method |
optimization method. If |
fscale |
an estimate of the size of the log likelihood at the maximum.
Defaults to |
maxiter |
maximum number of iterations. Defaults to '1000'. |
nowarn |
if |
newton |
if |
optout |
if |
coef.names |
names of the regression coefficients. If missing,
|
... |
other arguments passed to the optimization method. |
formula |
a symbolic description of the model to be fit. See
|
varvar |
a variable of the same length as the response in
the formula that is a factor whose levels are character strings
treated as variable names. The number of variable names is |
idvar |
a variable of the same length as the response in
the formula that indexes individuals. The number
of individuals is |
data |
an optional data frame containing the variables
in the model. If not found in |
Details
The vector pred
must satisfy all(pred < seq(along = pred))
,
that is, each predecessor must precede in the order given in pred
.
The vector pred
defines a function p
.
The joint distribution of the data matrix x
is a product of conditionals
\prod_{i \in I} \prod_{j \in J} \Pr \{ X_{i j} | X_{i p(j)} \}
When p(j) = 0
, the notation X_{i p(j)}
means root[i, j]
. Other elements of the matrix root
are
not used.
The conditional distribution
\Pr \{ X_{i j} | X_{i p(j)} \}
is the X_{i p(j)}
-fold convolution of the j
-th family
in the vector fam
, a one-parameter exponential family
(i.e., the sum of X_{i p(j)}
i.i.d. terms having
this one-parameter exponential family distribution).
For type == "conditional"
the canonical parameter vector
\theta_{i j}
is modeled in GLM fashion as
\theta = a + M \beta
where M
is the model
matrix modmat
and a
is the distinguished point origin
.
Since the “vector” \theta
is
actually a matrix, the “matrix” M
must correspondingly
be a three-dimensional array. So \theta = a + M \beta
written out in full is
\theta_{i j} = a_{i j} + \sum_{k \in K} m_{i j k} \beta_k
This specifies the log likelihood.
For type == "unconditional"
the canonical parameter vector
for an unconditional model is modeled in GLM fashion as
\varphi = a + M \beta
(where the notation is as above).
The unconditional canonical parameters are then specified in terms of
the conditional ones by
\varphi_{i j} = \theta_{i j} - \sum_{k \in S(j)} \psi_k(\theta_{i k})
where S(j)
denotes the set of successors of j
,
the k
such that p(k) = j
, and \psi_k
is the
cumulant function for the k
-th exponential family.
This rather crazy looking formulation is an invertible change of parameter
and makes \varphi
the canonical parameter and x
the canonical statistic of a full
flat unconditional exponential family.
Again, this specifies the log likelihood.
In versions of aster prior to version 0.6 there was no a
in the model
specification, which is the same as specifying a = 0
in the current
specification. If a
is in the column space of the model matrix, that
is, if there exists an \alpha
such that a = M \alpha
, then there is no difference
in the model specified with a
and the one with a = 0
.
The maximum likelihood regression coefficients \hat{\beta}
will be different, but the maximum likelihood estimates of all other
parameters (conditional and unconditional, canonical and mean value)
will be the same. This is the usual case and explains why “linear”
models (with a = 0
) as opposed to “affine” models
(with general a
)
are popular. In the unusual case where a
is not in the column space
of the design matrix, then affine models are a generalization of linear
models: the two are not equivalent, their maximum likelihood estimates are
not the same in any parameterization.
In order to use the R model formula mini-language we must flatten
the dimensionality, making the model matrix modmat
two-dimensional
(a true matrix). This must be done as if by
matrix(modmat, ncol = ncoef)
,
which imposes the requirements on varvar
and idvar
given in the arguments section: they must look like row(x)
and
col(x)
modulo relabeling.
Then x
and root
become one-dimensional, done as if by as.numeric(x)
and as.numeric(root)
.
The standard way to do this in R is to use the reshape
function on a data frame in which the columns of the x
matrix
are variables in the data frame. reshape
automatically puts
things in the right order and creates varvar
and idvar
.
This is shown in the examples section below and in the package vignette
titled "Aster Package Tutorial".
Value
aster
returns an object of class inheriting from "aster"
.
aster.formula
, returns an object of class "aster"
and
subclass "aster.formula"
.
The function summary
(i.e., summary.aster
) can
be used to obtain or print a summary of the results, the function
anova
(i.e., anova.aster
)
to produce an analysis of deviance table, and the function
predict
(i.e., predict.aster
)
to produce predicted values and standard errors.
An object of class "aster"
is a list containing at least the
following components:
coefficients |
a named vector of coefficients. |
rank |
the numeric rank of the fitted generalized linear model
part of the aster model (i.e., the rank of |
deviance |
up to a constant, minus twice the maximized log-likelihood. (Note the minus. This is somewhat counterintuitive, but cannot be changed for reasons of backward compatibility.) |
iter |
the number of iterations used by the optimization method. |
converged |
logical. Was the optimization algorithm judged to have converged? |
code |
integer. The convergence code returned by the optimization method. |
gradient |
The gradient vector of minus the log likelihood at the
fitted |
hessian |
The Hessian matrix of minus the log likelihood
(i.e., the observed Fisher information) at the
fitted |
fisher |
Expected Fisher information at the fitted |
optout |
The object returned by the optimization routine
( |
Calls to aster.formula
return a list also containing:
call |
the matched call. |
formula |
the formula supplied. |
terms |
the |
data |
the |
NA Values
It was almost always wrong for aster model data to have NA
values.
Although theoretically possible for the R formula mini-language to do the
right thing for an aster model with NA
values in the data, usually
it does some wrong thing. Thus, since version 0.8-20, this function and
the reaster
function give errors when used with data having
NA
values. Users must remove all NA
values (or replace them
with what they should be, perhaps zero values) “by hand”.
Directions of Recession
Even if the result of this function
has component converges
equal to TRUE
, the result will be
nonsense if there are one or more directions of recession. These are
not detected by this function, but rather by the summary
function
applied to the result of this function,
for which see summary.aster
.
References
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster Models for Life History Analysis. Biometrika, 94, 415–426. doi:10.1093/biomet/asm030.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying Life History Analysis for Inference of Fitness and population growth. American Naturalist, 172, E35–E47. doi:10.1086/588063.
See Also
anova.aster
,
summary.aster
,
and
predict.aster
Examples
### see package vignette for explanation ###
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
"hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grepl("hdct", as.character(redata$varb))
redata <- data.frame(redata, hdct = as.integer(hdct))
level <- gsub("[0-9]", "", as.character(redata$varb))
redata <- data.frame(redata, level = as.factor(level))
aout <- aster(resp ~ varb + level : (nsloc + ewloc) + hdct : pop,
pred, fam, varb, id, root, data = redata)
summary(aout, show.graph = TRUE)