predict.aster {aster}  R Documentation 
Obtains predictions (parameter estimates) and optionally estimates standard errors of those predictions (parameter estimates) from a fitted Aster model object.
## S3 method for class 'aster' predict(object, x, root, modmat, amat, parm.type = c("mean.value", "canonical"), model.type = c("unconditional", "conditional"), is.always.parameter = FALSE, se.fit = FALSE, info = c("expected", "observed"), info.tol = sqrt(.Machine$double.eps), newcoef = NULL, gradient = se.fit, ...) ## S3 method for class 'aster.formula' predict(object, newdata, varvar, idvar, root, amat, parm.type = c("mean.value", "canonical"), model.type = c("unconditional", "conditional"), is.always.parameter = FALSE, se.fit = FALSE, info = c("expected", "observed"), info.tol = sqrt(.Machine$double.eps), newcoef = NULL, gradient = se.fit, ...)
object 
a fitted object of class inheriting from 
modmat 
a model matrix to use instead of May be missing, in which case

x 
response. Ignored and may be missing unless

root 
root data. Ignored and may be missing unless

amat 
if For For 
parm.type 
the type of parameter to predict. The default is
mean value parameters (the opposite of the default
for The alternative The value of this argument can be abbreviated. 
model.type 
the type of model in which to predict. The default is
The value of this argument can be abbreviated. 
is.always.parameter 
logical, default 
se.fit 
logical switch indicating if standard errors are required. 
info 
the type of Fisher information to use to compute standard errors. 
info.tol 
tolerance for eigenvalues of Fisher information.
If 
newdata 
optionally, a data frame in which to look for variables with
which to predict. If omitted, see 
varvar 
a variable of length 
idvar 
a variable of length 
newcoef 
if not 
gradient 
if 
... 
further arguments passed to or from other methods. 
Note that model.type
need have nothing to do with the type
of the fitted aster model, which is object$type
.
Whether the
fitted model is conditional or unconditional, one typically wants
unconditional mean value parameters, because conditional mean
value parameters for hypothetical individuals depend on the hypothetical
data x
, which usually makes no scientific sense.
If one asks for conditional mean value parameters, one gets
them only if is.always.parameter = TRUE
is specified.
Otherwise, conditional expectations that are not parameters (because
they depend on data) are produced.
See Conditional Mean Values Section for more about this.
Similarly, if object$type == "conditional"
, then the conditional
canonical parameters are a linear function of the regression coefficients
theta = M beta, where M is the model matrix,
but one can predict either theta or the unconditional
canonical parameters phi,
as selected by model.type
.
Similarly, if object$type == "unconditional"
,
so phi = M beta, one can predict either
theta or phi
as selected by model.type
.
The specification of the prediction model is confusing because there
are so many possibilities. First the “usual” case.
The fit was done using a formula, found in object$formula
.
A data frame newdata
that has the same variables as object$data
,
the data frame used in the fit, but may have different rows (representing
hypothetical individuals) is supplied.
But newdata
must specify all nodes
of the graphical model for each (hypothetical, new) individual,
just like object$data
did for real observed individuals.
Hence newdata
is typically constructed using reshape
.
See also the details section of aster
.
In this “usual” case we need varvar
and idvar
to
tell us what rows of newdata
correspond to which individuals and
nodes (the same role they played in the original fit by aster
).
If we are predicting canonical parameters, then we do not need root
or
x
.
If we are predicting unconditional mean value parameters, then
we also need root
but not x
.
If we are predicting conditional mean value parameters, then
we also need both root
and x
.
In the “usual” case, these are found in newdata
and
not supplied as arguments to predict
. Moreover, x
is not named "x"
but is the response in out$formula
.
The next case, predict(object)
with no other arguments,
is often used with linear models (predict.lm
),
but we expect will be little used for aster models. As for linear
models, this “predicts” the observed data. In this case
modmat
, x
, and root
are found in object
and nothing is supplied as an argument to predict.aster
, except
perhaps amat
if one wants a function of predictions for the observed
data.
The final case, also perhaps little used, is a failsafe mode for problems
in which the R formula language just cannot be bludgeoned into doing what
you want. This is the same reason aster.default
exists.
Then a model matrix can be constructed “by hand”, and the function
predict.aster
is used instead of predict.aster.formula
.
Note that it is possible to use a “constructed by hand”
model matrix even if object
was produced by aster.formula
.
Simply explicitly call predict.aster
rather than predict
to override the R method dispatch (which would
call predict.aster.formula
in this case).
If se.fit = FALSE
and gradient = FALSE
, a vector of predictions.
If se.fit = TRUE
, a list with components
fit 
Predictions 
se.fit 
Estimated standard errors 
gradient 
The gradient of the transformation from regression coefficients to predictions 
If gradient = TRUE
, a list with components
fit 
Predictions 
gradient 
The gradient of the transformation from regression coefficients to predictions 
Both the original aster paper (Geyer, et al., 2007) and this package are weird about conditional mean values. Equation (10) of that paper defines (using different notation from what we use here)
xi[j] = E(x[j]  x[p(j)])
where x[j] are components of the response vector and p(j) denotes denotes the predecessor of node j. That paper explicitly says that this is is not a parameter because it depends on the data. In fact
E(x[j]  x[p(j)]) = x[p(j)] E(x[j]  x[p(j)] = 1)
(this is equation (3) of that paper in different notation). Thus it is weird to use a Greek letter to denote this.
There should be a conditional mean value parameter, and Geyer (2010, equation (11b)) defines it as
xi[j] = E(y[j]  y[p(j)] = 1)
(This equation only makes sense when the conditioning event x[p(j)] = 1 is possible, which it is not for ktruncated arrows for k > 0. Then a more complicated definition must be used. By definition x[j] is the sum of x[p(j)] independent and identically distributed random variables, and xi[j] is always the mean of one of those random variables.) This gives us the important relationship between conditional and unconditional mean value parameters
mu[j] = xi[j] mu[p(j)]
which holds for all successor nodes j.
All later writings of this author use this definition of xi
as does the R package aster2
(Geyer, 2017).
This is one of six important parameterizations of an unconditional aster model
(Geyer, 2010, Sections 2.7 and 2.8). The R package aster2
uses all
of them.
This function (as of version 1.0 of this package) has an argument
is.always.parameter
to switch between these two definitions
in case parm.type = "mean.value"
and model.type = "conditional"
are specified. Then is.always.parameter = TRUE
specifies that the latter
definition of xi is produced (which is a parameter, with all other
options for parm.type
and model.type
). The option
is.always.parameter = FALSE
specifies that the former
definition of xi is produced (which is a conditional expectation
but not a parameter) and is what this function produced in versions of this
package before 1.0.
Geyer, C. J. (2010) A Philosophical Look at Aster Models. Technical Report No. 676. School of Statistics, University of Minnesota. http://purl.umn.edu/57163.
Geyer, C.~J. (2017).
R package aster2
(Aster Models), version 0.3.
https://cran.rproject.org/package=aster2.
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster models for life history analysis. Biometrika, 94, 415–426. doi: 10.1093/biomet/asm030.
### see package vignette for explanation ### library(aster) data(echinacea) vars < c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04", "hdct02", "hdct03", "hdct04") redata < reshape(echinacea, varying = list(vars), direction = "long", timevar = "varb", times = as.factor(vars), v.names = "resp") redata < data.frame(redata, root = 1) pred < c(0, 1, 2, 1, 2, 3, 4, 5, 6) fam < c(1, 1, 1, 1, 1, 1, 3, 3, 3) hdct < grepl("hdct", as.character(redata$varb)) redata < data.frame(redata, hdct = as.integer(hdct)) level < gsub("[09]", "", as.character(redata$varb)) redata < data.frame(redata, level = as.factor(level)) aout < aster(resp ~ varb + level : (nsloc + ewloc) + hdct : pop, pred, fam, varb, id, root, data = redata) newdata < data.frame(pop = levels(echinacea$pop)) for (v in vars) newdata[[v]] < 1 newdata$root < 1 newdata$ewloc < 0 newdata$nsloc < 0 renewdata < reshape(newdata, varying = list(vars), direction = "long", timevar = "varb", times = as.factor(vars), v.names = "resp") hdct < grepl("hdct", as.character(renewdata$varb)) renewdata < data.frame(renewdata, hdct = as.integer(hdct)) level < gsub("[09]", "", as.character(renewdata$varb)) renewdata < data.frame(renewdata, level = as.factor(level)) nind < nrow(newdata) nnode < length(vars) amat < array(0, c(nind, nnode, nind)) for (i in 1:nind) amat[i , grep("hdct", vars), i] < 1 foo < predict(aout, varvar = varb, idvar = id, root = root, newdata = renewdata, se.fit = TRUE, amat = amat) bar < cbind(foo$fit, foo$se.fit) dimnames(bar) < list(as.character(newdata$pop), c("Estimate", "Std. Error")) print(bar)