aster2-package {aster2} | R Documentation |
Aster Models
Description
Aster models are exponential family graphical models that combine aspects of generalized linear models and survival analysis.
This package is still under development, only about half finished.
However, it does do maximum likelihood for unconditional aster models
with dependence groups, which the old package aster
does not.
The main differences between this package and the old package are as follows.
The old package had triple indices for model matrices. The first index ran over individuals, the second index over nodes of the graph for an individual, and the third index over regression coefficients. Consequently the model matrix was represented (sometimes, but not consistently) as a three-dimensional array rather than a matrix, which was very confusing, even to the package author. This package ignores individuals, one index runs over all nodes of the combined graph for all individuals. Thus model matrices are always matrices.
The old package did not implement dependence groups, although they were described in Geyer, Wagenius and Shaw (2007). This package does. Consequently, this package requires a data frame, a vector
pred
that indicates predecessors, a vectorgroup
that indicates individuals in the same dependence group, and a vectorfam
that indicates families to specify a saturated aster model (the old package required only the data frame,pred
, andfam
). To facilitate the old style model specification, there is a new functionasterdata
that constructs objects of class"asterdata"
given an old style data frame,pred
, andfam
. All other functions of the package take objects of class"asterdata"
as model specifications.The function
predict.aster
in the old package did some parameter transformations, but not all, and the returned value, when a list, had a componentgradient
, that was undocumented but useful in applying the delta method. The functionstransformSaturated
,transformConditional
, andtransformUnconditional
in this package transform between any of the following parameter vectors: the conditional canonical parameter\theta
, the unconditional canonical parameter\varphi
, the conditional mean value parameter\xi
, the unconditional mean value parameter\mu
, the canonical affine submodel canonical parameter\beta
, and (unconditional aster models only) the canonical affine submodel mean value parameter\tau
(this last parameter is new, not discussed in the cited papers below, it is\tau = M^T \mu
, whereM
is the model matrix). The change of parameter from\tau
to\beta
is equivalent to maximum likelihood estimation for an unconditional aster model when the value\tau = M^T y
is used, wherey
is the response vector. All of these transformation functions also compute derivatives, if requested. See examples.
Bugs
Functions analogous to aster
, anova
, and predict
in the old package are missing, thus model fitting, hypothesis tests,
and confidence intervals are more cumbersome. In fact, since there is
no function to calculate log likelihoods (like mlogl
in the old
package), there is no way to do likelihood ratio tests (but Rao or Wald
tests could be done, for unconditional aster models, since the derivative
of the log likelihood is observed minus expected
M^T (y - \mu)
.
References
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster Models for Life History Analysis. Biometrika 94 415–426.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H. and Etterson, J. R. (2008) Unifying Life History Analyses for Inference of Fitness and Population Growth. American Naturalist, 172, E35–E47.
See Also
asterdata
, transformSaturated
,
families
Examples
## Not run: # perfectly good example but takes longer to run than CRAN allows
data(echinacea)
#### estimate MLE (simpler model than in Biometrika paper cited, not as good)
hdct <- as.numeric(grepl("hdct", as.character(echinacea$redata$varb)))
modmat <- model.matrix(resp ~ varb + nsloc + ewloc + pop * hdct - pop,
data = echinacea$redata)
tau.hat <- as.numeric(t(modmat) %*% echinacea$redata$resp)
beta.hat <- transformUnconditional(tau.hat, modmat, echinacea,
from = "tau", to = "beta")
inverse.fisher <- jacobian(tau.hat, echinacea, transform = "unconditional",
from = "tau", to = "beta", modmat = modmat)
#### now have MLE (beta.hat) and pseudo-inverse of Fisher information
#### (inverse.fisher), pseudo-inverse because modmat is not full rank
foo <- cbind(beta.hat, sqrt(diag(inverse.fisher)))
foo <- cbind(foo, foo[ , 1]/foo[ , 2])
foo <- cbind(foo, 2 * pnorm(- abs(foo[ , 3])))
dimnames(foo) <- list(colnames(modmat),
c("Estimate", "Std. Error", "z value", "Pr(>|z|)"))
printCoefmat(foo)
#### coefficients constrained to be zero because parameterization is not
#### identifiable have estimate zero and std. error zero (and rest NA)
#### estimate fitness in populations
#### generate new data with one individual in each pop at location (0, 0)
pop.names <- levels(echinacea$redata$pop)
pop.idx <- match(pop.names, as.character(echinacea$redata$pop))
pop.id <- echinacea$redata$id[pop.idx]
newdata <- subset(echinacea, echinacea$redata$id %in% pop.id)
newdata$redata[ , "nsloc"] <- 0
newdata$redata[ , "ewloc"] <- 0
hdct <- as.integer(grepl("hdct", as.character(newdata$redata$varb)))
#### modmat for new data
newmodmat <- model.matrix(resp ~ varb + nsloc + ewloc + pop * hdct - pop,
data = newdata$redata)
#### matrix that when multiplied mean value parameter vector gives fitness
#### in each pop
amat <- matrix(NA, nrow = length(pop.id), ncol = nrow(newmodmat))
for (i in 1:nrow(amat))
amat[i, ] <- as.numeric(grepl(paste("^", pop.id[i], ".hdct", sep = ""),
rownames(newmodmat)))
#### transform to expected fitness parameters
efit <- transformUnconditional(beta.hat, newmodmat, newdata,
from = "beta", to = "mu")
efit <- as.numeric(amat %*% efit)
#### jacobian matrix of this transformation
jack <- jacobian(beta.hat, newdata, transform = "unconditional",
from = "beta", to = "mu", modmat = newmodmat)
#### delta method standard errors
sefit <- sqrt(diag(amat %*% jack %*% inverse.fisher %*% t(jack) %*% t(amat)))
foo <- cbind(efit, sefit)
dimnames(foo) <- list(pop.names, c("Est. fitness", "Std. Error"))
print(foo)
## End(Not run)