R: Phylogenetic Regression using a Cauchy Process

cauphylm {cauphy}

R Documentation

Phylogenetic Regression using a Cauchy Process

Description

Perform a phylogenetic regression using the Cauchy Process, by numerical optimization.

Usage

cauphylm(
  formula,
  data = list(),
  phy,
  model = c("cauchy", "lambda"),
  lower.bound = list(disp = 0, lambda = 0),
  upper.bound = list(disp = Inf, lambda = NULL),
  starting.value = list(disp = NULL, lambda = NULL),
  hessian = FALSE
)

Arguments

`formula`	a model formula.
`data`	a data frame containing variables in the model. If not found in data, the variables are taken from current environment.
`phy`	a phylogenetic tree of class `phylo`.
`model`	a model for the trait evolution. One of `"cauchy"` or `"lambda"` (see Details).
`lower.bound`	named list with lower bound values for the parameters. See Details for the default values.
`upper.bound`	named list with upper bound values for the parameters. See Details for the default values.
`starting.value`	named list initial values for the parameters. See Details for the default values.
`hessian`	if `TRUE`, then the numerical hessian is computed, for confidence interval computations. See `compute_vcov`.

Details

This function fits a Cauchy Process on the phylogeny, using maximum likelihood and the "fixed.root" method (see fitCauchy). It further assumes that the root value x0 is a linear combination of the covariables in formula. The corresponding regression model is:

Y = X \beta + E,

with:

Y: the vector of traits at the tips of the tree;
X: the regression matrix of covariables in formula;
\beta: the vector of coefficients;
E: a centered error vector that is Cauchy distributed, and can be seen as the result of a Cauchy process starting at 0 at the root, and with a dispersion disp (see fitCauchy).

Unless specified by the user, the initial values for the parameters are taken according to the following heuristics:

coefficients:: \beta are obtained from a robust regression using lmrob.S;
disp:: is initialized from the trait centered and normalized by tip heights, with one of the following statistics, taken from Rousseeuw & Croux 1993:

IQR:: half of the inter-quartile range (see IQR);
MAD:: median absolute deviation with constant equal to 1 (see mad);
Sn:: Sn statistics with constant 0.7071 (see Sn);
Qn:: Qn statistics with constant 1.2071 (see Qn).

Unless specified by the user, disp is taken positive unbounded.

The function uses nloptr for the numerical optimization of the (restricted) likelihood, computed with function logDensityTipsCauchy. It uses algorithms BOBYQA and MLSL_LDS for local and global optimization.

If model="lambda", the CP is fit on a tree with branch lengths re-scaled using the Pagel's lambda transform (see transf.branch.lengths), and the lambda value is estimated using numerical optimization. The default initial value for the lambda parameter is computed using adequate robust moments. The default maximum value is computed using phytools:::maxLambda, and is the ratio between the maximum height of a tip node over the maximum height of an internal node. This can be larger than 1. The default minimum value is 0.

Value

`coefficients`	the named vector of estimated coefficients.
`disp`	the maximum likelihood estimate of the dispersion parameter.
`logLik`	the maximum of the log likelihood.
`p`	the number of all parameters of the model.
`aic`	AIC value of the model.
`fitted.values`	fitted values
`residuals`	raw residuals
`y`	response
`X`	design matrix
`n`	number of observations (tips in the tree)
`d`	number of dependent variables
`formula`	the model formula
`call`	the original call to the function
`model`	the phylogenetic model for the covariance
`phy`	the phylogenetic tree
`lambda`	the ml estimate of the lambda parameter (for `model="lambda"`)

References

Bastide, P. and Didier, G. 2023. The Cauchy Process on Phylogenies: a Tractable Model for Pulsed Evolution. Systematic Biology. doi:10.1093/sysbio/syad053.

Rothenberg T. J., Fisher F. M., Tilanus C. B. 1964. A Note on Estimation from a Cauchy Sample. Journal of the American Statistical Association. 59:460–463.

Rousseeuw P.J., Croux C. 1993. Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association. 88:1273–1283.

Examples

# Simulate tree and data
set.seed(1289)
phy <- ape::rphylo(20, 0.1, 0)
error <- rTraitCauchy(n = 1, phy = phy, model = "cauchy",
                      parameters = list(root.value = 0, disp = 0.1))
x1 <- ape::rTraitCont(phy, model = "BM", sigma = 0.1, root.value = 0)
trait <- 3 + 2*x1 + error
# Fit the data
fit <- cauphylm(trait ~ x1, phy = phy)
fit
# Approximate confidence intervals
confint(fit)

[Package cauphy version 1.0.2 Index]