fit_bm_model {castor} | R Documentation |

Given a rooted phylogenetic tree and states of one or more continuous (numeric) traits on the tree's tips, fit a multivariate Brownian motion model of correlated co-evolution of these traits. This estimates a single diffusivity matrix, which describes the variance-covariance structure of each trait's random walk. The model assumes a fixed diffusivity matrix on the entire tree. Optionally, multiple trees can be used as input, under the assumption that the trait evolved on each tree according to the same BM model.

fit_bm_model( trees, tip_states, isotropic = FALSE, Nbootstraps = 0, check_input = TRUE)

`trees` |
Either a single rooted tree or a list of rooted trees, of class "phylo". The root of each tree is assumed to be the unique node with no incoming edge. Edge lengths are assumed to represent time intervals or a similarly interpretable phylogenetic distance. |

`tip_states` |
Numeric state of each trait at each tip in each tree. If |

`isotropic` |
Logical, specifying whether diffusion should be assumed to be isotropic (i.e., independent of the direction). Hence, if |

`Nbootstraps` |
Integer, specifying the number of parametric bootstraps to perform for calculating the confidence intervals. If <=0, no bootstrapping is performed. |

`check_input` |
Logical, specifying whether to perform some basic checks on the validity of the input data. If you are certain that your input data are valid, you can set this to |

The BM model is defined by the stochastic differential equation

*
dX = σ \cdot dW
*

where *W* is a multidimensional Wiener process with Ndegrees independent components and *σ* is a matrix of size Ntraits x Ndegrees. Alternatively, the same model can be defined as a Fokker-Planck equation for the probability density *ρ*:

*
\frac{\partial ρ}{\partial t} = ∑_{i,j}D_{ij}\frac{\partial^2ρ}{\partial x_i\partial x_j}.
*

The matrix *D* is referred to as the diffusivity matrix (or diffusion tensor), and *2D=σ\cdotσ^T*. Note that *σ* can be obtained from *D* by means of a Cholesky decomposition.

The function uses phylogenetic independent contrasts (Felsenstein, 1985) to retrieve independent increments of the multivariate random walk. The diffusivity matrix *D* is then fitted using maximum-likelihood on the intrinsic geometry of positive-definite matrices. If multiple trees are provided as input, then independent contrasts are extracted from all trees and combined into a single set of independent contrasts (i.e., as if they had been extracted from a single tree).

If `tree$edge.length`

is missing, each edge in the tree is assumed to have length 1. The tree may include multifurcations (i.e. nodes with more than 2 children) as well as monofurcations (i.e. nodes with only one child). Note that multifurcations are internally expanded to bifurcations, prior to model fitting.

A list with the following elements:

`success` |
Logical, indicating whether the fitting was successful. If |

`diffusivity` |
Either a single non-negative number (if |

`loglikelihood` |
The log-likelihood of the fitted model, given the provided tip states data. |

`Ncontrasts` |
Integer, number of independent contrasts used to estimate the diffusivity. This corresponds to the number of independent data points used. |

`standard_errors` |
Either a single numeric or a 2D numeric matrix of size Ntraits x Ntraits, listing the estimated standard errors of the estimated diffusivity, based on parametric bootstrapping. Only returned if |

`CI50lower` |
Either a single numeric or a 2D numeric matrix of size Ntraits x Ntraits, listing the lower bounds of the 50% confidence interval for the estimated diffusivity (25-75% percentile), based on parametric bootstrapping. Only returned if |

`CI50upper` |
Either a single numeric or a 2D numeric matrix of size Ntraits x Ntraits, listing the upper bound of the 50% confidence interval for the estimated diffusivity, based on parametric bootstrapping. Only returned if |

`CI95lower` |
Either a single numeric or a 2D numeric matrix of size Ntraits x Ntraits, listing the lower bound of the 95% confidence interval for the estimated diffusivity (2.5-97.5% percentile), based on parametric bootstrapping. Only returned if |

`CI95upper` |
Either a single numeric or a 2D numeric matrix of size Ntraits x Ntraits, listing the upper bound of the 95% confidence interval for the estimated diffusivity, based on parametric bootstrapping. Only returned if |

`consistency` |
Numeric between 0 and 1, estimated consistency of the data with the fitted model. If |

Stilianos Louca

J. Felsenstein (1985). Phylogenies and the Comparative Method. The American Naturalist. 125:1-15.

A. Lindholm, D. Zachariah, P. Stoica, T. B. Schoen (2019). Data consistency approach to model validation. IEEE Access. 7:59788-59796.

`simulate_bm_model`

,
`get_independent_contrasts`

# generate a random tree tree = generate_random_tree(list(birth_rate_intercept=1), 10000)$tree # Example 1: Scalar case # - - - - - - - - - - - - - - # simulate scalar continuous trait on the tree D = 1 tip_states = simulate_bm_model(tree, diffusivity=D)$tip_states # estimate original diffusivity from the generated data fit = fit_bm_model(tree, tip_states) cat(sprintf("True D=%g, fitted D=%g\n",D,fit$diffusivity)) # Example 2: Multivariate case # - - - - - - - - - - - - - - - # simulate vector-valued continuous trait on the tree D = get_random_diffusivity_matrix(Ntraits=5) tip_states = simulate_bm_model(tree, diffusivity=D)$tip_states # estimate original diffusivity matrix from the generated data fit = fit_bm_model(tree, tip_states) # compare true and fitted diffusivity matrices cat("True D:\n"); print(D) cat("Fitted D:\n"); print(fit$diffusivity)

[Package *castor* version 1.6.8 Index]