fit_t_env_ou {RPANDA} | R Documentation |
Maximum likelihood fit of the OU environmental model of trait evolution
Description
Fits Ornstein-Uhlenbeck (OU) model of trait evolution for which the optimum depends on an environmental function, or more generally a time varying function.
Usage
fit_t_env_ou(phylo, data, env_data, error=NULL, model,
method="Nelder-Mead", control=list(maxit=20000), ...)
Arguments
phylo |
An object of class 'phylo' (see ape documentation) |
data |
A named vector of phenotypic trait values. |
env_data |
Environmental data, given as a time continuous function (see, e.g. splinefun) or a data frame with two columns. The first column is time, the second column is the environmental data (temperature for instance). |
error |
A named vector with standard errors (SE) of trait values for each species (with names matching |
model |
A user defined model. If not provided, a default model is used (see details) |
method |
Methods used by the optimization routine (see ?optim for details). |
control |
Max. bound for the number of iteration of the optimizer; other options can be fixed on the list (see ?optim). |
... |
Arguments to be passed to the function. See details. |
Details
fit_t_env_ou
allows fitting OU-environmental models of trait evolution (Troyer et al. 2020, Goswami & Clavel 2024). Compared to model implemented in fit_t_env
where the rate of phenotypic evolution evolves as a function of an environmental variable (Clavel & Morlon 2020), here it's the optimum of a generalized Ornstein-Uhlenbeck (also called Hull-White model) that can changes as a function of an environmental variable T(t). More formally, the model is defined by the following process:
dX(t) = \alpha (\theta(t) -X(t))dt + \sigma dB(t)
Note that this model works only on NON-ULTRAMETRIC trees (e.g., with fossils)
The default model has the optimum changing as a function of environmental changes though times as defined below:
\theta (t) = \theta_0 + \beta T(t)
Users defined models should have the following form (see also examples below):
fun <- function(t, env, param, theta0){ theta0 + param*env(t)}
t: is the time parameter.
env: is a time function of an environmental variable.
See for instance object created by splinefun
when interpolating coordinate of points.
param: is a vector of parameters to estimate.
theta_0: is the state at the root of the tree.
For instance, the default model function can be coded as:
fun <- function(t, env, param, theta0){ theta0 + param[1]*env(t)}
where param[1]
is the \beta
parameter.
Note that in this case, one starting value should be provided in the param
argument.
e.g.:
beta=0
fit_t_env(tree, data, env_data=InfTemp, model=fun, param=beta)
The various options are passed through "...".
-param: The starting values used for the model. Must match the total number of parameters of the specified models. If "error=NA", a starting value for the SE to be estimated must be provided with user-defined models.
-scale: scale the amplitude of the environmental curve between 0 and 1. This may improve the parameters search in some situations.
-df: the degree of freedom to use for defining the spline. As a default, smooth.spline(env_data[,1], env_data[,2])$df is used. See sm.spline for details.
-upper: the upper bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-lower: the lower bound for the parameter search when the "L-BFGS-B" method is used. See optim for details.
-maxdiff: difference in time between tips and present day for phylogenetic trees with no contemporaneous species (default is 0)
Value
a list with the following components
LH |
the maximum log-likelihood value |
aic |
the Akaike's Information Criterion |
aicc |
the second order Akaike’s Information Criterion |
free.parameters |
the number of estimated parameters |
param |
a numeric vector of estimated parameters, sigma and beta respectively for the defaults models. In the same order as defined by the user if a custom model is provided |
root |
the estimated root value |
convergence |
convergence status of the optimizing function; "0" indicates convergence (See ?optim for details) |
hess.value |
reliability of the likelihood estimates calculated through the eigen-decomposition of the hessian matrix. "0" means that a reliable estimate has been reached |
env_func |
the environmental function |
tot_time |
the root age of the tree |
model |
the fitted model (default models or user specified) |
nuisance |
the estimated SE for species mean when "error=NA" |
Note
The users defined function is evaluated forward in time i.e.: from the root to the tips (time = 0 at the (present) tips). The speed of convergence of the fit might depend on the degree of freedom chosen to define the spline.
Author(s)
J. Clavel
References
Clavel, J. & Morlon, H., 2017. Accelerated body size evolution during cold climatic periods in the Cenozoic. Proceedings of the National Academy of Science, 114(16): 4183-4188.
Troyer, E., Betancur-R, R., Hughes, L., Westneat, M., Carnevale, G., White W.T., Pogonoski, J.J., Tyler, J.C., Baldwin, C.C., Orti, G., Brinkworth, A., Clavel, J., Arcila, D., 2022 - The impact of paleoclimatic changes on body size evolution in marine fishes. Proceedings of the National Academy of Sciences, 119 (29), e2122486119.
Goswami, A. & Clavel, J., 2024. Morphological evolution in a time of Phenomics. EcoEvoRxiv, https://doi.org/10.32942/X22G7Q
See Also
plot.fit_t.env.ou
,sim_t_env_ou
Examples
data(InfTemp)
# Simulate a trait with temperature dependence of the optimum on a simulated tree
set.seed(9999) # for reproducibility
# Let's start by simulating a trait under a climatic OU
beta = 0.6 # relationship to the climate curve
sim_theta = 4 # value of the optimum if the relationship to the climate curve is 0
sim_sigma2 = 0.025 # variance of the scatter = sigma^2
sim_alpha = 0.36 # alpha value = strength of the OU; quite high here...
delta = 0.001 # time step used for the forward simulations => here its 1000y steps
tree <- phytools::pbtree(n=200, d=0.3) # simulate a bd tree with some extinct lineages
root_age = 60 # height of the root (almost all the Cenozoic here)
tree$edge.length <- root_age*tree$edge.length/max(phytools::nodeHeights(tree))
# here - for this contrived example - I scale the tree so that the root is at 60 Ma
trait <- sim_t_env_ou(tree, sigma=sqrt(sim_sigma2), alpha=sim_alpha, theta0=sim_theta,
param=beta, env_data=InfTemp, step=0.01, scale=TRUE, plot=TRUE)
## Fit the Environmental model (default)
result1 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp,
method = "Nelder-Mead", df=50, scale=TRUE)
plot(result1)
## Fit user defined model (note that several other environmental variables
## can be simultaneously encapsulated in this function through the env argument)
# We re-define the function for the OU model with linear trend to the climatic curve
# NOTE: the env(t) function should return the value at the root for t=0
my_fun<-function(t, env, param, theta0){
theta0 + param[1]*env(t)
}
# starting value for param[1]. Here we use an arbitrary value of 0.1
beta_guess = 0.1
# fit the model
result2 <- fit_t_env_ou(phylo = tree, data = trait, env_data =InfTemp,
model = my_fun, param = beta_guess,
method = "Nelder-Mead", df=50, scale=TRUE)
# Retrieve the parameters and compare to 'result1'
result2
lines(result2, col="red", lty=2)
## Fit user defined environmental function
require(pspline)
spline_result <- sm.spline(x=InfTemp[,1],y=InfTemp[,2], df=50)
env_func <- function(t){predict(spline_result,t)}
t<-unique(InfTemp[,1])
# We build the interpolated smoothing spline function (not scaled here)
env_data<-splinefun(t,env_func(t))
# We then fit the model
result3 <- fit_t_env_ou(phylo = tree, data = trait, env_data = env_data,
model = my_fun, param = 0.01, method = "Nelder-Mead")