simulate_deterministic_hbd {castor} R Documentation

## Simulate a deterministic homogenous birth-death model.

### Description

Given a homogenous birth-death (HBD) model, i.e., with speciation rate λ, extinction rate μ and sampling fraction ρ, calculate various deterministic features of the model backwards in time, such as the total diversity over time. The speciation and extinction rates may be time-dependent. “Homogenous” refers to the assumption that, at any given moment in time, all lineages exhibit the same speciation/extinction rates (in the literature this is sometimes referred to simply as “birth-death model”; Morlon et al. 2011). “Deterministic” refers to the fact that all calculated properties are completely determined by the model's parameters (i.e. non-random), as if an infinitely large tree was generated (aka. “continuum limit”).

Alternatively to λ, one may provide the pulled diversification rate (PDR; Louca et al. 2018) and the speciation rate at some fixed age, λ(τ_o). Similarly, alternatively to μ, one may provide the ratio of extinction over speciation rate, μ/λ. In either case, the time-profiles of λ, μ, μ/λ or the PDR are specified as piecewise polynomial functions (splines), defined on a discrete grid of ages.

### Usage

simulate_deterministic_hbd(LTT0,
oldest_age,
age0           = 0,
rho0           = 1,
age_grid       = NULL,
lambda         = NULL,
mu             = NULL,
mu_over_lambda = NULL,
PDR            = NULL,
lambda0        = NULL,
splines_degree = 1,
relative_dt    = 1e-3,
allow_unreal   = FALSE)


### Arguments

 LTT0 The assumed number of sampled extant lineages at age0, defining the necessary initial condition for the simulation. If the HBD model is supposed to describe a specific timetree, then LTT0 should correspond to the number of lineages in the tree ("lineages through time") at age age0. oldest_age Strictly positive numeric, specifying the oldest time before present (“age”) to include in the simulation. age0 Non-negative numeric, specifying the age at which LTT0, lambda0 and rho are given. Typically this will be 0, i.e., corresponding to the present. rho0 Numeric between 0 (exclusive) and 1 (inclusive), specifying the sampling fraction of the tree at age0, i.e. the fraction of lineages extant at age0 that are included in the tree (aka. "rarefaction"). Note that if rho0<1, lineages extant at age0 are assumed to have been sampled randomly at equal probabilities. Can also be NULL, in which case rho0=1 is assumed. age_grid Numeric vector, listing discrete ages (time before present) on which either λ and μ, or the PDR and μ, are specified. Listed ages must be strictly increasing, and must cover at least the full considered age interval (from age0 to oldest_age). Can also be NULL or a vector of size 1, in which case the speciation rate, extinction rate and PDR are assumed to be time-independent. lambda Numeric vector, of the same size as age_grid (or size 1 if age_grid==NULL), listing speciation rates (λ, in units 1/time) at the ages listed in age_grid. Speciation rates should be non-negative, and are assumed to vary polynomially between grid points (see argument splines_degree). If NULL, then PDR and lambda0 must be provided. mu Numeric vector, of the same size as age_grid (or size 1 if age_grid==NULL), listing extinction rates (μ, in units 1/time) at the ages listed in age_grid. Extinction rates should be non-negative, and are assumed to vary polynomially between grid points (see argument splines_degree). Either mu or mu_over_lambda must be provided, but not both. mu_over_lambda Numeric vector, of the same size as age_grid (or size 1 if age_grid==NULL), listing the ratio of extinction rates over speciation rates (μ/λ) at the ages listed in age_grid. These ratios should be non-negative, and are assumed to vary polynomially between grid points (see argument splines_degree). Either mu or mu_over_lambda must be provided, but not both. PDR Numeric vector, of the same size as age_grid (or size 1 if age_grid==NULL), listing pulled diversification rates (in units 1/time) at the ages listed in age_grid. PDRs can be negative or positive, and are assumed to vary polynomially between grid points (see argument splines_degree). If NULL, then lambda must be provided. lambda0 Non-negative numeric, specifying the speciation rate (in units 1/time) at age0. Either lambda0 or lambda must be provided, but not both. splines_degree Integer, either 0,1,2 or 3, specifying the polynomial degree of the provided lambda, mu and PDR between grid points in age_grid. For example, if splines_degree==1, then the provided lambda, mu and PDR are interpreted as piecewise-linear curves; if splines_degree==2 they are interpreted as quadratic splines; if splines_degree==3 they are interpreted as cubic splines. The splines_degree influences the analytical properties of the curve, e.g. splines_degree==1 guarantees a continuous curve, splines_degree==2 guarantees a continuous curve and continuous derivative, and so on. relative_dt Strictly positive numeric (unitless), specifying the maximum relative time step allowed for integration over time. Smaller values increase integration accuracy but increase computation time. Typical values are 0.0001-0.001. The default is usually sufficient. allow_unreal Logical, specifying whether HBD models with unrealistic parameters (e.g., negative μ) should be supported. This may be desired for example when examining model congruence classes with negative μ.

### Details

This function supports the following alternative parameterizations of HBD models:

• Using the speciation rate λ and extinction rate μ.

• Using the speciation rate λ and the ratio μ/λ.

• Using the pulled diversification rate (PDR), the extinction rate and the speciation rate given at some fixed age0 (i.e. lambda0).

• Using the PDR, the ratio μ/λ and the speciation rate at some fixed age0.

The PDR is defined as PDR = λ-μ+λ^{-1}dλ/dτ, where τ is age (time before present). To avoid ambiguities, only one of the above parameterizations is accepted at a time. The sampling fraction at age0 (i.e., rho0) should always be provided; setting it to NULL is equivalent to setting it to 1.

Note that in the literature the sampling fraction usually refers to the fraction of lineages extant at present-day that have been sampled (included in the tree); this present-day sampling fraction is then used to parameterize birth-death cladogenic models. The sampling fraction can however be generalized to past times, by defining it as the probability that a lineage extant at any given time is included in the tree. The simulation function presented here allows for specifying this generalized sampling fraction at any age of choice, not just present-day.

The simulated LTT refers to a hypothetical tree sampled at age age_grid[1], i.e. LTT(t) will be the number of lineages extant at age t that survived until age age_grid[1] and have been sampled, given that the fraction of sampled extant lineages at age0 is rho0. Similarly, the returned Pextinct(t) (see below) is the probability that a lineage extant at age t would not survive until age_grid[1]. The same convention is used for the returned variables Pmissing, shadow_diversity, PER, PSR, SER and PND.

### Value

A named list with the following elements:

 success Logical, indicating whether the calculation was successful. If FALSE, then the returned list includes an additional 'error' element (character) providing a description of the error; all other return variables may be undefined. ages Numerical vector of size NG, listing discrete ages (time before present) on which all returned time-curves are specified. Listed ages will be in ascending order, will cover exactly the range age_grid[1] - oldest_age, may be irregularly spaced, and may be finer than the original provided age_grid. Note that ages[1] corresponds to the latest time point (closer to the tips), while ages[NG] corresponds to the oldest time point (oldest_age). total_diversity Numerical vector of size NG, listing the predicted (deterministic) total diversity (number of extant species, denoted N) at the ages given in ages[]. shadow_diversity Numerical vector of size NG, listing the predicted (deterministic) “shadow diversity” at the ages given in ages[]. The shadow diversity is defined as N_s=N\cdot ρ(τ_o)λ(τ_o)/λ, where τ_o is age0. Pmissing Numeric vector of size NG, listing the probability that a lineage, extant at a given age, will be absent from the tree either due to extinction or due to incomplete sampling. Pextinct Numeric vector of size NG, listing the probability that a lineage, extant at a given age, will be fully extinct at present. Note that always Pextinct<=Pmissing. LTT Numeric vector of size NG, listing the number of lineages represented in the tree at any given age, also known as “lineages-through-time” (LTT) curve. Note that LTT at age0 will be equal to LTT, and that values in LTT will be non-increasing with age. lambda Numeric vector of size NG, listing the speciation rate (in units 1/time) at the ages given in ages[]. mu Numeric vector of size NG, listing the extinction rate (in units 1/time) at the ages given in ages[]. diversification_rate Numeric vector of size NG, listing the net diversification rate (λ-μ) at the ages given in ages[]. PDR Numeric vector of size NG, listing the pulled diversification rate (PDR, in units 1/time) at the ages given in ages[]. PND Numeric vector of size NG, listing the pulled normalized diversity (PND, in units 1/time) at the ages given in ages[]. The PND is defined as PND=(N/N(τ_o))\cdotλ(τ_o)/λ. SER Numeric vector of size NG, listing the “shadow extinction rate” (SER, in units 1/time) at the ages given in ages[]. The SER is defined as SER=ρ(τ_o)λ(τ_o)-PDR. PER Numeric vector of size NG, listing the “pulled extinction rate” (PER, in units 1/time) at the ages given in ages[]. The PER is defined as SER=λ(τ_o)-PDR (Louca et al. 2018). PSR Numeric vector of size NG, listing the “pulled speciation rate” (PSR, in units 1/time) at the ages given in ages[]. The PSR is defined as PSR=λ\cdot(1-Pmissing). rholambda0 Non-negative numeric, specifying the product of the sampling fraction and the speciation rate at age0, ρ\cdotλ(τ_o).

Stilianos Louca

### References

H. Morlon, T. L. Parsons, J. B. Plotkin (2011). Reconciling molecular phylogenies with the fossil record. Proceedings of the National Academy of Sciences. 108:16327-16332.

S. Louca et al. (2018). Bacterial diversification through geological time. Nature Ecology & Evolution. 2:1458-1467.

loglikelihood_hbd

### Examples

# define an HBD model with exponentially decreasing speciation/extinction rates
Ntips     = 1000
beta      = 0.01 # exponential decay rate of lambda over time
oldest_age= 10
age_grid  = seq(from=0,to=oldest_age,by=0.1) # choose a sufficiently fine age grid
lambda    = 1*exp(beta*age_grid) # define lambda on the age grid
mu        = 0.2*lambda # assume similarly shaped but smaller mu

# simulate deterministic HBD model
simulation = simulate_deterministic_hbd(LTT0       = Ntips,
oldest_age = oldest_age,
rho0       = 0.5,
age_grid   = age_grid,
lambda     = lambda,
mu         = mu)

# plot deterministic LTT
plot( x = simulation$ages, y = simulation$LTT, type='l',
main='dLTT', xlab='age', ylab='lineages')


[Package castor version 1.7.0 Index]