reconstruct_past_diversification {castor} | R Documentation |

Given a time series of past diversities (coalescent or not), this function estimates instantaneous birth (speciation) and death (extinction) rates that would lead to the observed diversity time series. The function is based on a deterministic model (or the continuum limit of a stochastic cladogenic model), in which instantaneous birth and death rates lead to a predictable growth of a tree (one new species per birth event). The reconstruction is non-parametric, i.e. does not rely on fitting a parameterized model. The reconstruction is only accurate in the deterministic limit, i.e. for high diversities where the stochastic nature of the cladogenic process diminishes. Of particular importance is the case where the time series is coalescent, i.e. represents the diversity (lineages-through-time) that would be represented in a coalescent tree with extinctions.

Note: This function is included for legacy reasons mainly. In most cases users should instead use the functions `fit_hbd_model_on_grid`

and `fit_hbd_model_parametric`

to fit birth-death models, or the functions `fit_hbd_pdr_on_grid`

, `fit_hbd_pdr_parametric`

and `fit_hbd_psr_on_grid`

to fit BD model congruence classes (aka. “pulled variables”) to a tree.

reconstruct_past_diversification( times, diversities, birth_rates_pc = NULL, rarefaction = NULL, discovery_fractions = NULL, discovery_fraction_slopes = NULL, max_age = NULL, coalescent = FALSE, smoothing_span = 0, smoothing_order = 1)

`times` |
Numeric vector, listing the times at which diversities are given. Values must be in ascending order. |

`diversities` |
Numeric vector of the same size as |

`birth_rates_pc` |
Numeric vector of the same size as |

`rarefaction` |
Numeric between 0 and 1. Optional rarefaction fraction assumed for the diversities at the very end. Set to 1 to assume no rarefaction was performed. |

`discovery_fractions` |
Numeric array of size Ntimes, listing the fractions of extant lineages represented in the tree over time. Hence, |

`discovery_fraction_slopes` |
Numeric array of size Ntimes, listing the 1st derivative of |

`max_age` |
Numeric. Optional maximum distance from the end time to be considered. If |

`coalescent` |
Logical, indicating whether the provided diversities are from a coalescent tree (only including clades with extant representatives) or total diversities (extant species at each time point). |

`smoothing_span` |
Non-negative integer. Optional sliding window size (number of time points) for smoothening the diversities time series via Savitzky-Golay-filter. If <=2, no smoothing is done. Smoothening the time series can reduce the effects of noise on the reconstructed diversity dynamics. |

`smoothing_order` |
Integer between 1 and 4. Polynomial order of the Savitzky-Golay smoothing filter to be applied. Only relevant if |

This function can be used to fit a birth-death model to a coalescent diversity time series *N_c(τ)* at various ages *τ*, also known as “lineages-through-time” curve. The reconstruction of the total diversity *N(τ)* is based on the following formulas:

*
E(τ)=1+\frac{ν(τ)}{β(τ)},\\
*

*
N(τ)=\frac{N_c}{1-E(τ)},
*

*
ν(τ)=\frac{1}{N_c(τ)}\frac{dN_c(τ)}{dτ}
*

where *E(τ)* is the probability that a clade of size 1 at age *τ* went extinct by the end of the time series and *β* is the per-capita birth rate. If the per-capita birth rate is not explicitly provided for each time point (see argument `birth_rate_pc`

), the function assumes that the per-capita birth rate (speciation rate) is constant at all times. If `birth_rates_pc==NULL`

and `coalescent==TRUE`

, the constant speciation rate is estimated as

*
β = -\frac{ν(0)}{ρ},
*

where *ρ* is the fraction of species kept after rarefaction (see argument `rarefaction`

).

Assuming a constant speciation rate may or may not result in accurate estimates of past total diversities and other quantities. If a time-varying speciation rate is suspected but not known, additional information on past diversification dynamics may be obtained using modified (“pulled”) quantities that partly resemble the classical extinction rate, diversification rate and total diversity. Such quantities are the “pulled diversification rate”:

*
η(τ) = δ(τ) - β(τ) + \frac{1}{β(τ)}\frac{dβ}{dτ},
*

the “pulled extinction rate”:

*
δ_p(τ) = δ(τ) + (β_o-β(τ)) - \frac{1}{β(τ)}\frac{dβ}{dτ},
*

and the “pulled total diversity”:

*
N_p(τ) = N(τ)\cdot\frac{β_o}{β(τ)},
*

where *β_o* is the provided or estimated (if not provided) speciation rate at the last time point. The advantage of these quantities is that they can be estimated from the coalescent diversities (lineages-through-time) without any assumptions on how *β* and *δ* varied over time. The disadvantage is that they differ from their “non-pulled” quantities (*β-δ*, *δ* and *N*), in cases where *β* varied over time.

A named list with the following elements:

`success` |
Logical, specifying whether the reconstruction was successful. If |

`Ntimes` |
Integer. Number of time points for which reconstruction is returned. |

`total_diversities` |
Numeric vector of the same size as |

`coalescent_diversities` |
Numeric vector of the same size as |

`birth_rates` |
Numeric vector of the same size as |

`death_rates` |
Numeric vector of the same size as |

`Psurvival` |
Numeric vector of the same size as |

`Pdiscovery` |
Numeric vector of the same size as |

`Prepresentation` |
Numeric vector of the same size as |

`total_births` |
Numeric, giving the estimated total number of birth events that occurred between times |

`total_deaths` |
Numeric, giving the estimated total number of death events that occurred between times |

`last_birth_rate_pc` |
The provided or estimated (if not provided) speciation rate at the last time point. This corresponds to the birth rate divided by the estimated true diversity (prior to rarefaction) at the last time point. |

`last_death_rate_pc` |
The estimated extinction rate at the last time point. This corresponds to the death rate divided by the estimated true diversity (prior to rarefaction) at the last time point. |

`pulled_diversification_rates` |
Numeric vector of the same size as |

`pulled_extinction_rates` |
Numeric vector of the same size as |

`pulled_total_diversities` |
Numeric vector of the same size as |

Stilianos Louca

Louca et al (2018). Bacterial diversification through geological time. Nature Ecology & Evolution. 2:1458-1467.

`generate_random_tree`

,
`fit_tree_model`

,
`count_lineages_through_time`

,
`fit_hbd_model_parametric`

,
`fit_hbd_model_on_grid`

##################################################### # EXAMPLE 1 # Generate a coalescent tree params = list(birth_rate_intercept = 0, birth_rate_factor = 1, birth_rate_exponent = 1, death_rate_intercept = 0, death_rate_factor = 0.05, death_rate_exponent = 1.3, rarefaction = 1) simulation = generate_random_tree(params,max_time_eq=1,coalescent=TRUE) tree = simulation$tree time_span = simulation$final_time - simulation$root_time cat(sprintf("Generated tree has %d tips, spans %g time units\n",length(tree$tip.label),time_span)) # Calculate diversity time series from the tree counter = count_lineages_through_time(tree, times=seq(0,0.99*time_span,length.out=100)) # print coalescent diversities print(counter$lineages) # reconstruct diversification dynamics based on diversity time series results = reconstruct_past_diversification( counter$times, counter$lineages, coalescent = TRUE, smoothing_span = 3, smoothing_order = 1) # print reconstructed total diversities print(results$total_diversities) # plot coalescent and reconstructed true diversities matplot(x = counter$times, y = matrix(c(counter$lineages,results$total_diversities), ncol=2, byrow=FALSE), type = "b", xlab = "time", ylab = "# clades", lty = c(1,2), pch = c(1,0), col = c("red","blue")) legend( "topleft", legend = c("coalescent (simulated)","true (reconstructed)"), col = c("red","blue"), lty = c(1,2), pch = c(1,0)); ##################################################### # EXAMPLE 2 # Generate a non-coalescent tree params = list(birth_rate_intercept = 0, birth_rate_factor = 1, birth_rate_exponent = 1, death_rate_intercept = 0, death_rate_factor = 0.05, death_rate_exponent = 1.3, rarefaction = 1) simulation = generate_random_tree(params,max_time_eq=1,coalescent=FALSE) tree = simulation$tree time_span = simulation$final_time - simulation$root_time cat(sprintf("Generated tree has %d tips, spans %g time units\n",length(tree$tip.label),time_span)) # Calculate diversity time series from the tree counter = count_lineages_through_time(tree, times=seq(0,0.99*time_span,length.out=100)) # print true diversities print(counter$lineages) # reconstruct diversification dynamics based on diversity time series results = reconstruct_past_diversification( counter$times, counter$lineages, birth_rates_pc = params$birth_rate_factor, coalescent = FALSE, smoothing_span = 3, smoothing_order = 1) # print coalescent diversities print(results$coalescent_diversities) # plot coalescent and reconstructed true diversities matplot(x = counter$times, y = matrix(c(results$coalescent_diversities,counter$lineages), ncol=2, byrow=FALSE), type = "b", xlab = "time", ylab = "# clades", lty = c(1,2), pch = c(1,0), col = c("red","blue")) legend( "topleft", legend = c("coalescent (reconstructed)","true (simulated)"), col = c("red","blue"), lty = c(1,2), pch = c(1,0));

[Package *castor* version 1.6.8 Index]