geographic_acf {castor} | R Documentation |

Given a rooted phylogenetic tree and geographic coordinates (latitudes & longitudes) of each tip, calculate the phylogenetic autocorrelation function (ACF) of the geographic locations. The ACF is a function of phylogenetic distance x, i.e., ACF(x) is the autocorrelation between two tip locations conditioned on the tips having phylogenetic ("patristic") distance x.

geographic_acf( trees,
                tip_latitudes,
                tip_longitudes,
                Npairs            = 10000,
                Nbins             = NULL,
                min_phylodistance = 0,
                max_phylodistance = NULL,
                uniform_grid      = FALSE,
                phylodistance_grid = NULL)

`trees` |
Either a single rooted tree of class "phylo", or a list of multiple such trees. |

`tip_latitudes` |
Either a numeric vector of size Ntips (if |

`tip_longitudes` |
Similar to |

`Npairs` |
Maximum number of random tip pairs to draw from each tree. A greater number of tip pairs will improve the accuracy of the estimated ACF within each distance bin. Tip pairs are drawn randomly with replacement, if |

`Nbins` |
Number of phylogenetic distance bins to consider. A greater number of bins will increase the resolution of the ACF as a function of phylogenetic distance, but will decrease the number of tip pairs falling within each bin (which reduces the accuracy of the estimated ACF). If |

`min_phylodistance` |
Numeric, minimum phylogenetic distance to consider. Only relevant if |

`max_phylodistance` |
Numeric, optional maximum phylogenetic distance to consider. If |

`uniform_grid` |
Logical, specifying whether the phylodistance grid should be uniform, i.e., with equally sized phylodistance bins. If |

`phylodistance_grid` |
Numeric vector, optional explicitly specified phylodistance bins (left boundaries thereof) on which to evaluate the ACF. Must contain non-negative numbers in strictly ascending order. Hence, the first bin will range from |

The autocorrelation between random geographic locations is defined as the expectation of *<X,Y>*, where <> is the scalar product and *X* and *Y* are the unit vectors pointing towards the two random locations on the sphere. For comparison, for a spherical Brownian Motion model with constant diffusivity *D* and radius *r* the autocorrelation function is given by *ACF(t)=e^{-2Dt/r^2}* (see e.g. `simulate_sbm`). Note that this function assumes that Earth is a perfect sphere.
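The definition above can be illustrated with a minimal base-R sketch. It is not the package's internal code: the helper names `latlon_to_unit_vectors`, `pair_scalar_products` and `sbm_acf` are hypothetical, and the sketch assumes that latitudes and longitudes are given in decimal degrees.

```r
# Hypothetical helper: convert latitudes & longitudes (assumed to be in
# decimal degrees) into unit vectors on the sphere, one row per location.
latlon_to_unit_vectors = function(latitudes, longitudes){
    theta = latitudes * pi/180
    phi   = longitudes * pi/180
    cbind(x = cos(theta)*cos(phi), y = cos(theta)*sin(phi), z = sin(theta))
}

# Scalar product <X,Y> for each pair of locations.
# This equals the cosine of the central angle between the two locations.
pair_scalar_products = function(lat1, lon1, lat2, lon2){
    X = latlon_to_unit_vectors(lat1, lon1)
    Y = latlon_to_unit_vectors(lat2, lon2)
    rowSums(X * Y)
}

# ACF of the spherical Brownian Motion model, as quoted in the text:
# ACF(t) = exp(-2*D*t/r^2)
sbm_acf = function(t, D, r = 1) exp(-2*D*t/r^2)
```

Averaging `pair_scalar_products` over many tip pairs with similar phylogenetic distance gives an empirical estimate of the expectation of <X,Y>, which can then be compared against a curve such as `sbm_acf`.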

The phylogenetic autocorrelation function (ACF) of the geographic distribution of species can give insight into the dispersal processes shaping species distributions over global scales. An ACF that decays slowly with increasing phylogenetic distance indicates a strong phylogenetic conservatism of the location and thus slow dispersal, whereas a rapidly decaying ACF indicates weak phylogenetic conservatism and thus fast dispersal. Similarly, if the mean geographic distance between two random tips increases with phylogenetic distance, this indicates a phylogenetic autocorrelation of species locations. Here, phylogenetic distance between tips refers to their patristic distance, i.e. the minimum cumulative edge length required to connect the two tips.

Since the phylogenetic distances between all possible tip pairs do not cover a continuum (as there is only a finite number of tips), this function randomly draws tip pairs from the tree, maps them onto a finite set of phylodistance bins and then estimates the ACF for the centroid of each bin based on the tip pairs in that bin. In practice, as a next step one would usually plot the estimated ACF (returned vector `autocorrelations`) over the centroids of the phylodistance bins (returned vector `phylodistances`). When multiple trees are provided as input, the ACF is first calculated separately for each tree, and then averaged across trees (weighted by the number of tip pairs included from each tree in each bin).
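The binning and averaging steps described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: `binned_acf` and `average_across_trees` are hypothetical helpers, the bins are assumed to be contiguous and described by a vector `breaks` of Nbins+1 boundaries, and the phylogenetic distance and scalar product of each sampled tip pair are assumed to have been computed already.

```r
# Hypothetical helper: estimate a binned ACF from sampled tip pairs.
#   pair_distances : phylogenetic distance of each sampled tip pair
#   pair_products  : scalar product <X,Y> of each sampled tip pair
#   breaks         : Nbins+1 ascending bin boundaries (contiguous bins assumed)
binned_acf = function(pair_distances, pair_products, breaks){
    Nbins = length(breaks) - 1
    # assign each pair to a bin; pairs outside the grid are discarded (NA)
    bin = findInterval(pair_distances, breaks, rightmost.closed = TRUE)
    bin[(bin < 1) | (bin > Nbins)] = NA
    counts = tabulate(bin, nbins = Nbins)  # NAs are ignored by tabulate
    sums = vapply(seq_len(Nbins),
                  function(b) sum(pair_products[which(bin == b)]),
                  numeric(1))
    list(phylodistances      = 0.5*(breaks[-1] + breaks[-(Nbins+1)]),
         autocorrelations    = sums/counts,  # NaN for empty bins
         Npairs_per_distance = counts)
}

# Hypothetical helper: average per-tree ACFs, weighting each tree
# by the number of tip pairs it contributed to each bin.
average_across_trees = function(results){
    total_counts = Reduce(`+`, lapply(results, function(r) r$Npairs_per_distance))
    weighted_sum = Reduce(`+`, lapply(results, function(r)
        ifelse(r$Npairs_per_distance > 0,
               r$autocorrelations * r$Npairs_per_distance, 0)))
    weighted_sum / total_counts
}
```

A bin that receives no tip pairs from any tree yields `NaN` in this sketch, which makes the trade-off between `Nbins` and `Npairs` visible: more bins give finer resolution but fewer pairs per bin.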

Phylogenetic distance bins can be specified in two alternative ways: Either a set of bins (phylodistance grid) is automatically calculated based on the provided `Nbins`, `min_phylodistance`, `max_phylodistance` and `uniform_grid`, or a phylodistance grid is explicitly provided via `phylodistance_grid` and `max_phylodistance`.
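To make the two ways of specifying bins concrete, here is a small base-R sketch. Both helpers (`make_bins`, `uniform_left_boundaries`) are hypothetical and not part of the package. The sketch assumes that an explicit grid lists the left boundaries of contiguous bins, that the last bin ends at `max_phylodistance`, and that a bin's center is the midpoint of its boundaries. How the function builds a non-uniform grid automatically is not covered here.

```r
# Hypothetical helper: turn explicit left boundaries plus a maximum
# phylodistance into left/right boundaries and centers of contiguous bins.
make_bins = function(left, max_phylodistance){
    # left boundaries must be non-negative and strictly ascending
    stopifnot(all(left >= 0), all(diff(left) > 0),
              max_phylodistance > left[length(left)])
    right = c(left[-1], max_phylodistance)
    list(left = left, right = right, centers = 0.5*(left + right))
}

# Hypothetical helper: left boundaries of a uniform grid with Nbins
# equally sized bins between min_phylodistance and max_phylodistance.
uniform_left_boundaries = function(Nbins, min_phylodistance, max_phylodistance){
    seq(min_phylodistance, max_phylodistance, length.out = Nbins + 1)[seq_len(Nbins)]
}
```

For example, `make_bins(uniform_left_boundaries(4, 0, 2), 2)` describes four bins of width 0.5 covering phylodistances from 0 to 2.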

The trees may include multi-furcations (i.e. nodes with more than 2 children) as well as mono-furcations (i.e. nodes with only one child). If edge lengths are missing from the trees, then every edge is assumed to have length 1. The input trees must be rooted at some node for technical reasons (see function `root_at_node`), but the choice of the root node does not influence the result.

This function assumes that each tip is assigned exactly one geographic location. This might be problematic in situations where each tip covers multiple geographic locations, for example if tips are species and multiple individuals were sampled from each species. In that case, one might consider representing each individual as a separate tip in the tree, so that each tip has exactly one geographic location.

A list with the following elements:

`success` |
Logical, indicating whether the calculation was successful. If |

`phylodistances` |
Numeric vector of size Nbins, storing the center of each phylodistance bin in increasing order. This is equal to |

`left_phylodistances` |
Numeric vector of size Nbins, storing the left boundary of each phylodistance bin in increasing order. |

`right_phylodistances` |
Numeric vector of size Nbins, storing the right boundary of each phylodistance bin in increasing order. |

`autocorrelations` |
Numeric vector of size Nbins, storing the estimated geographic autocorrelation for each phylodistance bin. |

`std_autocorrelations` |
Numeric vector of size Nbins, storing the standard deviation of geographic autocorrelations encountered in each phylodistance bin. Note that this is not the standard error of the estimated ACF; it is a measure for how different the geographic locations are between tip pairs within each phylodistance bin. |

`mean_geodistances` |
Numeric vector of size Nbins, storing the mean geographic distance between tip pairs in each distance bin, in units of sphere radii. If you want geographic distances in km, you need to multiply these by Earth's mean radius in km (about 6371). If multiple input trees were provided, this is the average across all trees, weighted by the number of tip pairs included from each tree in each bin. |

`std_geodistances` |
Numeric vector of size Nbins, storing the standard deviation of geographic distances between tip pairs in each distance bin, in units of sphere radii. |

`Npairs_per_distance` |
Integer vector of size Nbins, storing the number of random tip pairs associated with each distance bin. |
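As a brief illustration of the units used by `mean_geodistances` and `std_geodistances`, the base-R sketch below converts distances from sphere radii to km. The numeric values are made up for demonstration, and `central_angle` is a hypothetical helper, not a package function. It relies on the fact that on a unit sphere the great-circle distance between two locations equals the angle whose cosine is the scalar product of their unit vectors.

```r
# Earth's mean radius in km, as quoted above
earth_radius_km = 6371

# Made-up example values, in units of sphere radii
mean_geodistances = c(0.05, 0.2, 0.8)

# Convert to km by multiplying with the sphere's radius
mean_geodistances_km = mean_geodistances * earth_radius_km

# Hypothetical helper: great-circle distance (in sphere radii) from the
# scalar product of two unit vectors; clamped to guard against rounding errors
central_angle = function(scalar_product){
    acos(pmin(1, pmax(-1, scalar_product)))
}
```

The same multiplication applies to `std_geodistances`, since a standard deviation scales linearly with the unit of length.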

Stilianos Louca

`get_trait_depth`, `get_trait_acf`

# generate a random tree
tree = generate_random_tree(list(birth_rate_intercept=1),max_tips=1000)$tree

# simulate spherical Brownian Motion on the tree
simul = simulate_sbm(tree, radius=1, diffusivity=0.1)
tip_latitudes = simul$tip_latitudes
tip_longitudes = simul$tip_longitudes

# calculate geographical autocorrelation function
ACF = geographic_acf(tree, tip_latitudes, tip_longitudes, Nbins=10)

# plot ACF (autocorrelation vs phylogenetic distance)
plot(ACF$phylodistances, ACF$autocorrelations, type="l", xlab="distance", ylab="ACF")

[Package *castor* version 1.6.8 Index]