R: Evolutionary Trait Divergence Between Sister Species and...

diverge-package {diverge}

R Documentation

Evolutionary Trait Divergence Between Sister Species and Other Paired Lineages

Description

This package can be used to study the dynamics of continuous trait divergence in sister species and other paired lineages. The primary aim is to characterize the evolutionary processes responsible for continuous trait differentiation between lineages following their departure from a common ancestor. Alternative mechanisms of trait divergence such as divergent selection, drift, and stabilizing selection are expected to leave unique signatures in the distribution of trait differentiation in datasets comprised of many species pairs. The package uses maximum likelihood to estimate parameters and compare the fits of alternative process-based models to such distributions. Key functions for most users will be 'model_select' and 'find_mle'. See 'details' below for a description of the modelling framework, data requirements, and other considerations.

Details

Package:	diverge
Type:	Package
Version:	2.0.6
Date:	2022-12-14
License:	GPL (>= 2)

BASIC MODELS:
Alternative mechanisms of trait divergence are approximated here by stochastic process models. We use Brownian Motion (BM) models to approximate divergence under random drift or fluctuating selection; Ornstein-Uhlenbeck (OU) models in which two diverging lineages share an optima to approximate divergence under parallel and/or shared stabilizing selection; and OU models in which two diverging lineages are pulled toward alternative optima to approximate divergent adaptation (this last suite of models we refer to as "DA" models). In the DA models, the difference between optima of paired lineages (psi) measures the extent of divergent selection. A more detailed explanation of the model variants used in this package is provided in the model_select man page.

MIXTURE MODELS:
Of course, different pairs in any given dataset may be diverging under different processes. We thus include mixture models (NEW TO VERSION 2.0) in which pairs in a dataset can diverge under one of two processes, and the proportion of pairs diverging under a given processes is estimated as a model parameter. The two basic mixture models are DA-OU and OU-BM, and the proportion parameter returned is that for the more complex model (i.e. Proportion DA in a DA-OU model and Proportion OU in an OU-BM model). A DA-BM model is also possible and is encoded in these functions, but including this model in comparative tests leads to higher error rates in model selection, and the model behaviour is often biologically unrealistic, so we recommend against its usage (tests to be released in supp.mat of Anderson et al. 2021 in review).

MODEL EXTENSIONS:
(1) Effect of continuous variables on model parameters.
Rates of evolution, patterns of divergent selection, and proportions of pairs diverging under a given process (in mixture models) may vary across continuous biological or ecological gradients such as latitude, elevation, and body mass. We follow the approach used in the EvoRAG package (Weir 2014), whereby model parameters are represented as linear functions of the continuous gradient. Model fitting requires, as input, the gradient value for each pair.
(2) Effect of categorical variables on model parameters.
Just as patterns of divergence can vary across continuous ecological gradients, they can also vary across categorical variables such as relative geographic range (i.e. allopatric versus sympatric), and different pollinator types. We model this variation by allowing parameters to vary among the user-defined categories. Tests of categorical models require as input a vector containing the category of each pair (coded as 0, 1, 2, for a maximum of 3 categories).
(3) Discrete temporal change in extent of divergent selection.
In models of divergent selection (DA models), lineages in a pair are pulled toward alternative adaptive optima following their initial divergence from an ancestor. Often one or both of these lineages will experience a secondary shift in their adaptive environment which changes the position of their optima. This manifests in the DA models as a discrete shift in psi, referred to as an "epoch shift". WAIT-TIME models (DA_wt, DA_wt_linear) estimate the timing of this shift as a wait time shared by all pairs in the dataset. BREAKPOINT models (DA_bp, DA_bp_ilnear) require users to provide a set of breakpoint values, which are the times after divergence at which a shift in psi is hypothesized to occur. These will be unique to pairs of different ages. Pairs in which no shift is expected to have occurred are assigned breakpoint value of zero. See the model_select description of DA_bp for more details.

DATA REQUIREMENTS:
diverge is designed to work with datasets of lineage-pair contrasts in a continuous trait. At minimum, datasets must contain two components for each lineage pair: (1) a measure of the difference between the two lineages in a trait, and (2) an estimate of time since the two lineages initially diverged. Extended versions of models require additional info on the gradient value or categorical value for each pair. Power and model performance increase with the number of lineage pairs in a dataset. Several functions in this package use the same arguments for required data. These include:

div - a vector of trait differences for each pair in the dataset. Required for all evolutionary models. These are calculated for each pair as abs(trait_val_lineage_2 - trait_val_lineage1). Raw values (i.e. not absolute values) can also be used but this must be noted by the user in the absolute argument of most functions.

ages - a vector containing the age (i.e. estimated time since divergence) for each pair in the dataset.

me1 - measurement error (standard error of mean) in species 1s of each pair (variance/n-measurements)

me1 - measurement error (standard error of mean) in species 2s of each pair (variance/n-measurements)

GRAD - a vector containing the gradient position of each pair. This is the value of a continuous variable such as latitude or body size across which parameters are hypothesized to vary. Required for all models with the 'linear' suffix.

cats - a vector containing the category code (0, 1, or 2) for each pair in the dataset. Required when using the DA-cat model only.

breakpoint - a vector of breakpoint times for each pair in the dataset. These are the times after divergence at which a shift occurs in the psi parameter of a DA model. Required for DA_bp and DA_bp_linear. See model_select for details on how to calculate.

IMPORTANT NOTE:
div, ages, ms1, ms2, GRAD, cats, and breakpoint vectors must be aligned such that div[i], age[i], grad[i], cat[i], and breakpoint[i] represent values for the same i'th pair.

PARALLELIZATION:
Many functions in 'diverge' can be run in parallel across multiple cores through the 'parallel' logical argument. This functionality uses forking and is therefore unavailable on windows machines. All examples shown in function manuals are run in serial. We encourage users to insert parallel=TRUE in these commands where resources allow.

CUSTOMIZATION:
The code for several key functions has been lightly commented to aid users wishing to hack functions for custom use.

Author(s)

Sean A. S. Anderson and Jason T. Weir

Maintainer: Sean A.S. Anderson <sean.as.anderson@gmail.com>

[Package diverge version 2.0.6 Index]