medusa {geiger} | R Documentation |
MEDUSA: modeling evolutionary diversification using stepwise AIC
Description
Fits piecewise birth-death models to ultrametric phylogenetic tree(s) according to phylogenetic (edge-length) and taxonomic (richness) likelihoods. Optimal model size is determined via a stepwise AIC approach.
Usage
medusa(phy, richness = NULL, criterion = c("aicc", "aic"),
partitions=NA, threshold=NA, model = c("mixed", "bd", "yule"),
cut = c("both","stem","node"), stepBack = TRUE,
init = c(r=0.05, epsilon=0.5), ncores = NULL, verbose = FALSE, ...)
Arguments
phy |
an ultrametric phylogenetic tree of class 'phylo' |
richness |
an optional matrix of taxonomic richnesses (see Details) |
criterion |
an information criterion to use as stopping criterion |
partitions |
the maximum number of models to be considered |
threshold |
the improvement in AIC score that should be considered significant (see Details) |
model |
the flavor of piecewise models to consider (see Details) |
cut |
determines where shifts are placed on the tree (see Details) |
stepBack |
determines whether parameter/model removal is considered. default = TRUE. (see Details) |
init |
initial conditions for net diversification and relative extinction |
ncores |
the number of processor cores to using in parallel processing. default = all |
verbose |
print out extra information. default = FALSE |
... |
additional arguments to be passed to |
Details
The MEDUSA model fits increasingly complex diversification models to a dataset including richness
information for sampled tips in phy
. The tree must have branch lengths proportional to time. The richness
object is optional, but must be given if the tree is not completely sampled. MEDUSA assumes that the entire extant diversity in the group is sampled either in phy
or given by information contained within the richness
object. The richness
object associates species richness with lineages sampled in the tree. For instance, if a genus containing a total of 10 species is exemplied in the tree by a single tip, the total diversity of the clade must be recorded in the richness
object (see Examples). All taxa missing from the tree have to be assigned to one of the tips in the richness
matrix. If the richness
object is NULL
, the tree is assumed to be completely sampled.
The algorithm first fits a single diversification model to the entire dataset. A series of single breakpoints in the diversification process is then added, so that different parts of the tree evolve with different parameter values (per-lineage net diversification–r
and relative extinction rates–epsilon
). Initial values for these diversification parameters are given through the init
argument and may need to be tailored for particular datasets. The algorithm compares all single-breakpoint models to the initial model, and retains the best breakpoint. Then all possible two-breakpoint models are compared with the best single-breakpoint model, and so on. Breakpoints may be considered at a "node"
, a "stem"
branch, or both (as dictated by the cut
argument). Birth-death or pure-birth (Yule) processes (or a combination of these processes) may be considered by the MEDUSA algorithm. The model flavor is determined through the model
argument.
Two stopping criteria are available for the MEDUSA algorithm. The user can either limit the number of piecewise models explored by MEDUSA or this number may be determined based on model fits. A maximum number of model partitions to be explored may be given a priori (if given a non-zero number through the partitions
argument) or an information criterion
is used to choose sufficient model complexity. If the latter stopping criterion is used, one needs to specify whether to use Akaike information criterion ("aic"
) or sample-size corrected AIC ("aicc"
); the latter is recommended, and is the default setting. An appropriate threshold in AICc differences between different MEDUSA models has been shown to be dependent on tree size. The threshold used for model selection is computed internally (and is based on extensive simulation study); this value is reported to the user. The user may choose to specify an alternative AIC-threshold with the ("threshold"
) argument, making the algorithm more (or less) strict in scrutinizing model improvement.
The user will almost certainly want to summarize the object returned from MEDUSA with the function TBA.
Value
A list object is returned including fits for all model complexities as well as summary information:
control |
is a list object specifying the stopping criterion, the information criterion and threshold used (if appropriate), or the number of partitions explored |
cache |
is a list object primarily used internally (including the tree and richness information) |
models |
is a list object containing each optimized piecewise model, primarily for internal use |
summary |
is a dataframe containing breakpoints and fit values for optimal models at each model complexity. If |
FUN |
is a function used to summarize a particular model (indexed by number) and is primarily for internal use |
Author(s)
JW Brown <phylo.jwb@gmail.com>, RG FitzJohn, ME Alfaro, LJ Harmon, and JM Eastman
References
Alfaro, ME, F Santini, C Brock, H Alamillo, A Dornburg, DL Rabosky, G Carnevale, and LJ Harmon. 2009. Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates. Proceedings of the National Academy of Sciences 106: 13410-13414.
See Also
Examples
dat=get(data(whales))
phy=dat$phy
richness=dat$richness
## USING AICc as STOPPING CRITERION
res1=medusa(phy, richness, warnings=FALSE)
print(names(res1)) # output list elements
print(res1$summary) # show 'summary' object
summary(res1, criterion="aicc") # select best model based on AICc
## PLOTTING RESULTS
# plot breakpoints for the best model chosen by AICc
# invoking plot.medusa()
plot(res1, cex=0.5,label.offset=1, edge.width=2)