multi.ace {dispRity} | R Documentation |
Ancestral states estimations with multiple trees
Description
Fast ancestral states estimations run on multiple trees using the Mk model from castor::asr_mk_model.
Usage
multi.ace(
data,
tree,
models = "ER",
threshold = TRUE,
special.tokens,
special.behaviours,
brlen.multiplier,
verbose = FALSE,
parallel = FALSE,
output,
castor.options,
estimation.details = NULL
)
Arguments
data |
A |
tree |
A |
models |
A |
threshold |
either |
special.tokens |
optional, a named |
special.behaviours |
optional, a |
brlen.multiplier |
optional, a vector of branch length modifiers (e.g. to convert time branch length in changes branch length) or a list of vectors (the same length as |
verbose |
|
parallel |
|
output |
optional, see Value section below. |
castor.options |
optional, a named list of options to be passed to function called by |
estimation.details |
optional, whether to also return the details for each estimation as returned by |
Details
The models
argument can be a single or a list of transition matrix
, a single or a a vector of built-in model(s) (see below) or a list of both matrices and built-in models:
The available built-in models in castor::asr_mk_model
are:
-
"ER"
for all equal rates -
"SYM"
for symmetric rates -
"ARD"
all rates are different -
"SUEDE"
equal stepwise transitions (e.g. for meristic/counting characters) -
"SRD"
different stepwise transitions
See directly castor::asr_mk_model
for more models.
The threshold
option allows to convert ancestral states likelihoods into discrete states. When threshold = FALSE
, the ancestral state estimated is the one with the highest likelihood (or at random if likelihoods are equal). When threshold = TRUE
, the ancestral state estimated are all the ones that are have a scaled likelihood greater than the maximum observed scaled likelihood minus the inverse number of possible states (i.e. select_state >= (max(likelihood) - 1/n_states)
). This option makes the threshold selection depend on the number of states (i.e. if there are more possible states, a lower scaled likelihood for the best state is expected). Finally using a numerical value for the threshold option (e.g. threshold = 0.95
) will simply select only the ancestral states estimates with a scaled likelihood equal or greater than the designated value. This option makes the threshold selection absolute. Regardless, if more than one value is select, the uncertainty token (special.tokens["uncertainty"]
) will be used to separate the states. If no value is selected, the uncertainty token will be use between all observed characters (special.tokens["uncertainty"]
).
special.behaviours
allows to generate a special rule for the special.tokens
. The functions should can take the arguments character, all_states
with character
being the character that contains the special token and all_states
for the character (which is automatically detected by the function). By default, missing data returns and inapplicable returns all states, and polymorphisms and uncertainties return all present states.
missing = function(x,y) y
inapplicable = function(x,y) y
polymorphism = function(x,y) strsplit(x, split = "\\&")[[1]]
uncertainty = function(x,y) strsplit(x, split = "\\/")[[1]]
Functions in the list must be named following the special token of concern (e.g. missing
), have only x, y
as inputs and a single output a single value (that gets coerced to integer
automatically). For example, the special behaviour for the special token "?"
can be coded as: special.behaviours = list(missing = function(x, y) return(NA)
to make ignore the character for taxa containing "?"
.
When using the parallel option (either through using parallel = TRUE
by using the number of available cores minus on or manually setting the number of cores - e.g. parallel = 5
), the castor::asr_mk_model
function will use the designated number of cores (using the option Nthreads = <requested_number_of_cores>
). Additionally, if the input tree
is a "multiPhylo"
object, the trees will be run in parallel for each number of cores, thus decreasing computation time accordingly (e.g. if 3 cores are requested and tree
contains 12 "phylo"
objects, 4 different "phylo"
objects will be run in parallel on the 3 cores making the calculation around 3 times faster).
Value
Returns a "matrix"
or "list"
of ancestral states. By default, the function returns the ancestral states in the same format as the input matrix
. This can be changed using the option output = "matrix"
or "list"
to force the class of the output.
To output the combined ancestral states and input, you can use "combined"
(using the input format) or "combined.matrix"
or "combined.list"
.
Author(s)
Thomas Guillerme
See Also
castor::asr_mk_model
, char.diff
Examples
set.seed(42)
## A simple example:
## A random tree with 10 tips
tree <- rcoal(10)
## Setting up the parameters
my_rates = c(rgamma, rate = 10, shape = 5)
## A random Mk matrix (10*50)
matrix_simple <- sim.morpho(tree, characters = 50, model = "ER", rates = my_rates,
invariant = FALSE)
## Run a basic ancestral states estimations
ancestral_states <- multi.ace(matrix_simple, tree)
ancestral_states[1:5, 1:5]
## A more complex example
## Create a multiple list of 5 trees
multiple_trees <- rmtree(5, 10)
## Modify the matrix to contain missing and special data
matrix_complex <- matrix_simple
matrix_complex[sample(1:length(matrix_complex), 50)] <- "-"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "0%2"
matrix_complex[sample(1:length(matrix_complex), 50)] <- "?"
matrix_complex[1:5,1:5]
## Set a list of extra special tokens
my_spec_tokens <- c("weirdtoken" = "%")
## Set some special behaviours for the "weirdtoken" and for "-" and "?"
my_spec_behaviours <- list()
## Inapplicable tokens "-" are ignored
my_spec_behaviours$inapplicable <- function(x,y) return(NA)
## Missing tokens "?" are considered as all states
my_spec_behaviours$missing <- function(x,y) return(y)
## Weird tokens are considered as state 0 and 3
my_spec_behaviours$weirdtoken <- function(x,y) return(c(1,2))
## Create a random branch length modifier to apply to each tree
branch_lengths <- rnorm(18)^2
## Setting a list of model ("ER" for the 25 first characters and then "SYM")
my_models <- c(rep("ER", 25), rep("SYM", 25))
## Run the ancestral states on all the tree with multiple options
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
verbose = TRUE,
models = my_models,
threshold = 0.95,
special.tokens = my_spec_tokens,
special.behaviours = my_spec_behaviours,
brlen.multiplier = branch_lengths,
output = "combined.matrix")
## The results for the the two first characters for the first tree
ancestral_states[[1]][, 1:2]
## Not run:
## The same example but running in parallel
ancestral_states <- multi.ace(matrix_complex, multiple_trees,
verbose = TRUE,
models = my_models,
threshold = 0.95,
special.tokens = my_spec_tokens,
special.behaviours = my_spec_behaviours,
brlen.multiplier = branch_lengths,
output = "combined.matrix",
parallel = TRUE)
## End(Not run)