R: Ancestral Character State Estimation

estimate_ancestral_states {Claddis}

R Documentation

Ancestral Character State Estimation

Description

Given a tree and a cladistic matrix, uses likelihood to estimate the ancestral states for every character.

Usage

estimate_ancestral_states(
  cladistic_matrix,
  time_tree,
  estimate_all_nodes = FALSE,
  estimate_tip_values = FALSE,
  inapplicables_as_missing = FALSE,
  polymorphism_behaviour = "equalp",
  uncertainty_behaviour = "equalp",
  threshold = 0.01,
  all_missing_allowed = FALSE
)

Arguments

`cladistic_matrix`	A character-taxon matrix in the format imported by read_nexus_matrix.
`time_tree`	A tree (phylo object) with branch lengths that represents the relationships of the taxa in `cladistic_matrix`.
`estimate_all_nodes`	Logical that allows the user to make estimates for all ancestral values. The default (`FALSE`) will only make estimates for nodes that link coded terminals (recommended).
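`estimate_tip_values`	Logical that allows the user to make estimates for tip values. The default (`FALSE`) will only make estimates for internal nodes (recommended).
`inapplicables_as_missing`	Logical that decides whether to treat inapplicables as missing (`TRUE`) or not (`FALSE`, the default and recommended option).
`polymorphism_behaviour`	How polymorphic tip values are handled. One of either "equalp" (the coded states are treated as equally probable) or "treatasmissing" (the tip is treated as missing, NA).
`uncertainty_behaviour`	How uncertain tip values are handled. One of either "equalp" (the coded states are treated as equally probable) or "treatasmissing" (the tip is treated as missing, NA).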
`threshold`	The threshold value to use when collapsing marginal likelihoods to discrete state(s).
`all_missing_allowed`	Logical to allow all missing character values (generally not recommended, hence default is FALSE).

Details

At its core the function uses either the rerootingMethod (Yang et al. 1995) as implemented in the phytools package (for discrete characters) or the ace function in the ape package (for continuous characters) to make ancestral state estimates. For discrete characters these are collapsed to the most likely state (or states, given equal likelihoods or likelihoods within a defined threshold value). In the latter case the resulting states are represented as an uncertainty (i.e., states separated by a slash, e.g., 0/1). This is the method developed for Brusatte et al. (2014).
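
The collapsing step described above can be sketched in base R. This is only an illustration: `collapse_states` is a hypothetical helper, not a Claddis function, and it assumes that "within a defined threshold value" means the difference from the best-supported state is no greater than `threshold`. The exact rule used internally may differ.

```r
# Hypothetical helper (not part of Claddis): keep every state whose marginal
# likelihood lies within `threshold` of the best-supported state, then join
# the retained states with a slash to form an uncertainty.
collapse_states <- function(marginal_likelihoods, threshold = 0.01) {
  best <- max(marginal_likelihoods)
  kept <- names(marginal_likelihoods)[best - marginal_likelihoods <= threshold]
  paste(kept, collapse = "/")
}

# Made-up marginal likelihoods for two nodes of a binary character:
clear_node <- c("0" = 0.90, "1" = 0.10)
ambiguous_node <- c("0" = 0.502, "1" = 0.498)

collapse_states(clear_node)     # a single state is retained
collapse_states(ambiguous_node) # both states are retained as an uncertainty
```

With the default threshold the first node collapses to a single state, whereas the second keeps both states because their likelihoods differ by less than 0.01.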

The function can deal with ordered or unordered characters and does so by allowing only indirect transitions (from 0 to 2 must pass through 1) or direct transitions (from 0 straight to 2), respectively. However, more complex step matrix transitions are not currently supported.
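
To make the ordered versus unordered distinction concrete, the sketch below builds the kind of transition index matrix commonly used to constrain discrete-character models, where a zero marks a disallowed change. `make_transition_matrix` is a hypothetical helper written for illustration, and whether Claddis encodes the constraint in exactly this way is an assumption.

```r
# Hypothetical helper (not part of Claddis): build a transition index matrix
# for states 0..(n_states - 1). Non-zero entries mark allowed transitions and
# zero entries mark disallowed ones.
make_transition_matrix <- function(n_states, ordered = FALSE) {
  states <- as.character(seq_len(n_states) - 1)
  model <- matrix(1, nrow = n_states, ncol = n_states,
                  dimnames = list(states, states))
  diag(model) <- 0
  if (ordered) {
    # Only allow changes between adjacent states (0 <-> 1, 1 <-> 2, ...).
    model[abs(row(model) - col(model)) > 1] <- 0
  }
  model
}

unordered_model <- make_transition_matrix(n_states = 3, ordered = FALSE)
ordered_model <- make_transition_matrix(n_states = 3, ordered = TRUE)

unordered_model["0", "2"] # non-zero: 0 can change straight to 2
ordered_model["0", "2"]   # zero: 0 must pass through 1 to reach 2
```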

Ancestral state estimation is complicated where polymorphic or uncertain tip values exist. These are not currently well handled here, although see the fitpolyMk function in phytools for a way these could be dealt with in future. The only options currently available are either to treat the multiple states as equally probable candidates for the "true" tip state (i.e., a uniform prior) or to avoid dealing with them completely by treating them as missing (NA) values.
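
The two behaviours can be illustrated with a small base R sketch. `tip_prior` is a hypothetical helper, not a Claddis function. It assumes that multiple states at a tip are separated by "/" (uncertainty, as above) or "&" (an assumed polymorphism separator), and it simply returns NA values for the "treatasmissing" case without modelling how missing tips are handled downstream.

```r
# Hypothetical helper (not part of Claddis): turn a single tip coding into
# prior probabilities across all possible states of the character.
tip_prior <- function(tip_coding, states, behaviour = "equalp") {
  # Split codings such as "0/1" or "0&1" into their component states.
  observed <- strsplit(tip_coding, split = "[&/]")[[1]]
  if (length(observed) > 1 && behaviour == "treatasmissing") {
    # Avoid the problem entirely by treating the tip as missing.
    return(setNames(rep(NA_real_, length(states)), states))
  }
  # Uniform prior across the coded states, zero for all other states.
  prior <- setNames(as.numeric(states %in% observed), states)
  prior / sum(prior)
}

states <- c("0", "1", "2")
tip_prior("2", states)                                 # single coded state
tip_prior("0/1", states, behaviour = "equalp")         # 0 and 1 share the prior
tip_prior("0/1", states, behaviour = "treatasmissing") # all NA
```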

It is also possible to use phylogenetic information to infer missing states, both for internal nodes (e.g., those leading to missing tip states) and for tips. This is captured by the estimate_all_nodes and estimate_tip_values options. These have been partially explored by Lloyd (2018), who cautioned against their use.

Value

The function will return the same cladistic_matrix, but with two key additions: 1. Internal nodes (numbered by ape formatting) will appear after taxa in each matrix block with estimated states coded for them, and 2. The time-scaled tree used will be added to cladistic_matrix as cladistic_matrix$topper$tree. Note that if using the estimate_tip_values = TRUE option then tip values may also be changed from those provided as input.

Author(s)

Graeme T. Lloyd graemetlloyd@gmail.com and Thomas Guillerme guillert@tcd.ie

References

Brusatte, S. L., Lloyd, G. T., Wang, S. C. and Norell, M. A., 2014. Gradual assembly of avian body plan culminated in rapid rates of evolution across dinosaur-bird transition. Current Biology, 24, 2386-2392.

Lloyd, G. T., 2018. Journeys through discrete-character morphospace: synthesizing phylogeny, tempo, and disparity. Palaeontology, 61, 637-645.

Yang, Z., Kumar, S. and Nei, M., 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics, 141, 1641-1650.

Examples


# Set random seed:
set.seed(4)

# Generate a random tree for the Day data set:
time_tree <- ape::rtree(n = nrow(day_2016$matrix_1$matrix))

# Update taxon names to match those in the data matrix:
time_tree$tip.label <- rownames(x = day_2016$matrix_1$matrix)

# Set root time by making youngest taxon extant:
time_tree$root.time <- max(diag(x = ape::vcv(phy = time_tree)))

# Use Day matrix as cladistic matrix:
cladistic_matrix <- day_2016

# Prune most characters out to make example run fast:
cladistic_matrix <- prune_cladistic_matrix(cladistic_matrix,
  characters2prune = c(2:3, 5:37)
)

# Estimate ancestral states:
estimate_ancestral_states(
  cladistic_matrix = cladistic_matrix,
  time_tree = time_tree
)

[Package Claddis version 0.6.3]