node.averager {phylopairs} | R Documentation |
node.averager
Description
Calculate weighted or unweighted node-averaged values of a lineage-pair trait.
Usage
node.averager(dataset, tree, taxacolumns, varb,
weighted=FALSE, prune=TRUE, av=TRUE)
Arguments
dataset |
A data.frame in which each row corresponds to a pair in the dataset. Must contain two columns of taxa names (one for each taxon in every pair) and at least one data column with the pairwise-defined trait that is to be averaged. Taxa names must be in the same format as those used in the tree. |
tree |
An ultrametric phylogenetic tree ('phylo' object) containing the species that appear in at least one pair in the dataset. Names must be in the same format as those used in 'dataset'. |
taxacolumns |
Character vector containing the names of the two columns that hold the species names (e.g. c("sp1", "sp2")). |
varb |
Name of the column containing the variable to be averaged (e.g. "RI", "range_overlap", etc.). |
weighted |
Logical indicating whether weighted node averages are to be calculated; defaults to FALSE. |
prune |
Logical indicating whether tree should be pruned to contain just the species represented in the dataset; defaults to TRUE. |
av |
Logical indicating whether to average the values of multiple entries for the same pair, should they appear in the dataset; defaults to TRUE. If set to FALSE, function will stop in the case of more than one entry in the dataset corresponding to the same pair. |
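Before calling the function with av=FALSE, it can help to check whether any pair appears more than once. The following is a minimal base-R sketch of such a check, not the package's internal code. The data frame and its column names are hypothetical, and it assumes that a pair entered as (A, B) and as (B, A) counts as the same pair, which may not match how node.averager() identifies duplicates.

```r
# Hypothetical dataset in which the pair A-B is entered twice (in opposite order)
dat <- data.frame(
  sp1 = c("A", "B", "A"),
  sp2 = c("B", "A", "C"),
  RI  = c(0.2, 0.4, 0.9),
  stringsAsFactors = FALSE
)

# Build an order-independent key for each pair (assumption: order is irrelevant)
dat$pair <- paste(pmin(dat$sp1, dat$sp2), pmax(dat$sp1, dat$sp2), sep = "|")

# Pairs that occur more than once
dup_pairs <- unique(dat$pair[duplicated(dat$pair)])
dup_pairs

# Average the trait within each pair, mimicking what av=TRUE is described as doing
collapsed <- aggregate(RI ~ pair, data = dat, FUN = mean)
collapsed
```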
Details
node.averager() takes a lineage-pair dataset and a phylogenetic tree and returns, for each node in the tree, the average value of the pairwise-defined trait across all pairs whose species span that node.
The simple or 'unweighted' average is calculated as introduced by Coyne and Orr (1989). The 'weighted' node averaging procedure was introduced by Fitzpatrick (2002) and discussed in Fitzpatrick and Turelli (2006). In weighted averaging, the trait values for the pairs spanning a node are first halved K-1 times, where K is the number of nodes between the species in a pair. These halved values are then summed to arrive at a weighted average for the node.
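The arithmetic can be illustrated with a small worked example in base R. This is a sketch of the description above rather than the package's implementation. It uses a hypothetical four-species tree (A,(B,(C,D))), made-up trait values, and the assumption that K counts every internal node on the path between the two species, including the focal node itself.

```r
# Pairs spanning the root of the hypothetical tree (A,(B,(C,D)))
vals <- c(AB = 0.9, AC = 0.3, AD = 0.5)  # made-up trait values

# Number of nodes on the path between the two species of each pair
# (assumption: the focal node is included in the count)
K <- c(AB = 2, AC = 3, AD = 3)

# Unweighted (Coyne and Orr 1989): simple mean of the spanning pairs
unweighted <- mean(vals)

# Weighted (Fitzpatrick 2002): halve each value K-1 times, then sum
weights  <- 0.5^(K - 1)
weighted <- sum(vals * weights)

weights     # 0.50 0.25 0.25
unweighted  # 0.5666667
weighted    # 0.65
```

In this example the weights sum to 1, so the pair A-B, which is separated by fewer nodes, contributes half of the weighted value, whereas all three pairs contribute equally to the unweighted one.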
Note: For datasets containing many individual species, the best available tree might be very large. By default, node.averager() prunes the tree to contain just the species represented in the dataset. Beware that pruning can affect both the number of nodes and the node-averaged values (for example by altering the number of nodes between pairs when calculating weighted node averages).
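The effect of pruning on the number of nodes can be seen with a short sketch using the ape package. The random tree and the sampled species here are hypothetical, and keep.tip() is used only for illustration; it is not necessarily how node.averager() prunes internally.

```r
library(ape)

set.seed(1)
full_tree <- rcoal(20)                       # random ultrametric tree with 20 tips
kept      <- sample(full_tree$tip.label, 8)  # pretend only 8 species occur in the dataset

# Prune the tree to the species that are represented
pruned <- keep.tip(full_tree, kept)

Nnode(full_tree)  # 19 internal nodes in the full binary tree
Nnode(pruned)     # 7 internal nodes after pruning
```

Because internal nodes disappear when tips are dropped, the number of nodes separating two species, and therefore K in the weighted procedure, can differ between the full and the pruned tree.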
Value
Numeric vector of node averages, named according to node indices.
References
Coyne, J. A., Orr, H. A. 1989. Patterns of speciation in Drosophila. Evolution 43:362-381.
Fitzpatrick, B. M. 2002. Molecular correlates of reproductive isolation. Evolution 56:191-198.
Fitzpatrick, B. M., Turelli, M. 2006. The geography of mammalian speciation: mixed signals from phylogenies and range maps. Evolution 60:601-615.
Examples
# Load simulated dataset and tree
data(data1)
data(sim.tree1)
# Perform node averaging
unwtd <- node.averager(dataset = data1, tree = sim.tree1, varb = "pred",
taxacolumns = c("sp1", "sp2"))
wtd <- node.averager(dataset = data1, tree = sim.tree1, varb = "pred",
taxacolumns = c("sp1", "sp2"), weighted = TRUE)
# Compare outcomes of weighted and unweighted node averaging
unwtd
wtd
summary(unwtd)
summary(wtd)
# Calculate data loss (number of pairs minus number of node averages)
nrow(data1) - length(wtd)
nrow(data1) - length(unwtd)
# Plot tree and node labels
library(ape)
plot(sim.tree1)
nodelabels()