node.averager {phylopairs}    R Documentation

node.averager

Description

Calculate weighted or unweighted node-averaged values of a lineage-pair trait.

Usage

node.averager(dataset, tree, taxacolumns, varb, 
  weighted=FALSE, prune=TRUE, av=TRUE)

Arguments

dataset

A data.frame in which each row corresponds to a pair in the dataset. Must contain two columns of taxa names (one for each taxon in every pair) and at least one data column with the pairwise-defined trait that is to be averaged. Taxa names must be in the same format as those used in the tree.

tree

An ultrametric phylogenetic tree ('phylo' object) containing the species that appear in at least one pair in the dataset. Names must be in the same format as those used in 'dataset'.

taxacolumns

Character vector giving the names of the two columns that contain species names (e.g. c("sp1", "sp2")).

varb

Name of the column in 'dataset' holding the variable to be averaged (e.g. "RI" or "range_overlap").

weighted

Logical indicating whether weighted node averages are to be calculated; defaults to FALSE.

prune

Logical indicating whether tree should be pruned to contain just the species represented in the dataset; defaults to TRUE.

av

Logical indicating whether to average the values of multiple entries for the same pair, should they appear in the dataset; defaults to TRUE. If set to FALSE, function will stop in the case of more than one entry in the dataset corresponding to the same pair.
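The effect of averaging repeated entries for a pair can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the toy data frame, the column name "RI", and the choice to treat (A, B) and (B, A) as the same pair are all hypothetical.

```r
# Hypothetical toy data: the pair A-B appears twice, once in each order
pairs <- data.frame(
  sp1 = c("A", "B", "A"),
  sp2 = c("B", "A", "C"),
  RI  = c(0.2, 0.4, 0.9),
  stringsAsFactors = FALSE
)

# Order-independent key for each pair (assumption: taxon order is irrelevant)
key <- paste(pmin(pairs$sp1, pairs$sp2), pmax(pairs$sp1, pairs$sp2), sep = "|")

# Pairs that occur more than once
dup_keys <- unique(key[duplicated(key)])

# Collapse repeated entries to their mean, one row per unique pair
keyed <- data.frame(key = key, RI = pairs$RI, stringsAsFactors = FALSE)
collapsed <- aggregate(RI ~ key, data = keyed, FUN = mean)
collapsed
```

Inspecting dup_keys before calling node.averager() with av = FALSE is one way to find the entries that would cause the function to stop.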

Details

node.averager() takes a lineage-pair dataset and a phylogenetic tree and calculates, at each node in the tree, the average value of a pairwise-defined trait across all pairs whose species span the node.
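The idea of unweighted node averaging can be sketched with ape. This is a minimal illustration, not the package's implementation. It assumes that a pair "spans" a node when that node is the most recent common ancestor of the two species; the toy tree, the data frame, and the column name "trait" are hypothetical.

```r
library(ape)

# Hypothetical toy tree and pair data (not shipped with phylopairs)
tree <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
pairs <- data.frame(
  sp1   = c("A", "A", "B", "C"),
  sp2   = c("B", "C", "D", "D"),
  trait = c(0.2, 0.8, 0.6, 0.4),
  stringsAsFactors = FALSE
)

# Node index of the most recent common ancestor of each pair
mrca_node <- mapply(function(a, b) getMRCA(tree, c(a, b)), pairs$sp1, pairs$sp2)

# Simple mean of the trait over the pairs assigned to each node,
# named by node index
sketch_avg <- tapply(pairs$trait, mrca_node, mean)
sketch_avg
```

Here the pairs A-C and B-D share the root as their common ancestor, so their values are averaged together, while A-B and C-D each determine the value at their own node.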

The simple or 'unweighted' average is calculated as introduced by Coyne and Orr (1989). The 'weighted' node averaging procedure was introduced by Fitzpatrick (2002) and discussed in Fitzpatrick and Turelli (2006). In weighted averaging, the trait values for the pairs spanning a node are first halved K-1 times, where K is the number of nodes between the species in a pair. These halved values are then summed to arrive at a weighted average for the node.
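The weighting step can be illustrated for a single node. This sketch assumes that K counts the internal nodes on the path between the two tips of a pair, which is one reading of the description above and may differ from how the package counts nodes. The toy tree, the trait values, and the helper count_nodes_between() are hypothetical.

```r
library(ape)

# Hypothetical ultrametric toy tree
tree <- read.tree(text = "(((A:1,B:1):1,C:2):1,D:3);")

# Hypothetical trait values for the three pairs that span the root
root_pairs <- data.frame(
  sp1   = c("A", "B", "C"),
  sp2   = c("D", "D", "D"),
  trait = c(0.2, 0.4, 0.8),
  stringsAsFactors = FALSE
)

# K: number of internal nodes on the path between the two tips of a pair
count_nodes_between <- function(tree, a, b) {
  path <- nodepath(tree,
                   from = match(a, tree$tip.label),
                   to   = match(b, tree$tip.label))
  length(path) - 2  # the path includes both tips, so drop them
}

K <- mapply(count_nodes_between, root_pairs$sp1, root_pairs$sp2,
            MoreArgs = list(tree = tree))

# Halve each value K-1 times, then sum
weights  <- 0.5^(K - 1)
weighted <- sum(root_pairs$trait * weights)

# Unweighted mean of the same pairs, for comparison
unweighted <- mean(root_pairs$trait)

c(weighted = weighted, unweighted = unweighted)
```

In this toy case the pair C-D is separated by fewer nodes than A-D or B-D, so it receives a larger weight, and the weights of the three pairs sum to one.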

Note: For datasets containing many individual species, the best available tree might be very large. By default, node.averager() prunes the tree to contain just the species represented in the dataset. Beware that pruning can affect both the number of nodes and the node-averaged values (for example, by altering the number of nodes between the species in a pair when calculating weighted node averages).
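The effect of pruning on node counts can be seen in a small sketch that uses ape::keep.tip(). It is an illustration only and does not claim to reproduce what node.averager() does internally when prune = TRUE. The toy tree, the pair data, and the helper count_nodes_between() are hypothetical, and K is assumed to count the internal nodes on the path between two tips.

```r
library(ape)

# Hypothetical five-tip ultrametric tree; the dataset covers only A, B and E
full_tree <- read.tree(text = "((((A:1,B:1):1,C:2):1,D:3):1,E:4);")
pairs <- data.frame(sp1 = c("A", "A"), sp2 = c("B", "E"),
                    stringsAsFactors = FALSE)

# Keep only the species that appear in at least one pair
keep   <- unique(c(pairs$sp1, pairs$sp2))
pruned <- keep.tip(full_tree, keep)

# Number of internal nodes before and after pruning
c(full = full_tree$Nnode, pruned = pruned$Nnode)

# Internal nodes on the path between two tips
count_nodes_between <- function(tree, a, b) {
  path <- nodepath(tree,
                   from = match(a, tree$tip.label),
                   to   = match(b, tree$tip.label))
  length(path) - 2  # the path includes both tips, so drop them
}

# The same pair (A, E) is separated by fewer nodes once C and D are removed
K_full   <- count_nodes_between(full_tree, "A", "E")
K_pruned <- count_nodes_between(pruned, "A", "E")
c(K_full = K_full, K_pruned = K_pruned)
```

Because the halving in weighted averaging depends on this count, the same pair can contribute a different weighted value depending on whether the tree was pruned.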

Value

Numeric vector of node averages, named according to node indices.

References

Coyne, J. A., Orr, H. A. 1989. Patterns of speciation in Drosophila. Evolution 43:362-381.

Fitzpatrick, B. M. 2002. Molecular correlates of reproductive isolation. Evolution 56:191-198.

Fitzpatrick, B. M., Turelli, M. 2006. The geography of mammalian speciation: mixed signals from phylogenies and range maps. Evolution 60:601-615.

Examples

# Load simulated dataset and tree
data(data1)
data(sim.tree1)

# Perform node averaging
unwtd <- node.averager(dataset = data1, tree = sim.tree1, varb = "pred", 
  taxacolumns = c("sp1", "sp2"))
wtd <- node.averager(dataset = data1, tree = sim.tree1, varb = "pred", 
  taxacolumns = c("sp1", "sp2"), weighted = TRUE)

# Compare outcomes of weighted and unweighted node averaging
unwtd
wtd
summary(unwtd)
summary(wtd)

# Calculate data loss (number of pairs minus number of node averages)
nrow(data1) - length(wtd)
nrow(data1) - length(unwtd)


# Plot tree and node labels
library(ape)
plot(sim.tree1)
nodelabels()


[Package phylopairs version 0.1.0 Index]