test.one.species.tree {phylolm} | R Documentation |
Tests the fit of a population tree to quartet concordance factor data
Description
From a set of quartet concordance factors obtained from genetic data (proportion of loci that truly have a given quartet), this function tests the adequacy of the coalescent process on a given population tree, where branch lengths indicate coalescent units.
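As a rough illustration of the input, the sketch below builds a toy data frame of quartet concordance factors for five hypothetical taxa. The taxon names, column names, and values are made up for illustration; the only layout taken from this page is taxon names in columns 1-4 and concordance factors in columns 5-7. The sketch also assumes that the three concordance factors of a 4-taxon set (one per quartet resolution) sum to 1.

```r
# Toy input: one row per 4-taxon set (all names and values are hypothetical).
taxa <- c("A", "B", "C", "D", "E")
sets <- t(combn(taxa, 4))            # choose(5, 4) = 5 four-taxon sets
toy.cf <- data.frame(
  taxon1 = sets[, 1], taxon2 = sets[, 2],
  taxon3 = sets[, 3], taxon4 = sets[, 4],
  CF12.34 = c(0.80, 0.60, 0.50, 0.40, 0.90),  # proportion of loci with 12|34
  CF13.24 = c(0.10, 0.25, 0.30, 0.35, 0.05),  # proportion of loci with 13|24
  CF14.23 = c(0.10, 0.15, 0.20, 0.25, 0.05),  # proportion of loci with 14|23
  stringsAsFactors = FALSE
)
# Assumed sanity check: the three proportions of each row add up to 1.
row.sums <- rowSums(toy.cf[, 5:7])
all(abs(row.sums - 1) < 1e-8)
```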
Usage
test.one.species.tree(cf, guidetree, prep, edge.keep,
plot=TRUE, shape.correction = TRUE)
Arguments
cf |
data frame containing one row for each 4-taxon set, with taxon names in columns 1-4 and concordance factors in columns 5-7. |
guidetree |
tree of class phylo on the same taxon set as those in |
prep |
result of |
edge.keep |
Indices of edges to keep in the guide tree. All other edges are collapsed to reflect ancestral panmixia. In the tested population tree, the collapsed edges have length set to 0. |
plot |
boolean. If TRUE, a number of plots are output. |
shape.correction |
boolean. If TRUE, the shapes of all Dirichlet distributions
are corrected to be greater than or equal to 1. This correction avoids Dirichlet densities
going near 0 or 1. It applies when the |
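The indices passed to edge.keep refer to rows of the guide tree's edge matrix. The sketch below uses a hand-built stand-in for a phylo object (a hypothetical 4-taxon tree, not the package's guidetree) to show how internal edges can be picked out. It assumes the usual numbering in which tips are labeled 1 to Ntaxa and internal nodes receive larger numbers, which is what the Examples rely on.

```r
# Hand-built stand-in for a 'phylo' object for the tree ((A,B),(C,D));
# (hypothetical toy tree, not the package's guidetree).
toy.tree <- list(
  edge = matrix(c(5, 6,
                  6, 1,
                  6, 2,
                  5, 7,
                  7, 3,
                  7, 4), ncol = 2, byrow = TRUE),
  tip.label = c("A", "B", "C", "D"),
  Nnode = 3L
)
Ntaxa <- length(toy.tree$tip.label)
# An edge is internal when its child node (column 2) is not a tip.
internal.edges <- which(toy.tree$edge[, 2] > Ntaxa)
external.edges <- setdiff(seq_len(nrow(toy.tree$edge)), internal.edges)
internal.edges  # candidate values for edge.keep
```

Any subset of these internal indices could then be supplied as edge.keep; the remaining internal edges would be the ones collapsed.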
Value
alpha |
estimated |
negPseudoLoglik |
Negative pseudo log-likelihood of the population tree. |
X2 |
Chi-square statistic, from comparing the counts of outlier p-values
(in |
chisq.pval |
p-value from the chi-square test, obtained from comparing the |
chisq.conclusion |
character string. If the chi-square test is significant, this statement says if there is an excess (or deficit) of outlier 4-taxon sets. |
outlier.table |
Table with 2 rows (observed and expected counts) and 4 columns:
number of 4-taxon sets with p-values |
outlier.pvalues |
Vector of outlier p-values, with as many entries as there
are rows in |
cf.exp |
Matrix of concordance factors expected from the estimated population tree,
with as many rows as in |
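The chi-square comparison of observed and expected outlier counts can be sketched as follows. The four p-value bins and the counts are hypothetical, chosen only for illustration; this is a generic goodness-of-fit test using stats::chisq.test, not necessarily the exact computation performed by test.one.species.tree. It assumes that, if the tree fits, outlier p-values are roughly uniform, so each bin's expected proportion equals its width.

```r
# Hypothetical bins for outlier p-values and their widths.
breaks <- c(0, 0.01, 0.05, 0.10, 1)
expected.prob <- diff(breaks)                 # 0.01, 0.04, 0.05, 0.90
observed <- c(25, 45, 50, 880)                # made-up counts, 1000 sets
expected <- sum(observed) * expected.prob     # 10, 40, 50, 900
toy.table <- rbind(observed = observed, expected = expected)
# Goodness-of-fit statistic, sum((O - E)^2 / E), with 3 degrees of freedom.
fit <- chisq.test(observed, p = expected.prob)
X2 <- unname(fit$statistic)
pval <- fit$p.value
```

In this toy table the first bin holds more sets than expected, which is the kind of excess of outlier 4-taxon sets that chisq.conclusion is described as reporting.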
Author(s)
Cécile Ané
References
Stenz, Noah W. M., Bret Larget, David A. Baum and Cécile Ané (2015). Exploring tree-like and non-tree-like patterns using genome sequences: An example using the inbreeding plant species Arabidopsis thaliana (L.) Heynh. Systematic Biology, 64(5):809-823.
See Also
stepwise.test.tree, test.tree.preparation.
Examples
data(quartetCF)
data(guidetree)
prelim <- test.tree.preparation(quartetCF,guidetree) # takes 5-10 seconds
# test of panmixia: all edges collapsed, none resolved.
panmixia <- test.one.species.tree(quartetCF,guidetree,prelim,edge.keep=NULL)
panmixia[1:6]
# test of full tree: all internal edges resolved, none collapsed.
Ntaxa <- length(guidetree$tip.label)
# indices of internal edges:
internal.edges <- which(guidetree$edge[,2] > Ntaxa)
fulltree <- test.one.species.tree(quartetCF,guidetree,prelim,edge.keep=internal.edges)
fulltree[1:6]
# test of a partial tree, some edges (but not all) collapsed
edges2keep <- c(1,2,4,6,7,8,11,14,20,21,23,24,31,34,35,36,38,39,44,47,53)
partialTree <- test.one.species.tree(quartetCF,guidetree,prelim,edge.keep=edges2keep)
partialTree[1:5]
partialTree$outlier.table
# identify taxa most responsible for the extra outlier quartets
outlier.4taxa <- which(partialTree$outlier.pvalues < 0.01)
length(outlier.4taxa) # 483 4-taxon sets with outlier p-value below 0.01
q01 <- as.matrix(quartetCF[outlier.4taxa,1:4])
sort(table(as.vector(q01)),decreasing=TRUE)
# So: Cnt_1 and Vind_1 both appear in 239 of these 483 outlier 4-taxon sets.
sum(apply(q01,1,function(x){"Cnt_1" %in% x || "Vind_1" %in% x}))
# 266 outlier 4-taxon sets have either Cnt_1 or Vind_1
sum(apply(q01,1,function(x){"Cnt_1" %in% x && "Vind_1" %in% x}))
# 212 outlier 4-taxon sets have both Cnt_1 and Vind_1
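
The two counts above are linked by inclusion-exclusion: the number of sets containing either taxon equals the number containing the first, plus the number containing the second, minus the number containing both (here 239 + 239 - 212 = 266). The sketch below checks the same identity on a small made-up matrix of 4-taxon sets; the taxon names are hypothetical and unrelated to quartetCF.

```r
# Made-up 4-taxon sets, one per row (hypothetical taxon names).
toy.q <- rbind(c("A", "B", "C", "D"),
               c("A", "B", "C", "E"),
               c("A", "C", "D", "E"),
               c("B", "C", "D", "E"),
               c("C", "D", "E", "F"))
has.A <- apply(toy.q, 1, function(x) "A" %in% x)
has.B <- apply(toy.q, 1, function(x) "B" %in% x)
n.either <- sum(has.A | has.B)
n.both <- sum(has.A & has.B)
# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
n.either == sum(has.A) + sum(has.B) - n.both
```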