MSCquartets-package {MSCquartets}    R Documentation

Multispecies Coalescent Model Quartet Package

Description

A package for analyzing quartets displayed on gene trees, under the multispecies coalescent (MSC) model and the network multispecies coalescent (NMSC) model.

Details

This package contains routines to analyze a collection of gene trees through the displayed quartets on them.

A quartet count concordance factor (qcCF) for a set of 4 taxa is the triple of counts of the three possible resolved quartet trees on those taxa across some set of gene trees. The major routines in this package can:

  1. Tabulate all qcCFs for a collection of gene trees.

  2. Perform hypothesis tests of whether one or more qcCFs are consistent with the MSC model on a species tree (Mitchell et al. 2019).

  3. Produce simplex plots showing all estimated CFs as well as results of hypothesis tests (Allman et al. 2020).

  4. Infer a species tree using the qcCFs via the QDC and WQDC methods (Rhodes 2020; Yourdkhani and Rhodes 2020).

  5. Infer a level-1 species network via the NANUQ method (Allman et al. 2019).

  6. Infer the tree of blobs for a species network via the TINNiK method (Allman et al. 2022; Allman et al. 2024).
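
To make the qcCF definition above concrete, here is a minimal sketch in base R that does not use the package. The vector `displayed` is hypothetical toy data: it records which of the three resolved quartet trees on taxa a, b, c, d each of 10 gene trees displays. In practice the package's own tabulation routines build these counts from the gene trees themselves.

```r
# Hypothetical toy data: the quartet tree on taxa a, b, c, d displayed
# by each of 10 gene trees (not produced by any package routine).
displayed <- c("ab|cd", "ab|cd", "ac|bd", "ab|cd", "ad|bc",
               "ab|cd", "ab|cd", "ac|bd", "ab|cd", "ab|cd")

# The three possible resolved quartet trees on four taxa.
topologies <- c("ab|cd", "ac|bd", "ad|bc")

# The qcCF is the triple of counts, one per topology; using a factor
# with fixed levels keeps zero counts in the result.
qcCF <- table(factor(displayed, levels = topologies))
print(qcCF)

# Dividing by the number of gene trees gives the empirical
# concordance factor, a point in the probability simplex.
CF <- qcCF / sum(qcCF)
print(CF)
```

The normalized triple `CF` is the kind of point that a simplex plot displays, one point per set of four taxa.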

As discussed in the cited works, the inference methods for species trees and networks are statistically consistent under the MSC and NMSC models, respectively.

This package, and the theory on which it is based, allows gene trees to have missing taxa (i.e., not all gene trees display all the taxa). It does require, however, that every subset of 4 taxa be displayed on at least one of the gene trees.
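
The requirement on 4-taxon subsets can be checked before an analysis. The following base R sketch is illustrative only: the helper `quartet_covered` and the toy taxon sets are hypothetical and are not part of the package. It reports, for each set of four taxa, whether at least one gene tree contains all four.

```r
# Hypothetical helper (not a package function): for every 4-taxon
# subset, report whether some gene tree contains all four taxa.
quartet_covered <- function(tree_taxa, all_taxa) {
  quartets <- combn(sort(all_taxa), 4, simplify = FALSE)
  covered <- vapply(quartets, function(q) {
    any(vapply(tree_taxa, function(tx) all(q %in% tx), logical(1)))
  }, logical(1))
  names(covered) <- vapply(quartets, paste, character(1), collapse = "")
  covered
}

# Toy taxon sets for four gene trees, each missing one of five taxa.
all_taxa <- c("a", "b", "c", "d", "e")
tree_taxa <- list(
  c("a", "b", "c", "d"),
  c("a", "b", "c", "e"),
  c("a", "b", "d", "e"),
  c("a", "c", "d", "e")
)

cov <- quartet_covered(tree_taxa, all_taxa)
print(cov)                 # the subset b, c, d, e is not covered
print(names(cov)[!cov])

# Adding one gene tree on all five taxa covers every subset.
cov_full <- quartet_covered(c(tree_taxa, list(all_taxa)), all_taxa)
print(all(cov_full))
```

With real data, the taxon set of each gene tree could be taken from its tip labels. The enumeration grows as the number of taxa choose 4, so this brute-force check is suited to modest numbers of taxa.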

Several gene tree data sets, simulated and empirical, are included.
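
As a rough sketch of how an included data set might be passed through the main routines, the code below follows the pattern used in the package's help-page examples. It assumes that the installed version provides the file `dataGeneTreeSample` under `extdata` and the functions `taxonNames`, `quartetTable`, `quartetTableResolved`, `quartetTreeTestInd`, `quartetTestPlot`, and `QDC` with the arguments shown. Check the individual help pages for the exact signatures and defaults in your version.

```r
# Run only if MSCquartets is installed; ape supplies read.tree.
if (requireNamespace("MSCquartets", quietly = TRUE) &&
    requireNamespace("ape", quietly = TRUE)) {
  library(MSCquartets)

  # Read a simulated gene tree sample shipped with the package
  # (file name assumed from the package's examples).
  tree_file <- system.file("extdata", "dataGeneTreeSample",
                           package = "MSCquartets")
  gtrees <- ape::read.tree(file = tree_file)
  tnames <- taxonNames(gtrees)

  # 1. Tabulate qcCFs, restricted to 6 taxa to keep the run short.
  QT <- quartetTable(gtrees, tnames[1:6])
  RQT <- quartetTableResolved(QT)

  # 2. Test each qcCF for fit to the MSC on a species tree (model "T3").
  pTable <- quartetTreeTestInd(RQT, "T3")

  # 3. Simplex plot of the CFs, marked by the test results.
  quartetTestPlot(pTable, "T3", alpha = 0.05, beta = 0.95)

  # 4. Infer a species tree topology with the QDC method.
  stree <- QDC(gtrees, tnames[1:6])
  plot(stree)
}
```

Restricting to six taxa gives choose(6, 4) = 15 quartets, which keeps the tabulation fast. Dropping the second argument of `quartetTable` and `QDC` should analyze all taxa, at a higher running time.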

In publications please cite the software announcement (Rhodes et al. 2020), as well as the appropriate paper(s) above developing the theory behind the routines you used.

Author(s)

Maintainer: John Rhodes <j.rhodes@alaska.edu>

References

Rhodes JA, Baños H, Mitchell JD, Allman ES (2020). “MSCquartets 1.0: Quartet methods for species trees and networks under the multispecies coalescent model in R.” Bioinformatics. doi:10.1093/bioinformatics/btaa868.

Mitchell JD, Allman ES, Rhodes JA (2019). “Hypothesis testing near singularities and boundaries.” Electron. J. Statist., 13(1), 2150-2193. doi:10.1214/19-EJS1576.

Allman ES, Mitchell JD, Rhodes JA (2020). “Gene tree discord, simplex plots, and statistical tests under the coalescent.” bioRxiv. doi:10.1101/2020.02.13.948083.

Rhodes JA (2020). “Topological metrizations of trees, and new quartet methods of tree inference.” IEEE/ACM Trans. Comput. Biol. Bioinform., 17(6), 2107-2118. doi:10.1109/TCBB.2019.2917204.

Yourdkhani S, Rhodes JA (2020). “Inferring metric trees from weighted quartets via an intertaxon distance.” Bull. Math. Biol., 82(97). doi:10.1007/s11538-020-00773-4.

Allman ES, Baños H, Rhodes JA (2019). “NANUQ: A method for inferring species networks from gene trees under the coalescent model.” Algorithms Mol. Biol., 14(24), 1-25. doi:10.1186/s13015-019-0159-2.

Allman ES, Baños H, Mitchell JD, Rhodes JA (2022). “The tree of blobs of a species network: identifiability under the coalescent.” Journal of Mathematical Biology, 86(1), 10. doi:10.1007/s00285-022-01838-9.

Allman ES, Baños H, Mitchell JD, Rhodes JA (2024). “TINNIK: Inference of the Tree of Blobs of Species Networks Under the Coalescent.” draft.


[Package MSCquartets version 2.0 Index]