rncl {rncl} | R Documentation |
rncl: An R interface to the NEXUS Class Library
Description
rncl provides an interface to the NEXUS Class Library (NCL), a C++ library intended to parse valid NEXUS files as well as other common formats used in phylogenetic analysis. Currently, rncl focuses on parsing trees and supports both NEXUS and Newick formatted files. Because NCL is used by several phylogenetic programs (e.g., MrBayes, Garli), rncl can parse files generated by these programs. However, other popular programs (including BEAST) use an extension of the NEXUS file format; trees from such files can be imported, but their associated annotations (e.g., confidence intervals on the time since divergence) cannot.
Returns a list of the elements contained in a NEXUS file, which can be used to build phylogenetic objects in R.
Usage
rncl(
file,
file.format = c("nexus", "newick"),
spacesAsUnderscores = TRUE,
char.all = TRUE,
polymorphic.convert = TRUE,
levels.uniform = TRUE,
show_progress = TRUE,
...
)
Arguments
file |
path to a NEXUS or Newick file |
file.format |
a character string indicating the type of file to be parsed. |
spacesAsUnderscores |
In the NEXUS file format white spaces are not
allowed and are represented by underscores. Therefore, NCL converts
underscores found in taxon labels in the NEXUS file into white spaces
(e.g. |
char.all |
If |
polymorphic.convert |
If TRUE (default), converts polymorphic characters to missing data (only when NEXUS file contains DATA block). |
levels.uniform |
If TRUE (default), uses the same levels for all characters (only when NEXUS file contains DATA block). |
show_progress |
If |
... |
additional parameters (currently not in use). |
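As a minimal sketch of a call with the arguments described above: the tree string and temporary file below are made up for illustration, and the call is guarded so that the script still runs when rncl is not installed.

```r
# Write a small Newick tree to a temporary file (made-up example data).
newick_file <- tempfile(fileext = ".tre")
writeLines("((A:1,B:1):0.5,(C:1,D:1):0.5);", newick_file)

# Parse the file only if rncl is available.
if (requireNamespace("rncl", quietly = TRUE)) {
  res <- rncl::rncl(file = newick_file, file.format = "newick")

  # Inspect a few of the elements of the returned list.
  str(res$taxaNames)
  str(res$parentVector)
  str(res$branchLengthVector)
}
```

Passing file.format explicitly avoids relying on the default ("nexus") when the input is a Newick file.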
Details
NCL can also parse data associated with species included in NEXUS files. If you are interested in importing such data, see the phylobase package.
NEXUS is a common file format used in phylogenetics to represent phylogenetic trees and other types of phylogenetic data. This function uses NCL (the NEXUS Class Library) to parse NEXUS, Newick, or other common phylogenetic file formats, and returns the relevant elements as a list. phylo objects (from the ape package) or phylo4 objects (from the phylobase package) can be constructed from the elements contained in this list.
Value
A list that contains the elements extracted from a NEXUS or a Newick file.
- taxaNames
A vector of the taxa names listed in the TAXA block of the NEXUS file or inferred from the tree strings (if the block is missing or the file is a Newick file).
- treeNames
A vector listing the names of the trees.
- taxonLabelVector
A list containing as many elements as there are trees in the file. Each element is a character vector that lists the taxon names encountered in the tree string *in the order they appear*, and therefore may not match the order they are listed in the translation table.
- parentVector
A list containing as many elements as there are trees in the file. Each element is a numeric vector listing the parent node for the node given by its position in the vector. If the beginning of the vector is 5 5 6, the parent node of node 1 is 5, the parent of node 2 is 5 and the parent of node 3 is 6. The implicit root of the tree is identified with 0 (node without a parent).
- branchLengthVector
A list containing as many elements as there are trees in the file. Each element is a numeric vector listing the edge/branch lengths for the edges in the same order as nodes are listed in the corresponding parentVector element. Values of -999 indicate that the value is missing for this particular edge. The implicit root has a length of 0.
- nodeLabelsVector
A list containing as many elements as there are trees in the file. Each element is a character vector listing the node labels in the same order as nodes are listed in the corresponding parentVector element.
- trees
A character vector listing the tree strings where tip labels have been replaced by their indices in the taxaNames vector. They do not correspond to the numbers listed in the translation table that might be associated with the tree.
- dataTypes
A character vector indicating the type of data associated with the tree (e.g., “standard”).
- nbCharacters
A numeric vector indicating how many characters/traits are available.
- charLabels
A character vector listing the names of the characters/traits that are available.
- nbStates
A numeric vector listing the number of possible states for each character/trait.
- stateLabels
A character vector listing, in order, all possible states for each character/trait.
- dataChr
A character vector with as many elements as there are characters/traits in the dataset. Each element is a string that can be parsed by R to create a factor vector representing the data found in the file.
- isRooted
A list with as many elements as there are trees in the file. Each element is a logical indicating whether the tree is rooted. NCL's definition of a rooted tree differs from the one APE uses in some cases.
- hasPolytomies
A list with as many elements as there are trees in the file. Each element is a logical indicating whether the tree contains polytomies.
- hasSingletons
A list with as many elements as there are trees in the file. Each element is a logical indicating whether the tree contains singleton nodes, in other words nodes with a single descendant (also known as knuckles).
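To illustrate how the parentVector and branchLengthVector encodings described above fit together, here is a rough base R sketch. The two vectors are hand-written, hypothetical values for a single four-tip tree (not actual rncl output); they follow the documented conventions that 0 marks the implicit root and -999 marks a missing branch length.

```r
# Hypothetical vectors for one tree, written by hand for illustration.
# Position i holds the parent (or branch length) of node i.
parent <- c(5, 5, 6, 6, 7, 7, 0)
brlen  <- c(1.0, 1.0, 0.5, -999, 2.0, 2.5, 0)

node    <- seq_along(parent)
is_root <- parent == 0          # the implicit root has parent 0

# One row per edge: parent node, then child node (the root has no edge).
edges <- cbind(parent = parent[!is_root], child = node[!is_root])

# Branch lengths in the same edge order, with -999 recoded as NA.
edge_length <- brlen[!is_root]
edge_length[edge_length == -999] <- NA_real_

root_node <- node[is_root]
```

The resulting edge table is only an intermediate representation. Tree classes in other packages may expect a specific node numbering, so some renumbering may be needed before building an object from it.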
Author(s)
Francois Michonneau
References
Maddison DR, Swofford DL, Maddison WP (1997). "NEXUS: An extensible file format for systematic information". Systematic Biology 46(4): 590-621. doi:10.1093/sysbio/46.4.590
Lewis PO (2003). "NCL: a C++ class library for interpreting data files in NEXUS format". Bioinformatics 19(17): 2330-2331.
See Also
For examples of how to use the elements of the list returned by this function to build tree objects, inspect the source code of this package, in particular how read_newick_phylo and read_nexus_phylo work. For a more complex example that also uses the data contained in NEXUS files, inspect the source code of the readNCL function in the phylobase package.
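For readers who only need a tree object rather than the raw list, the following sketch calls read_newick_phylo directly. It assumes that the function takes the file path through an argument named file, as rncl() does; the tree string is made up, and the call is skipped when rncl is not installed.

```r
# Made-up Newick tree written to a temporary file.
tree_file <- tempfile(fileext = ".tre")
writeLines("((A:1,B:1):0.5,C:1.5);", tree_file)

if (requireNamespace("rncl", quietly = TRUE)) {
  # Assumption: the path is passed via `file`, mirroring rncl().
  tr <- rncl::read_newick_phylo(file = tree_file)
  print(class(tr))
  print(tr$tip.label)
}
```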