| read.nexus.data {ape} | R Documentation |
Read Character Data In NEXUS Format
Description
read.nexus.data reads a file with sequences in the NEXUS
format. nexus2DNAbin is a helper function to convert the output
from the previous function into the class "DNAbin".
For the moment, only sequence data (DNA or protein) are supported.
Usage
read.nexus.data(file)
nexus2DNAbin(x)
Arguments
file |
a file name specified by either a variable of mode character, or a double-quoted string. |
x |
an object output by read.nexus.data. |
Details
This parser tries to read data from a file written in a restricted NEXUS format (see examples below).
Please see files ‘data.nex’ and ‘taxacharacters.nex’ for examples of formats that will work.
Some notable deviations from the NEXUS standard (non-exhaustive list):
I: Comments must be either on separate lines or at the end of lines. Examples:
[Comment] -- OK
Taxon ACGTACG [Comment] -- OK
[Comment line 1
Comment line 2] -- NOT OK!
Tax[Comment]on ACG[Comment]T -- NOT OK!
II: No spaces (or comments) are allowed in the sequences. Examples:
name ACGT -- OK
name AC GT -- NOT OK!
III: No spaces are allowed in taxon names, not even if names are in single quotes. That is, single-quoted names are not treated as such by the parser. Examples:
Genus_species -- OK
'Genus_species' -- OK
'Genus species' -- NOT OK!
IV: The trailing end that closes the matrix must be on a separate line. Examples:
taxon AACCGGT
end; -- OK
taxon AACCGGT;
end; -- OK
taxon AACCCGT; end; -- NOT OK!
V: Multistate characters are not allowed. That is, NEXUS allows you to specify multiple character states at a character position either as an uncertainty, (XY), or as an actual appearance of multiple states, {XY}. This information is not handled by the parser. Examples:
taxon 0011?110 -- OK
taxon 0011{01}110 -- NOT OK!
taxon 0011(01)110 -- NOT OK!
VI: The number of taxa must be on the same line as ntax. The same applies to nchar. Examples:
ntax = 12 -- OK
ntax =
12 -- NOT OK!
VII: The word “matrix” cannot occur anywhere in the file before the actual matrix command, unless it is in a comment. Examples:
BEGIN CHARACTERS;
TITLE 'Data in file "03a-cytochromeB.nex"';
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
["This is The Matrix"] -- OK
MATRIX

BEGIN CHARACTERS;
TITLE 'Matrix in file "03a-cytochromeB.nex"'; -- NOT OK!
DIMENSIONS NCHAR=382;
FORMAT DATATYPE=Protein GAP=- MISSING=?;
MATRIX
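As a minimal sketch of a file that stays within the restrictions listed above, the following writes a small NEXUS block to a temporary file and reads it back. The taxon names, sequences, and the temporary file are invented for illustration and are not taken from the package's example files. It assumes the ape package is installed.

```r
library(ape)

# Hypothetical file contents: names without spaces, sequences without spaces,
# ntax and nchar on one line, a comment on its own line, and the closing
# "END;" on a separate line after the matrix.
nex <- c(
  "#NEXUS",
  "[A comment on its own line]",
  "BEGIN DATA;",
  "DIMENSIONS NTAX=3 NCHAR=8;",
  "FORMAT DATATYPE=DNA GAP=- MISSING=?;",
  "MATRIX",
  "Taxon_A ACGTACGT",
  "Taxon_B ACGTAC-T",
  "Taxon_C ACGAACGT;",
  "END;"
)

f <- tempfile(fileext = ".nex")
writeLines(nex, f)

# Read the character data, then convert it to class "DNAbin".
x <- read.nexus.data(f)
names(x)
lengths(x)

dna <- nexus2DNAbin(x)
class(dna)

unlink(f)
```

Breaking one of the rules, for example by inserting a space inside a sequence, should make the parser stop with an error rather than return partial data.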
Value
A list of sequences, each a single vector of mode character in which each element is a (phylogenetic) character state.
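To illustrate the structure described above, here is a small sketch that uses a hand-built list as a stand-in for the return value. The taxon names and states are hypothetical, and only base R is used. Binding the vectors into a matrix assumes all sequences have the same length.

```r
# Hand-built stand-in for the returned object (hypothetical data):
# a named list with one character vector per taxon, one element per state.
x <- list(
  Taxon_A = c("a", "c", "g", "t"),
  Taxon_B = c("a", "c", "-", "t"),
  Taxon_C = c("a", "t", "g", "t")
)

# One row per taxon, one column per character position.
m <- do.call(rbind, x)
dim(m)

# State of the third character in every taxon.
m[, 3]

# Positions at which the taxa do not all share the same state.
variable <- which(apply(m, 2, function(col) length(unique(col)) > 1))
variable
```

A matrix of this kind is convenient for column-wise summaries, while the list form keeps each sequence addressable by taxon name.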
Author(s)
Johan Nylander, Thomas Guillerme, and Klaus Schliep
References
Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an extensible file format for systematic information. Systematic Biology, 46, 590–621.
See Also
read.nexus, write.nexus,
write.nexus.data
Examples
## Use read.nexus.data to read a file in NEXUS format into object x
## Not run: x <- read.nexus.data("file.nex")