paml.baseml {phyclust} | R Documentation |
Phylogenetic Analysis by Maximum Likelihood for Nucleotide Sequences
Description
This function modifies the original standalone code of baseml in PAML,
developed by Yang (1997), for phylogenetic analysis by maximum likelihood.
It provides a way to generate an ancestral tree for given central sequences
clustered by phyclust.
Usage
paml.baseml(X, seqname = NULL, opts = NULL, newick.trees = NULL)
paml.baseml.control(...)
paml.baseml.show.default()
Arguments
X |
nid matrix with |
seqname |
sequence names. |
opts |
options as the standalone version, provided by |
newick.trees |
a vector/list containing NEWICK trees for |
... |
for other possible opts and values. See PAML manual for details. |
show |
show opts and values. |
Details
The function paml.baseml directly reuses the C code of baseml of PAML, and
the function paml.baseml.control generates the controls for paml.baseml,
playing the role of the file baseml.ctl of PAML.
The seqname should be consistent with X and with the leaf nodes of
newick.trees.
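The consistency requirement above can be checked before calling paml.baseml. The following is a minimal base R sketch, not part of phyclust: newick_tips is a hypothetical helper, the tree and X are made-up placeholders, and it assumes that rows of X are sequences (as in the Examples, where X comes from read.seqgen) and that leaf labels are unquoted.

```r
# Hypothetical helper (not a phyclust function): extract leaf labels from a
# NEWICK string. Labels that directly follow "(" or "," are leaves; labels
# that follow ")" belong to internal nodes and are skipped.
newick_tips <- function(newick) {
  m <- gregexpr("(?<=[(,])[^(),:;]+", newick, perl = TRUE)
  trimws(regmatches(newick, m)[[1]])
}

# Made-up inputs for illustration only.
tree <- "((s1:0.1,s2:0.2):0.3,s3:0.4);"
seqname <- c("s1", "s2", "s3")
X <- matrix(0L, nrow = 3, ncol = 5)  # placeholder data, one row per sequence

tips <- newick_tips(tree)

# Assumed checks: one name per row of X, and the same labels in the tree.
stopifnot(nrow(X) == length(seqname))
stopifnot(setequal(tips, seqname))
```

If ape is loaded, read.tree(text = tree)$tip.label is a more robust way to obtain the leaf labels than the regular expression used here.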
The options opts follow the original baseml.ctl, except that seqfile,
treefile and outputfile are omitted.
paml.baseml.control generates default opts, and paml.baseml.show.default
displays annotations for the default opts.
Value
This function returns a list of class baseml, and each element stores
baseml output as lines split at newlines. All the output of baseml of PAML
is saved in several files, and these are read in by scan. Further
post-processing can be done by parsing the returned vectors. The details of
the output format can be found on the website
http://abacus.gene.ucl.ac.uk/software/paml.html and in its manual.
Note that some functionalities of baseml of PAML are changed in paml.baseml
due to the complexity of input and output. The changes include disabling
the option G and renaming the file 2base.t to pairbase.t.
Typically, the list contains the original output of baseml, including
pairbase.t, mlb, rst, rst1, and rub, if they are not empty. The best tree
(unrooted by default) is stored in best.tree, parsed from mlb based on the
highest log likelihood.
All output to STDOUT is stored in stdout. No STDIN is allowed.
Note that the print function for the class baseml shows only best.tree.
Use str or names to see everything returned in the list.
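As a small illustration of inspecting the whole list, the sketch below uses a hypothetical stand-in object rather than a real paml.baseml result. The element names follow this section; the contents are placeholders.

```r
# Hypothetical stand-in for a paml.baseml() result. Element names are taken
# from the Value section; the strings are placeholders, not real output.
ret <- structure(
  list(
    mlb = c("placeholder line 1 of mlb", "placeholder line 2 of mlb"),
    stdout = c("placeholder line 1 of stdout"),
    best.tree = "(a, b, c);"
  ),
  class = "baseml"
)

names(ret)         # which elements were returned
str(unclass(ret))  # full structure, bypassing the class-specific print
ret$stdout         # one element, as a character vector of lines
```

With a real result such as ret.baseml from the Examples, the same calls apply directly, for example names(ret.baseml).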
Warning(s)
Read PAML's original documentation carefully before using the paml.baseml
function. paml.baseml may not be ported well from baseml of PAML, so please
double check against the standalone version.
baseml may not be a well designed program, and may run slowly. If it gets
stuck, all temporary files are stored in a directory obtained by
tempfile("/paml.baseml.").
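After an interrupted run, leftover working directories can be looked up from R. This is a minimal sketch that assumes the directories sit directly under tempdir(), which is where tempfile() places its results by default. The variable name leftover is illustrative.

```r
# Assumption: working directories are created under tempdir() with names
# that start with "paml.baseml.", as tempfile("/paml.baseml.") implies.
leftover <- list.files(tempdir(), pattern = "^paml\\.baseml\\.",
                       full.names = TRUE)
leftover

# Uncomment to delete them once they are no longer needed.
# unlink(leftover, recursive = TRUE)
```

The result is an empty character vector when nothing was left behind in the current R session.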
baseml has its own options and settings, which may differ from those of
phyclust and ape. For example, the following is from PAML's documentation:
“In PAML, a rooted tree has a bifurcation at the root, while an unrooted
tree has a trifurcation or multifurcation at the root.” That is,
paml.baseml uses a rooted result for an unrooted tree, as well as for a
rooted tree.
baseml also needs a sequence file, which paml.baseml dumps from R
(duplicated from memory), and this file can be very big if the sequences
are too long or the number of sequences is too large. Also, paml.baseml may
take a long time to search for the best tree if the data are large or
initial trees are not provided.
Author(s)
Yang, Z. (1997) and Yang, Z. (2007)
Maintainer: Wei-Chen Chen wccsnow@gmail.com
References
Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/
Yang, Z. (1997) “PAML: a program package for phylogenetic analysis by maximum likelihood”, Computer Applications in the Biosciences, 13, 555-556.
Yang, Z. (2007) “PAML 4: a program package for phylogenetic analysis by maximum likelihood”, Molecular Biology and Evolution, 24, 1586-1591. http://abacus.gene.ucl.ac.uk/software/paml.html
See Also
Examples
## Not run:
library(phyclust, quietly = TRUE)
paml.baseml.show.default()
### Generate data.
set.seed(123)
ret.ms <- ms(nsam = 5, nreps = 1, opts = "-T")
ret.seqgen <- seqgen(opts = "-mHKY -l40 -s0.2", newick.tree = ret.ms[3])
(ret.nucleotide <- read.seqgen(ret.seqgen))
X <- ret.nucleotide$org
seqname <- ret.nucleotide$seqname
### Run baseml.
opts <- paml.baseml.control(model = 4, clock = 1)
(ret.baseml <- paml.baseml(X, seqname = seqname, opts = opts))
(ret.baseml.init <- paml.baseml(X, seqname = seqname, opts = opts,
                                newick.trees = ret.ms[3]))
ret.ms[3]
### Unrooted tree.
opts <- paml.baseml.control(model = 4)
(ret.baseml.unrooted <- paml.baseml(X, seqname = seqname, opts = opts))
### More information.
opts <- paml.baseml.control(noisy = 3, verbose = 1, model = 4, clock = 1)
ret.more <- paml.baseml(X, seqname = seqname, opts = opts)
# ret.more$stdout
### Plot trees
par(mfrow = c(2, 2))
plot(read.tree(text = ret.ms[3]), main = "true")
plot(read.tree(text = ret.baseml$best.tree), main = "baseml")
plot(read.tree(text = ret.baseml.init$best.tree), main = "baseml with initial")
plot(unroot(read.tree(text = ret.baseml.unrooted$best.tree)),
     main = "baseml unrooted")
## End(Not run)