constrainParPaleo {paleotree} | R Documentation |
Constrain Parameters for a Model Function from paleotree
Description
This function constrains a model to make submodels with fewer parameters,
using a structure and syntax taken from the function constrain
in Rich Fitzjohn's package diversitree
.
Usage
constrainParPaleo(f, ..., formulae = NULL, names = parnames(f), extra = NULL)
Arguments
f |
A function to constrain. This function must be of the |
... |
Formulae indicating how the function should be constrained. See details and examples for lengthy discussion. |
formulae |
Optional list of constraints, possibly in addition to
those in |
names |
Optional Character vector of names, the same length as
the number of parameters in |
extra |
Optional vector of additional names that might appear on
the RHS of constraints but do not represent names in the function's
|
Details
This function is based on (but does not depend on) the function constrain
from the package diversitree
. Users should refer to this parent function for
more detailed discussion of model constraint usage and detailed examples.
The parent function was forked to add functionality necessary for dealing with the
high parameter count models typical to some paleontological analyses, particularly
the inverse survivorship method. This necessitated that the new function be entirely
separate from its parent. Names of functions involved (both exported and not)
have been altered to avoid overlap in the package namespaces. Going forward,
the paleotree
package maintainer (Bapst) will try to be vigilant with
respect to changes in constrain
in the original package, diversitree
.
Useful information from the diversitree
manual (11/01/13):
"If f
is a function that takes a vector x
as its first
argument, this function returns a new function that takes a
shorter vector x
with some elements constrained in some way;
parameters can be fixed to particular values, constrained to be the
same as other parameters, or arbitrary expressions of free
parameters."
In general, formulae should be of the structure:
LHS ~ RHS
...where the LHS is the 'Parameter We Want to Constrain' and the RHS is whatever we are constraining the LHS to, usually another parameter. LHS and RHS are the 'left-hand side' and 'right-hand side' respectively (which I personally find obscure).
Like the original constrain
function this function is based on,
this function cannot remove constraints previously placed on a model
object and there may be cases in which the constrained function may not
make sense, leading to an error. The original function will sometimes
issue nonsensical functions with an incorrect number/names of parameters
if the parameters to be constrained are given in the wrong order in
formulae.
Differences from the original constrain
function from diversitree
This forked paleotree
version of constrain has two additional features,
both introduced to aid in constraining models with a high number of
repetitive parameters. (I did not invent these models, so don't shoot the messenger.)
First, it allows nuanced control over the constraining of many
parameters simultaneously, using the all
and match
descriptors. This
system depends on parameters being named as such: name.group1.group2.group3
and so on. Each 'group' is reference to a system of groups, perhaps referring to a
time interval, region, morphology, taxonomic group or some other discrete
characterization among the data (almost all functions envisioned for
paleotree are for per-taxon diversification models). The number of group systems
is arbitrary, and may be from zero to a very large number; it depends on the
'make' function used and the arguments selected by the user. For example, the
parameter x.1
would be for the parameter x
in the first group of the first group
system (generally a time interval for most paleotree
functions). For a more
complicated exampled, with the parameter x.1.3.1
, the third group
for the second group system (perhaps this taxonomic data point has a morphological
feature not seen in some other taxa) and group 1 of the third group system (maybe
biogeographic region 1? The possibilities are endless depending on user choices!).
The all
option work like so: if x.all ~ x.1
is given as a formulae, then all x
parameters will be constrained to equal x.1
. For example, if there is x.1
, x.2
,
x.3
and x.4
parameters for a model, x.all ~ x.1
would be equivalent to
individually giving the formulae x.2~x.1
, x.3~x.1
and x.4~x.1
. This
means that if there are many parameters of a particular type (say, 50 x
parameters) it is easy to constrain all with a short expression. It is not
necessary that the The all
term can be used anywhere in the name of the parameter
in a formulae, including to make all parameters for a given group
the same.
Furthermore, the LHS and RHS don't need to be same parameter group, and both can
contain all
statements, even multiple all
statements. Consider these
examples, each of which are legal, acceptable uses:
x.all ~ y.1
Constrains all values of the parameter x for every group to be equal to the single value for the parameter
y
for group 1 (note that there's only a single set of groups).all.1 ~ x.1
Constrains all parameters for the first group to equal each other, here arbitrary named
x.1
. For example, if there is parameters namedx.1
,y.1
andz.1
, all will be constrained to be equal to a single parameter value.x.all.all ~ y.2.3
Constrains all values for x in any and all groups to equal the single value for
y
for group 2 (of system 1) and group 3 (of system 2).x.all ~ y.all
Constrains all values of x for every group and y for every group to be equal to a single value, which by default will be reported as
y.1
The match
term is similar, allowing parameter values from the same group
to be quickly matched and made equivalent. These match
terms must have a
matching (cue laughter) term both in the corresponding LHS and RHS of the formula.
For example, consider x.match ~ y.match
where there are six parameters: x.1
,
x.2
, x.3
, y.1
, y.2
and y.3
.
This will effectively constrain x.1 ~ y.1
, x.2 ~ y.2
and x.3 ~ y.3
. This is efficient for cases where we have some parameters that
we often treat as equal. For example, in paleontology, we sometimes make a
simplifying assumption that birth and death rates are equal in multiple
time intervals. Some additional legal examples are:
x.match.1 ~ y.match.1
This will constrain only parameters of
x
andy
to to equal each other if they both belong to the same group for the first group system AND belong to group 1 of the first group.all.match. ~ x.match
This will constrain all named parameters in each group to equal each other; for example, if there are parameters
x.1
,y.1
,z.1
,x.2
,y.2
andz.2
, this will constrain them such thaty.1 ~ x.1
,z.1 ~ x.1
,y.2 ~ x.2
andz.2 ~ x.2
, leavingx.1
andx.2
as the only parameters effectively.
There are two less fortunate qualities to the introduction of the above terminology.
Unfortunately, this package author apologizes that his programming skills are
not good enough to allow more complex sets of constraints, as would be typical
with the parent function, when all
or match
terms are included. For example,
it would not be legal to attempt to constraint y.all ~ x.1 / 2
, where the user
presumably is attempting to constrain all y values to equal the x parameter
to equal half of the x
parameter for group 1. This will not be parsed as such
and should return an error. However, there are workarounds, but they require
using constrainParPaleo
more than once. For the above example, a user could
first use y.all ~ y.1
constraining all y values to be equal. Then a user
could constrain with the formula y.1 ~ x.1 / 2
which would then constrain
y.1
(and all the y
values constrained to equal it) to be equal to the desired
fraction.
Furthermore, this function expects that parameter names don't already have
period-separated terms that are identical to all
or match
.
No function in paleotree
should produce such natively.
If such were to occur, perhaps by specially replacing parameter names,
constrainParPaleo
would confuse
these terms for the specialty terms described here.
Secondly, this altered version of constrain handles the parameter bounds included as
attributes in functions output by the various 'make
' functions. This means that if
x.1 ~ y.1
is given, constrainParPaleo
will test if the bounds on x.1
and y.1
are the same.
If the bounds are not the same, constrainParPaleo
will return an error.
This is important, as some models in paleotree may make a parameter a rate (bounded
zero to some value greater than one) or a probability (bounded zero to one),
depending on user arguments. Users may not realize these differences and, in many
cases, constraining a rate to equal a probability is nonsense (absolute poppycock!).
If a user really wishes to constrain two parameters with different bounds to be equal
(I have no idea why anyone would want to do this), they can use the parameter bound
replacement functions described in modelMethods
to set the parameter
bounds as equal. Finally, once parameters with the same bounds are constrained, the
output has updated bounds that reflect the new set of parameters
for the new constrained function.
Value
Modified from the diversitree
manual:
This function returns a constrained function that can be passed
through to the optimization functions of a user's choice, such as
optim
, find.mle
in diversitree
or mcmc
.
It will behave like any other function. However, it has a modified
class
attribute so that some methods will dispatch differently:
parnames
, for example, will return the names of the
parameters of the constrained function and parInit
will
return the initial values for those same constrained set of parameters.
All arguments in addition to x
will be passed through to the
original function f
.
Additional useful information from the diversitree
manual (11/01/13):
For help in designing constrained models, the returned function has
an additional argument pars.only
, when this is TRUE
the
function will return a named vector of arguments rather than evaluate
the function (see Examples).
Author(s)
This function (and even this help file!) was originally written by Rich
Fitzjohn for his library diversitree
, and subsequently rewritten
and modified by David Bapst.
References
FitzJohn, R. G. 2012. diversitree
: comparative phylogenetic analyses of
diversification in R. Methods in Ecology and Evolution 3(6):1084-1092.
See Also
As noted above, this function is strongly based on (but does not depend on) the
function constrain
from the library diversitree
.
Examples
# simulation example with make_durationFreqCont, with three random groups
set.seed(444)
record <- simFossilRecord(
p = 0.1,
q = 0.1,
nruns = 1,
nTotalTaxa = c(30,40),
nExtant = 0
)
taxa <- fossilRecord2fossilTaxa(record)
rangesCont <- sampleRanges(taxa,r = 0.5)
# create a groupings matrix
grp1 <- matrix(
sample(1:3,nrow(taxa),replace = TRUE), , 1)
likFun <- make_durationFreqCont(rangesCont, groups = grp1)
# can constrain both extinction rates to be equal
constrainFun <- constrainParPaleo(likFun, q.2 ~ q.1)
#see the change in parameter names and bounds
parnames(likFun)
parnames(constrainFun)
parbounds(likFun)
parbounds(constrainFun)
# some more ways to constrain stuff!
#constrain all extinction rates to be equal
constrainFun <- constrainParPaleo(likFun, q.all ~ q.1)
parnames(constrainFun)
#constrain all rates for everything to be a single parameter
constrainFun <- constrainParPaleo(likFun, r.all ~ q.all)
parnames(constrainFun)
#constrain all extinction rates to be equal & all sampling to be equal
constrainFun <- constrainParPaleo(likFun, q.all ~ q.1, r.all ~ r.1)
parnames(constrainFun)
#similarly, can use match.all to make all matching parameters equal each other
constrainFun <- constrainParPaleo(likFun, match.all ~ match.all)
parnames(constrainFun)
#Constrain rates in same group to be equal
constrainFun <- constrainParPaleo(likFun, r.match ~ q.match)
parnames(constrainFun)