estimateEM {PhylogeneticEM} | R Documentation |
Perform One EM
Description
estimateEM performs one EM for one given number of shifts. It is called from function PhyloEM. Its use is mostly internal, and most users should not need it.
Usage
estimateEM(
phylo,
Y_data,
Y_data_imp = Y_data,
process = c("BM", "OU", "scOU", "rBM"),
independent = FALSE,
tol_EM = list(variance = 10^(-2), value.root = 10^(-2), exp.root = 10^(-2),
var.root = 10^(-2), selection.strength = 10^(-2), normalized_half_life = 10^(-2),
log_likelihood = 10^(-2)),
Nbr_It_Max = 500,
method.variance = c("simple", "upward_downward"),
method.init = c("default", "lasso"),
method.init.alpha = c("default", "estimation"),
method.init.alpha.estimation = c("regression", "regression.MM", "median"),
nbr_of_shifts = 0,
random.root = TRUE,
stationary.root = TRUE,
alpha_known = FALSE,
eps = 10^(-3),
known.selection.strength = 1,
init.selection.strength = 1,
max_selection.strength = 100,
use_sigma_for_lasso = TRUE,
max_triplet_number = 10000,
min_params = list(variance = 0, value.root = -10^(5), exp.root = -10^(5),
var.root = 0, selection.strength = 0),
max_params = list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5),
var.root = 10^(5), selection.strength = 10^(5)),
var.init.root = diag(1, nrow(Y_data)),
variance.init = diag(1, nrow(Y_data), nrow(Y_data)),
methods.segmentation = c("lasso", "same_shifts", "best_single_move"),
check.tips.names = FALSE,
times_shared = NULL,
distances_phylo = NULL,
subtree.list = NULL,
T_tree = NULL,
U_tree = NULL,
h_tree = NULL,
F_moments = NULL,
tol_half_life = TRUE,
warning_several_solutions = TRUE,
convergence_mode = c("relative", "absolute"),
check_convergence_likelihood = TRUE,
sBM_variance = FALSE,
method.OUsun = c("rescale", "raw"),
K_lag_init = 0,
allow_negative = FALSE,
trait_correlation_threshold = 0.9,
...
)
Arguments
phylo |
A phylogenetic tree of class phylo. |
Y_data |
Matrix of data at the tips, size p x ntaxa. Each row is a trait, and each column is a tip. The column names are checked against the tip names of the tree. |
Y_data_imp |
(optional) imputed data if previously computed, same format as Y_data. |
process |
The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details). |
independent |
Are the traits assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE. |
tol_EM |
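the tolerance for the convergence of the parameters. A named list, with items variance, value.root, exp.root, var.root, selection.strength, normalized_half_life and log_likelihood. Each item defaults to 10^(-2) (see Usage). |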
Nbr_It_Max |
the maximal number of iterations of the EM allowed. Default to 500 iterations. |
method.variance |
Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; or "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward". |
method.init |
The initialization method. One of "lasso" for the LASSO-based initialization method; or "default" for user-specified initialization values. Default to "lasso". |
method.init.alpha |
For OU model, initialization method for the selection
strength alpha. One of "estimation" for a cherry-based initialization, using
|
method.init.alpha.estimation |
If method.init.alpha="estimation",
choice of the estimation(s) methods to be used. Choices among "regression",
(method="M" is passed to |
nbr_of_shifts |
the number of shifts allowed. |
random.root |
whether the root is assumed to be random (TRUE) or fixed (FALSE). Default to TRUE. |
stationary.root |
whether the root is assumed to be in the stationary state. Default to TRUE. |
alpha_known |
is the selection strength assumed to be known? Default to FALSE. |
eps |
tolerance on the selection strength value before switching to a BM. Default to 10^(-3). |
known.selection.strength |
if |
init.selection.strength |
(optional) a starting point for the selection strength value. |
max_selection.strength |
the maximal value allowed of the selection strength. Default to 100. |
use_sigma_for_lasso |
whether to use the first estimation of the variance matrix in the lasso regression. Default to TRUE. |
max_triplet_number |
for the initialization of the selection strength value (when estimated), the maximal number of triplets of tips to be considered. |
min_params |
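a named list containing the minimum allowed values for the parameters. If an estimate falls below its minimum, the EM stops and is considered to be divergent. Default values (see Usage): variance = 0, value.root = -10^(5), exp.root = -10^(5), var.root = 0, selection.strength = 0. |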
max_params |
a named list containing the maximum allowed values for the parameters. If an estimate exceeds its maximum, the EM stops and is considered to be divergent. Default values (see Usage): 10^(5) for each of variance, value.root, exp.root, var.root and selection.strength. |
var.init.root |
optional initialization value for the variance of the root. |
variance.init |
optional initialization value for the variance. |
methods.segmentation |
For OU, method(s) used at the M step to find new candidate shift positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move-at-a-time heuristic. Default to both of them. Using only "lasso" might speed up the function a lot. |
check.tips.names |
whether to check the tip names of the tree against the column names of the data. Default to FALSE (see Usage). |
times_shared |
(optional) times of shared ancestry of all nodes and tips,
result of function |
distances_phylo |
(optional) phylogenetic distances, result of function
|
subtree.list |
(optional) tips descendants of all the edges, result of
function |
T_tree |
(optional) matrix of incidence of the tree, result of function
|
U_tree |
(optional) full matrix of incidence of the tree, result of function
|
h_tree |
(optional) total height of the tree. |
F_moments |
(optional, internal) |
tol_half_life |
should the tolerance criterion be applied to the phylogenetic half-life (TRUE, default) or to the raw selection strength? |
warning_several_solutions |
whether to issue a warning if several equivalent solutions are found (default to TRUE). |
convergence_mode |
one of "relative" (the default) or "absolute". Should the tolerance be applied to the raw parameters ("absolute"), or to the renormalized ones ("relative")? |
check_convergence_likelihood |
should the likelihood be taken into consideration for convergence assessment? (default to TRUE). |
sBM_variance |
Is the root of the BM supposed to be random and "stationary"? Used for BM equivalent computations. Default to FALSE. |
method.OUsun |
Method to be used in univariate OU. One of "rescale" (rescale the tree to fit a BM) or "raw" (directly use an OU, only available for univariate processes). |
K_lag_init |
Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow if taken too high. Default to 0 (see Usage). |
allow_negative |
whether to allow negative values for alpha (Early Burst).
See documentation of |
trait_correlation_threshold |
the trait correlation threshold to stop the analysis. Default to 0.9. |
... |
Further arguments to be passed to |
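As an illustration of the argument formats described above, the following base R sketch builds a trait matrix in the p x ntaxa layout expected for Y_data, together with the named lists used for tol_EM, min_params and max_params. The list values are copied from the Usage section; the tip labels and trait values are made up for the example, and modifyList is only one convenient way to override a single tolerance while keeping every named item.

```r
# Hypothetical tip labels; in practice these come from the tree (phylo$tip.label).
tip_labels <- c("t1", "t2", "t3", "t4", "t5")

# Y_data: one row per trait, one column per tip (p x ntaxa), made-up values.
set.seed(1)
Y_data <- matrix(rnorm(2 * length(tip_labels)), nrow = 2,
                 dimnames = list(c("trait_1", "trait_2"), tip_labels))

# The column names should match the tip names of the tree.
stopifnot(identical(colnames(Y_data), tip_labels))

# Default tolerances, as listed in the Usage section.
tol_EM_default <- list(variance = 10^(-2), value.root = 10^(-2),
                       exp.root = 10^(-2), var.root = 10^(-2),
                       selection.strength = 10^(-2),
                       normalized_half_life = 10^(-2),
                       log_likelihood = 10^(-2))

# Tighten one tolerance while keeping the full set of named items.
tol_EM <- modifyList(tol_EM_default, list(log_likelihood = 10^(-4)))

# Default bounds, as listed in the Usage section.
min_params <- list(variance = 0, value.root = -10^(5), exp.root = -10^(5),
                   var.root = 0, selection.strength = 0)
max_params <- list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5),
                   var.root = 10^(5), selection.strength = 10^(5))

# Default initial variances: identity matrices of size p (number of traits).
var.init.root <- diag(1, nrow(Y_data))
variance.init <- diag(1, nrow(Y_data), nrow(Y_data))
```

Starting from the full default list avoids passing a tol_EM that lacks some of the named items.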
Details
See documentation of PhyloEM for further details. All the parameters monitoring the EM (like tol_EM, Nbr_It_Max, etc.) can be set when calling PhyloEM, which passes them on to estimateEM.
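To make the role of tol_EM, convergence_mode and tol_half_life more concrete, here is a minimal sketch of a per-parameter convergence test. It is not the package's implementation: the helpers has_converged and normalized_half_life are hypothetical, the "relative" criterion is assumed to divide the change by the magnitude of the previous value, and the normalized half-life is assumed to be log(2) / alpha expressed in units of the tree height.

```r
# Hypothetical helper: TRUE when the change between two successive EM
# estimates of a parameter is below the tolerance.
has_converged <- function(old, new, tol, mode = c("relative", "absolute")) {
  mode <- match.arg(mode)
  change <- abs(new - old)
  if (mode == "relative") {
    # Assumed renormalization: scale by the magnitude of the previous value.
    change <- change / abs(old)
  }
  all(change < tol)
}

# Assumed definition: half-life log(2) / alpha, in units of tree height.
normalized_half_life <- function(alpha, h_tree) log(2) / (alpha * h_tree)

tol <- 10^(-2)       # default tolerance from the Usage section
alpha_old <- 1.00    # made-up successive estimates of the selection strength
alpha_new <- 1.005
h_tree <- 2          # made-up tree height

# Criterion applied to the raw selection strength.
raw_ok <- has_converged(alpha_old, alpha_new, tol, mode = "relative")

# Criterion applied to the normalized half-life instead.
half_life_ok <- has_converged(normalized_half_life(alpha_old, h_tree),
                              normalized_half_life(alpha_new, h_tree),
                              tol, mode = "absolute")
```

The sketch does not guard against a previous value of exactly zero in the relative mode; a real implementation would need to handle that case.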
Value
An object of class EstimateEM.
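Since estimateEM is mostly internal, a typical session reaches it through PhyloEM. The sketch below is an assumption-laden illustration rather than an example from the package: the tree and trait values are simulated, K_max is kept very small only to shorten the run, and the arguments phylo, Y_data, process, K_max, random.root, stationary.root, method.selection and progress.bar are used as I understand the PhyloEM interface. Nbr_It_Max is the EM control documented on this page. The code does nothing if PhylogeneticEM or ape is not installed.

```r
# Run only if both packages are installed.
if (requireNamespace("PhylogeneticEM", quietly = TRUE) &&
    requireNamespace("ape", quietly = TRUE)) {
  set.seed(42)
  # Made-up data: a random ultrametric tree with 20 tips and a single trait.
  tree <- ape::rcoal(20)
  Y_data <- matrix(rnorm(20), nrow = 1,
                   dimnames = list("trait_1", tree$tip.label))

  res <- PhylogeneticEM::PhyloEM(
    phylo = tree,
    Y_data = Y_data,
    process = "BM",
    random.root = FALSE,       # fixed root for the BM
    stationary.root = FALSE,
    K_max = 2,                 # small value chosen only to keep the run short
    method.selection = "LINselect",
    progress.bar = FALSE,
    Nbr_It_Max = 100           # EM control forwarded to estimateEM
  )
  print(class(res))
}
```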