| estimateEM {PhylogeneticEM} | R Documentation |
Perform One EM
Description
EstimateEM performs one EM for one given number of shifts. It is called
from function PhyloEM. Its use is mostly internal, and most user
should not need it.
Usage
estimateEM(
phylo,
Y_data,
Y_data_imp = Y_data,
process = c("BM", "OU", "scOU", "rBM"),
independent = FALSE,
tol_EM = list(variance = 10^(-2), value.root = 10^(-2), exp.root = 10^(-2), var.root =
10^(-2), selection.strength = 10^(-2), normalized_half_life = 10^(-2), log_likelihood
= 10^(-2)),
Nbr_It_Max = 500,
method.variance = c("simple", "upward_downward"),
method.init = c("default", "lasso"),
method.init.alpha = c("default", "estimation"),
method.init.alpha.estimation = c("regression", "regression.MM", "median"),
nbr_of_shifts = 0,
random.root = TRUE,
stationary.root = TRUE,
alpha_known = FALSE,
eps = 10^(-3),
known.selection.strength = 1,
init.selection.strength = 1,
max_selection.strength = 100,
use_sigma_for_lasso = TRUE,
max_triplet_number = 10000,
min_params = list(variance = 0, value.root = -10^(5), exp.root = -10^(5), var.root = 0,
selection.strength = 0),
max_params = list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5), var.root =
10^(5), selection.strength = 10^(5)),
var.init.root = diag(1, nrow(Y_data)),
variance.init = diag(1, nrow(Y_data), nrow(Y_data)),
methods.segmentation = c("lasso", "same_shifts", "best_single_move"),
check.tips.names = FALSE,
times_shared = NULL,
distances_phylo = NULL,
subtree.list = NULL,
T_tree = NULL,
U_tree = NULL,
h_tree = NULL,
F_moments = NULL,
tol_half_life = TRUE,
warning_several_solutions = TRUE,
convergence_mode = c("relative", "absolute"),
check_convergence_likelihood = TRUE,
sBM_variance = FALSE,
method.OUsun = c("rescale", "raw"),
K_lag_init = 0,
allow_negative = FALSE,
trait_correlation_threshold = 0.9,
...
)
Arguments
phylo |
A phylogenetic tree of class |
Y_data |
Matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree. |
Y_data_imp |
(optional) imputed data if previously computed, same format as
|
process |
The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details). |
independent |
Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE. |
tol_EM |
the tolerance for the convergence of the parameters. A named list, with items:
|
Nbr_It_Max |
the maximal number of iterations of the EM allowed. Default to 500 iterations. |
method.variance |
Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; of "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward". |
method.init |
The initialization method. One of "lasso" for the LASSO base initialization method; or "default" for user-specified initialization values. Default to "lasso". |
method.init.alpha |
For OU model, initialization method for the selection
strength alpha. One of "estimation" for a cherry-based initialization, using
|
method.init.alpha.estimation |
If method.init.alpha="estimation",
choice of the estimation(s) methods to be used. Choices among "regression",
(method="M" is passed to |
nbr_of_shifts |
the number of shifts allowed. |
random.root |
whether the root is assumed to be random (TRUE) of fixed (FALSE). Default to TRUE |
stationary.root |
whether the root is assumed to be in the stationary state. Default to TRUE. |
alpha_known |
is the selection strength assumed to be known ? Default to FALSE. |
eps |
tolerance on the selection strength value before switching to a BM. Default to 10^(-3). |
known.selection.strength |
if |
init.selection.strength |
(optional) a starting point for the selection strength value. |
max_selection.strength |
the maximal value allowed of the selection strength. Default to 100. |
use_sigma_for_lasso |
whether to use the first estimation of the variance matrix in the lasso regression. Default to TRUE. |
max_triplet_number |
for the initialization of the selection strength value (when estimated), the maximal number of triplets of tips to be considered. |
min_params |
a named list containing the minimum allowed values for the parameters. If the estimation is smaller, then the EM stops, and is considered to be divergent. Default values:
|
max_params |
a named list containing the maximum allowed values for the parameters. If the estimation is larger, then the EM stops, and is considered to be divergent. Default values:
|
var.init.root |
optional initialization value for the variance of the root. |
variance.init |
optional initialization value for the variance. |
methods.segmentation |
For OU, method(s) used at the M step to find new candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move at a time based heuristic. Default to both of them. Using only "lasso" might speed up the function a lot. |
check.tips.names |
whether to check the tips names of the tree against the column names of the data. Default to TRUE. |
times_shared |
(optional) times of shared ancestry of all nodes and tips,
result of function |
distances_phylo |
(optional) phylogenetic distances, result of function
|
subtree.list |
(optional) tips descendants of all the edges, result of
function |
T_tree |
(optional) matrix of incidence of the tree, result of function
|
U_tree |
(optional) full matrix of incidence of the tree, result of function
|
h_tree |
(optional) total height of the tree. |
F_moments |
(optional, internal) |
tol_half_life |
should the tolerance criterion be applied to the phylogenetic half life (TRUE, default) or to the raw selection strength ? |
warning_several_solutions |
whether to issue a warning if several equivalent solutions are found (default to TRUE). |
convergence_mode |
one of "relative" (the default) or "absolute". Should the tolerance be applied to the raw parameters, or to the renormalized ones ? |
check_convergence_likelihood |
should the likelihood be taken into consideration for convergence assessment ? (default to TRUE). |
sBM_variance |
Is the root of the BM supposed to be random and "stationary"? Used for BM equivalent computations. Default to FALSE. |
method.OUsun |
Method to be used in univariate OU. One of "rescale" (rescale the tree to fit a BM) or "raw" (directly use an OU, only available for univariate processes). |
K_lag_init |
Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow of taken too high. Default to 5. |
allow_negative |
whether to allow negative values for alpha (Early Burst).
See documentation of |
trait_correlation_threshold |
the trait correlation threshold to stop the analysis. Default to 0.9. |
... |
Further arguments to be passed to |
Details
See documentation of PhyloEM for further details.
All the parameters monitoring the EM (like tol_EM, Nbr_It_Max, etc.)
can be called from PhyloEM.
Value
An object of class EstimateEM.