fit_incremental_angmix {BAMBI}  R Documentation 
Stepwise fitting of angular mixture models with incremental component sizes and optimum model selection
Description
Stepwise fitting of angular mixture models with incremental component sizes and optimum model selection
Usage
fit_incremental_angmix(
model,
data,
crit = "LOOIC",
start_ncomp = 1,
max_ncomp = 10,
L = NULL,
fn = mean,
fix_label = NULL,
form = 2,
start_par = NULL,
prev_par = TRUE,
logml_maxiter = 10000,
return_all = FALSE,
save_fits = FALSE,
save_file = NULL,
save_dir = "",
silent = FALSE,
return_llik_contri = (crit %in% c("LOOIC", "WAIC")),
use_best_chain = TRUE,
alpha = 0.05,
bonferroni_alpha = TRUE,
bonferroni_adj_type = "decreasing",
...
)
Arguments
model 
angular model whose mixtures are to be fitted. Available choices are 
data 
data matrix (if bivarate, in which case it must have two columns) or vector. If outside, the values
are transformed into the scale 
crit 
model selection criteria, one of 
start_ncomp 
starting component size. A single component model is fitted if 
max_ncomp 
maximum number of components allowed in the mixture model. 
L 
HMC tuning parameter (trajectory length) passed to fit_angmix. Can be a numeric vetor (or scalar), in which case
the same 
fn 
function to evaluate on MCMC samples to estimate parameters.
Defaults to 
fix_label 
logical. Should the label switchings on the current fit (only the corresponding "best chain" if 
form 
form of crit to be used. Available choices are 1 and 2. Used only if 
start_par 
list with elements 
prev_par 
logical. Should the MAP estimated parameters from the model with 
logml_maxiter 
maximum number of iterations ( 
return_all 
logical. Should all angmcmc objects obtained during stepwise run be returned? *Warning*: depending on the
sizes of 
save_fits 
logical. Should the intermediate angmcmc objects obtained during stepwise run be saved
to file using save? Defaults to TRUE. See 
save_file , save_dir 

silent 
logical. Should the current status (such as what is the current component labels, which job is being done etc.)
be printed? Defaults to 
return_llik_contri 
passed to fit_angmix. By default, set to 
use_best_chain 
logical. Should only the "best" chain obtained during each intermediate fit be used during computation of model selection criterion? Here "best" means the chain with largest (mean over iterations) logposterior density. This can be helpful if one of the chains gets stuck at local optima. Defaults to TRUE. 
alpha 
significance level used in the test H_0K: expected log predictive density (elpd) for the fitted model with K components >= elpd for the fitted model
with K + 1 components if 
bonferroni_alpha 
logical. Should a Bonferroni correction be made on the test size 
bonferroni_adj_type 
character string. Denoting type of Bonferroni adjustment to make.
Possible choices are 
... 
additional arguments passed to fit_angmix. 
Details
The goal is to fit an angular mixture model with an optimally chosen component size K.
To obtain an optimum K, mixture models with incremental component sizes
between start_ncomp
and max_ncomp
are fitted incrementally using fit_angmix,
starting from K = 1.
If the model selection criterion crit
is "LOOIC"
or "WAIC"
, then a test of hypothesis
H_0K: expected log predictive density (elpd) for the fitted model with K components >= elpd for the fitted model
with K + 1 components, is performed at every K >= 1. The teststatistic used for the test is an approximate zscore
based on the normalized estimated elpd difference between the two models obtained from compare, which provides
estimated elpd difference along with its standard error estimate. Because the computed standard error of elpd difference
can be overly optimistic when the elpd difference is small (in particular < 4),
a conservative worstcase estimate (equal to twice of the computed standard error)
is used in such cases. To account for multiplicity among the M =
(max_ncomp
 start_ncomp
) possible sequential tests performed,
by default a Bonferroni adjustment to the test level alpha
is made.
Set bonferroni_alpha = FALSE
to remove the adjustment. To encourage
parsimony in the final model, by default (bonferroni_adj_type = "decreasing"
)
a decreasing sequence of adjusted alphas of the form alpha * (0.5)^(1:M) / sum((0.5)^(1:M))
is used. Set bonferroni_adj_type = "equal"
to use equal sequence of adjusted alphas (i.e., alpha/M
) instead.
The incremental fitting stops if H_0K cannot be rejected
(at level alpha
) for some K >= 1; this K is then regarded as the optimum number of components.
If crit
is not "LOOIC"
or "WAIC"
then mixture model with the first minimum value of the model selection criterion crit
is taken as the best model.
Note that in each intermediate fitted model, the total number of components (instead of the number of
"nonempty components") in the model is used to estimate of the true component
size, and then the fitted model is penalized for model complexity (via the model selection criterion used).
This approach of selecting an optimal K follows the perspective "let two component specific parameters
be identical" for overfitting mixtures, and as such the Dirichlet prior hyperparameters pmix.alpha
(passed to fit_angmix) should be large. See FruhwirthSchnatter (2011) for more deltails.
Note that the stability of bridge_sampler used in marginal likelihood estimation heavily depends on stationarity of the
chains. As such, while using this criterion, we recommending running the chain long engouh, and setting fix_label = TRUE
for optimal performance.
Value
Returns a named list (with class = stepfit
) with the following seven elements:
fit.all
(if return_all = TRUE
)  a list all angmcmc objects created at each component size;
fit.best
 angmcmc object corresponding to the optimum component size;
ncomp.best
 optimum component size (integer);
crit
 which model comparison criterion used (one of "LOOIC", "WAIC", "AIC", "BIC", "DIC"
or "LOGML"
);
crit.all
 all crit
values calculated (for all component sizes);
crit.best
 crit
value for the optimum component size; and
maxllik.all
 maximum (obtained from MCMC iterations) log likelihood for all fitted models
maxllik.best
 maximum log likelihodd for the optimal model; and
check_min
 logical; is the optimum component size less than max_ncomp
?
References
FruhwirthSchnatter, S.: Label switching under model uncertainty. In: Mengerson, K., Robert, C., Titterington, D. (eds.) Mixtures: Estimation and Application, pp. 213239. Wiley, New York (2011).
Examples
# illustration only  more iterations needed for convergence
set.seed(1)
fit.vmsin.step.15 < fit_incremental_angmix("vmsin", tim8, "BIC", start_ncomp = 1,
max_ncomp = 3, n.iter = 15,
n.chains = 1, save_fits=FALSE)
(fit.vmsin.best.15 < bestmodel(fit.vmsin.step.15))
lattice::densityplot(fit.vmsin.best.15)