optim_tdv_simul_anne {diffval} | R Documentation |
Total Differential Value optimization using a Simulated Annealing (and GRASP) algorithm(s)
Description
This function searches for k
-partitions of the columns of a given matrix
(i.e., a partition of the columns in k
groups), optimizing the Total
Differential Value (TDV) using a stochastic global optimization method
called Simulated Annealing (SANN) algorithm. Optionally, a Greedy
Randomized Adaptive Search Procedure (GRASP) can be used to find a initial
partition (seed) to be passed to the SANN algorithm.
Usage
optim_tdv_simul_anne(
m_bin,
k,
p_initial = NULL,
n_runs = 10,
n_sol = 1,
t_inic = 0.3,
t_final = 1e-06,
alpha = 0.05,
n_iter = 1000,
use_grasp = TRUE,
thr = 0.95,
full_output = FALSE
)
Arguments
m_bin |
A matrix. A phytosociological table of 0s (absences) and 1s (presences), where rows correspond to taxa and columns correspond to relevés. |
k |
A numeric giving the number of desired groups. |
p_initial |
A vector of integer numbers with the partition of the
relevés (i.e., a |
n_runs |
A numeric giving the number of runs. Defaults to 10. |
n_sol |
A numeric giving the number of best solutions to keep in the
final output (only used if |
t_inic |
A numeric giving the initial temperature. Must be greater than 0 and maximum admitted value is 1. Defaults to 0.3. |
t_final |
A numeric giving the final temperature. Must be bounded between 0 and 1. Usually very low values are needed to ensure convergence. Defaults to 0.000001. |
alpha |
A numeric giving the fraction of temperature drop to be used in the temperature reduction scheme (see Details). Must be bounded between 0 and 1. Defaults to 0.05. |
n_iter |
A numeric giving the number of iterations. Defaults to 1000. |
use_grasp |
A logical. Defaults to |
thr |
A numeric giving a threshold value (from 0 to 1 ) with the
probability used to compute the sample quantile, in order to get the best
|
full_output |
A logical. Defaults to |
Details
Given a phytosociological table (m_bin
, with rows corresponding to
taxa and columns corresponding to relevés) this function searches for a
k
-partition (k
, defined by the user) optimizing the TDV, i.e.,
searches, using a SANN algorithm (optionally working upon GRASP solutions),
for a global maximum of TDV (by rearranging the relevés into k
groups).
This function uses two main algorithms:
An optional GRASP, which is used to obtain initial solutions (partitions of
m_bin
) using functionpartition_tdv_grasp()
. Such initial solutions are then submitted to the SANN algorithm.The (main) SANN algorithm, which is used to search for a global maximum of TDV. The initial partition for each run of SANN can be a partition obtained from GRASP (if
use_grasp = TRUE
) or, (ifuse_grasp = FALSE
), a partition given by the user (usingp_initial
) or a random partition (usingp_initial = "random"
).
The SANN algorithm decreases the temperature multiplying the current
temperature by 1 - alpha
according to a predefined schedule, which is
automatically calculated from the given values for t_inic
, t_final
,
alpha
and n_iter
.
Specifically, the cooling schedule is obtained calculating the number of
times that the temperature has to be decreased in order to approximate
t_final
starting from t_inic
. The number of times that the temperature
decreases, say nt
, is calculated by the expression:
floor(n_iter/((n_iter * log(1 - alpha)) / (log((1 - alpha) * t_final /
t_inic))))
.
Finally, these decreasing stages are scattered through the desired
iterations (n_iter
) homogeneously, by calculating the indices of the
iterations that will experience a decrease in temperature using
floor(n_iter / nt * (1:nt))
.
SANN is often seen as an exploratory technique where the temperature
settings are challenging and dependent on the problem. This function tries
to restrict temperature values taking into account that TDV is always
between 0 and 1. Even though, obtaining values of temperature that allow
convergence can be challenging. full_output = TRUE
allows the user to
inspect the behaviour of current.tdv
and check if convergence fails.
Generally, convergence failure can be spotted when final SANN TDV values
are similar to the initial current.tdv
, specially when coming from random
partitions. In such cases, as a rule of thumb, it is advisable to decrease
t_final
.
Value
If full_output = FALSE
(the default), a list with the following
components (the GRASP component is only returned if use_grasp = TRUE
):
- GRASP
A list with at most
n_sol
components, each one containing also a list with two components:- par
A vector with the partition of highest TDV obtained by GRASP;
- tdv
A numeric with the TDV of
par
.
- SANN
A list with at most
n_sol
components, each one containing also a list with two components:- par
A vector with the partition of highest TDV obtained by the (GRASP +) SANN algorithm(s);
- tdv
A numeric with the TDV of
par
.
If full_output = TRUE
, a list with the following components (the GRASP
component is only returned if use_grasp = TRUE
):
- GRASP
A list with
n_runs
components, each one containing also a list with two components:- par
A vector with the partition of highest TDV obtained by GRASP.
- tdv
A numeric with the TDV of
par
.
- SANN
A list with
n_runs
components, each one containing also a list with six components:- current.tdv
A vector of length
n_iter
with the current TDV of each SANN iteration.- alternative.tdv
A vector of length
n_iter
with the alternative TDV used in each SANN iteration.- probability
A vector of length
n_iter
with the probability used in each SANN iteration.- temperature
A vector of length
n_iter
with the temperature of each SANN iteration.- par
A vector with the partition of highest TDV obtained by the (GRASP +) SANN algorithm(s).
- tdv
A numeric with the TDV of
par
.
Author(s)
Jorge Orestes Cerdeira and Tiago Monteiro-Henriques. E-mail: tmh.dev@icloud.com.
Examples
# Getting the Taxus baccata forests data set
data(taxus_bin)
# Removing taxa occurring in only one relevé in order to
# reproduce the example in the original article of the data set
taxus_bin_wmt <- taxus_bin[rowSums(taxus_bin) > 1, ]
# Obtaining a partition that maximizes TDV using the Simulated Annealing
# algorithm
result <- optim_tdv_simul_anne(
m_bin = taxus_bin_wmt,
k = 3,
p_initial = "random",
n_runs = 5,
n_sol = 5,
use_grasp = FALSE,
full_output = TRUE
)
# Inspect the result
# The TDV of each run
sapply(result[["SANN"]], function(x) x$tdv)
# The best partition that was found (i.e., with highest TDV)
result[["SANN"]][[1]]$par
# A TDV of 0.1958471 indicates you are probably reproducing the three
# groups (Estrela, Gerês and Galicia) from the original article. A solution
# with TDV = 0.2005789 might also occur, but note that one group has only two
# elements. For now, a minimum group size is not implemented in function
# optim_tdv_simul_anne() as it is in the function optim_tdv_hill_climb().
# Inspect how the optimization progressed (should increase towards the right)
plot(
result[["SANN"]][[1]]$current.tdv,
type = "l",
xlab = "Iteration number",
ylab = "TDV of the currently accepted solution"
)
for (run in 2:length(result[["SANN"]])) {
lines(result[["SANN"]][[run]]$current.tdv)
}
# Plot the sorted (or tabulated) phytosociological table, using the best
# partition that was found
tabul <- tabulation(
m_bin = taxus_bin_wmt,
p = result[["SANN"]][[1]]$par,
taxa_names = rownames(taxus_bin_wmt),
plot_im = "normal"
)