adjustModel {SiMRiv} | R Documentation |
Finds ("estimates") simulation input parameters able to replicate a given (real) trajectory, assuming the given species model
Description
Given a trajectory, a type of movement and the time resolution at which the user wants to simulate, this function approximates the values for the simulation input parameters so that the simulated movement is maximally similar to the given trajectory, in terms of general non-spatial patterns. If the user wants to simulate at a higher frequency than real data (which is the norm), the function maximizes the similarity between the real trajectory and the simulated trajectories (at a higher frequency) after downsampled to the same frequency as the real. It does so by running a genetic optimization algorithm.
Usage
adjustModel(realData, species.model, resolution = 10
, resistance = NULL, coords = NULL, angles = NULL
, nrepetitions = 6
, nbins.hist = if(aggregate.obj.hist) c(7, 7, 0) else c(3, 3, 0)
, step.hist.log = TRUE, nlags = 100
, window.size = dim(reference$stats)[1]%/%nlags
, aggregate.obj.hist = TRUE, step.hist.range = c(0, 1)
, popsize = 100, generations = seq(5, 1000, by=5), mprob = 0.2
, parallel = is.null(resistance), trace = TRUE
)
Arguments
realData |
the given trajectory for which the simulation input parameters are to be "estimated", given as a matrix with two columns (coordinates) and assuming that relocations are equally spaced in time. |
species.model |
the species model to adjust, created with |
resolution |
the desired time frequency of the simulations for which parameters will be approximated, as a fraction of the real data, i.e. a value of 20 will simulate movements at a 20-fold higher frequency than real data. |
resistance |
the resistance raster to use in simulations during parameter approximation. |
coords |
the initial coordinates of the simulated individuals (only relevant if |
angles |
the initial angle to which the individual is facing in that start of all simulations (only relevant if |
nrepetitions |
the number of simulations conducted for each solution evaluation during optimization. If >1, the quality of the solutions is computed by comparing the averaged histograms across repetitions, with the real histograms. |
nbins.hist |
a vector with three positive integers defining the number of histogram bins for turning angle histograms, step length histograms and turning angle variation histograms. These bins will be used during optimization to compare simulated trajectories with the real trajectory to infer the quality of the "fit". |
step.hist.log |
set to |
nlags |
the number of time lags within which the standard deviation of the turning angles will be computed during optimization, if |
window.size |
the size (in steps of the real sampling frequency) of the time lags within which the standard deviation of the turning angles will be computed during optimization, if |
aggregate.obj.hist |
if |
step.hist.range |
the quantiles used to define the range of the step length histogram computation. Used if the user wants to exclude outliers. The default is not to exclude outliers, thus |
popsize |
number of solutions to optimize, to pass to |
generations |
number of algorithm generations to run, to pass to |
mprob |
mutation probability, to pass to |
parallel |
set to |
trace |
set to |
Details
This function finds possible parameters for the simulation (solutions), so that the resulting movements are as similar as possible
to the given real trajectory, in terms of their intrinsic properties measured by step lengths and variation in turning angles. The input parameter
approximations are found using a multiobjective genetic algorithm (NSGA-II, Deb et al. 2002). The algorithm minimizes a vector
of N objectives (N=sum(nbins.hist) if aggregate.obj.hist == FALSE, N=sum(nbins.hist > 0) otherwise)
whose values are computed by the absolute differences between each pair of bins (real and simulated) of up to three histograms:
an histogram of the step lengths, an histogram of turning angles and/or and histogram of the standard deviation in turning angles
computed in a moving time window along each trajectory. Using either of these histograms (and the respective number of bins) is
specified in the parameter nbins.hist
, which has three elements, one for each histogram. A value of 0 tells the function
not to use that corresponding histogram.
SiMRiv simulations are intended to reproduce the fine-scale movement steps, unlike what is normally collected in field data.
Hence, simulations should be conducted with a much higher time frequency than provided by the real data, which poses challenges for parameterization.
This function incorporates this difference in the time scale during optimization (resolution
), allowing the user
to find the input parameters for simulations at a much higher frequency, which, when downsampled to the real data's time
frequency, will present similar patterns. The higher the frequency, the more flexibility the model has to adjust to real data,
but the more possible solutions may exist to achieve the same result.
There are no limits to the number of parameters that can be approximated, but obviously, the higher the number, the larger the solution space, so,
in theory, the longer the algorithm has to run in order to converge.
The number of parameters to approximate is defined by the user by providing a speciesModel
. This defines how many states and which types of states are to be "fit" to the data.
See speciesModel
for details. Trials have shown that even when the number of parameters to approximate is high (e.g. 12 parameters
for "fitting" a 3-state movement model), the algorithm converges rapidly if the real movement suits such model.
However, as in any other method, a compromise should be sought. A good starting point is to provide a two-state species model,
in which both states are Correlated Random Walks. This model involves the approximation of 6 parameters and is sufficiently flexible for simulating a variety of movements, while not overly complex.
Note that all complex models can accomodate to simpler ones (i.e. the simpler models are special cases of the complex ones).
To assess the convergence of the algorithm, an utility plotting function is provided, see the example below and generationPlot
for details.
Value
The object returned by nsga2
(package mco
), see details therein. If generations
is a vector (which is recommended,
for assessing convergence), this object contains the approximated input parameter values in each generation given in the vector. See examples
for easily plotting results.
Note
This function is an experimental feature, here provided only to guide the user on how to parameterize the simulations. Care must be taken when interpreting the results, at least by assessing algorithm convergence and visually comparing simulations with the approximated parameters to the real data (see examples).
References
Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2), 182-197.
See Also
Examples
## Not run:
library(SiMRiv)
library(adehabitatLT)
## simulate "real" data, a Levy walk, for which we want to
## parameterize our model
real.data <- simm.levy(1:500)[[1]][, 1:2]
## Define a species model to adjust. Let's assume we don't know
## much about what kind of real data we have, hence define
## a flexible model: a two-state correlated random walk model
## with variable step lengths.
## This model implies "estimating" 6 parameters:
## - turning angle correlation of state 1 [0, 1]
## - turning angle correlation of state 2 [0, 1]
## - switching probability S1 -> S2 [0, 1]
## - switching probability S2 -> S1 [0, 1]
## - maximum step length of state 1 [0, ?]
## - maximum step length of state 2 [0, ?]
## Let's assume we want to simulate at a 20 times higher time frequency
## than real data.
## In order to allow our model to adjust to real data, we have
## to provide a maximum allowable step length to the optimization algorithm
## that allows to recover real data after downsampling 20 times.
## Let's make a simple calculation of the step lengths of real data:
tmp <- sampleMovement(real.data)
## and compute a good maximum allowed step length during optimization
## using the observed maximum divided by 20 (because each real step
## will comprise 20 simulated steps)
max.step.length <- max(tmp$stat[, "steplengths"]) / 20
## and finally build the species model with it.
## Note: "CRW.CRW.sl" is the short name for the model we want,
## as defined above
species.model <- speciesModel("CRW.CRW.sl", steplength = max.step.length)
## now run optimization
sol <- adjustModel(real.data, species.model, resol = 20
, nbins.hist = c(3, 3, 0), step.hist.log = TRUE)
## After finishing, we can extract the input parameters of the optimized
## solutions (100 by default) in the last generation (generation 1000
## by default):
pars <- sol[[length(sol)]]$par
## now we can take the optimized solutions and reconstruct species
## based on them:
optimized.species <- apply(pars, 1, species.model)
## and make some simulations with those optimized species.
## Plot real trajectory
par(mfrow = c(2, 2), mar = c(0, 0, 1, 0))
plot(real.data, type = "l", asp = 1, axes = F, main = "Real")
## plot three simulated trajectories with optimized species
for(i in 1:3) {
# remember we want to simulate at a 20 times higher frequency
# so we do 500 (real data) x 20 steps
sim <- simulate(optimized.species[[i]], 500 * 20)
# now we downsample frequency to match real
samp <- sampleMovement(sim, 20)
# and plot the simulated trajectory before and after
# downsampling 20 times.
plot(sim[, 1:2], type = "l", asp = 1, axes = F, col = "gray"
, main = "Simulated and downsampled")
lines(samp$relocs, col = "black")
}
## Now plot the evolution of parameters along algorithm's generations.
## This is good to assess whether the final solutions converged
## but see ?generationPlot for details
generationPlot(sol, species.model)
## End(Not run)