R: Jointly infer constant and time-dependent parameters of a...

infer.timedeppar {timedeppar}

R Documentation

Jointly infer constant and time-dependent parameters of a dynamic model given time-series data

Description

This function draws a Markov Chain from the posterior of constant and time-dependent parameters (following Ornstein-Uhlenbeck processes) of a dynamic model. The dynamic model is specified by a function that calculates the log likelihood for given time-series data. The Ornstein-Uhlenbeck processes of time-dependent processes are characterized by there mean (mean), standard deviation (sd), and a rate parameter (gamma) that quantifies temporal correlation.

Usage

infer.timedeppar(
  loglikeli = NULL,
  loglikeli.keepstate = FALSE,
  param.ini = list(),
  param.range = list(),
  param.log = logical(0),
  param.logprior = NULL,
  param.ou.ini = numeric(0),
  param.ou.fixed = numeric(0),
  param.ou.logprior = NULL,
  task = c("start", "continue", "restart"),
  n.iter = NA,
  cov.prop.const.ini = NA,
  cov.prop.ou.ini = NA,
  scale.prop.const.ini = NA,
  scale.prop.ou.ini = NA,
  control = list(),
  res.infer.timedeppar = list(),
  verbose = 0,
  file.save = "",
  ...
)

Arguments

`loglikeli`	function that calculates the log likelihood of the model for given constant or time-dependent parameters and given observational data. The parameters are passed as a named list in the first argument of the function. The list elements are either scalar values representing constant parameters or two-column matrices with columns for time points and values for time-dependent parameters. If the argument `loglikeli.keepstate` is `FALSE` no further arguments are needed (but can be provided, see below). In this case, the function should return the log likelihood as a single double value. If the argument `loglikeli.keepstate` is `TRUE`, the second argument provides the time range over which a time-dependent parameter was changed or NA if the full simulation time has to be evaluated, and the third argument provides the state of the functon at the last successful call. This allows the function to only calculate and return modifications to that previous state. In this case, the function has to return a list with the log likelihood value as its first element and the current state of the function as the second argument. This state can be an R variable of an arbitrary data type. The version from the last accepted MCMC step will be returnde at the next call. Further arguments provided to `infer.timedeppar` will be passed to this function.
`loglikeli.keepstate`	boolean to indicate which kind of interface to the likelihood function is used. See argument `loglikeli` for details.
`param.ini`	named list of initial vallues of parameters to be estimated. scalar initial values for constant parameters, two-column matrices for time and parameter values for time-dependent parameters (values of time-dependent parameters may be NA and are then drawn from the Ornstein-Uhlenbeck process). The list `param.ini` needs to be a legal and complete first element of the function passed by the argument `loglikeli`. For each time-dependent parameter with name `<name>` initial values or fixed value of the parameters `<name>_mean`, `<name>_sd` and `<name>_gamma` must be provided in the arguments `param.ou.ini` or `param.ou.fixed`, respectively. These parameters represent the mean, the asymptotic standard deviation, and the rate parameter of the Ornstein-Uhlenbeck process. If these parameters are given in the argument `param.ou.ini`, they are used as initial condition of the inference process and the parameters are estimated, if they are given in the argument `param.ou.fixed`, they are assumed to be given and are kept fixed.
`param.range`	named list of ranges (2 element vectors with minimum and maximum) of parameters that are constrained (non-logarithmic for all parameters)
`param.log`	named vector of logicals indicating if inference should be done on the log scale (parameters are still given and returned on non-log scales). For time-dependent parameters, selecting this option implies the use of a lognormal marginal for the Ornstein-Uhlenbeck process. This means that the parameter is modelled as exp(Ornstein-Uhlenbeck), but mean and standard devistion of the process are still on non-log scales.
`param.logprior`	function to calculate the (joint) log prior of all estimaged constant parameters of the model. The function gets as its argument a named vector of the values of the estimaged constant parameters to allow the function to identify for which parameters a joint prior is required in the current setting).
`param.ou.ini`	named vector of initial values of parameters of the Ornstein-Uhlenbeck processes of time-dependent parameters; see description of argument `param.ini`.
`param.ou.fixed`	named vector of values of parameters of the Ornstein-Uhlenbeck processes of time-dependent parameters that are kept fixed rather than being extimated. If all process parameters are kept fixed, these names are `<name>_mean`, `<name>_sd` and `<name>_gamma` for each time-dependent parameter with name `<name>`; see description of the argument `param.ini`. The values specified in `param.fixed` are ignored if the parameter is also given in the argument `param.ini`; in this case it is estimated.
`param.ou.logprior`	function to calculate the (joint) log prior of all estimated parameters of the Ornstein-Uhlenbeck processes of a single time-dependent parameter. The function gets as its argument a named vector of the values of the process parameters to estimated. These names are a subset of `<name>_mean`, `<name>_sd` and `<name>_gamma`; see description of the argument `param.ini`. The function has to work for each time-dependent parameter by being sensitive to the parameter names.
`task`	Which task to perform (default value: "start"): `"start"`: start an inference process from scratch based on the arguments of the function. The argument `res.infer.timedeppar` is ignored. `"continue"`: continue a Markov chain from a previous call to `infer.timedeppar`. The results of a previous call have to be provided as the argument `res.infer.timedeppar` in the form of an object of type `timedeppar`. To guarantee convergence of the chain, all numerical specifications including the final state of the chain are taken from the object provided by `res.infer.timedeppar` and the actual arguments of the function are ignored except `n.iter` which specifies the number of iterations to be added to the chain. `"restart"`: A new chain is started from the last point of a previous chain except if the argument `param.ini` is provided. The results of a previous call have to be provided as the argument `res.infer.timedeppar` in the form of an object of type `timedeppar`. Likelihood, prior pdf functions, initial proposal covariance matrices, scales and control parameters are taken from the previous chain unless explicitly provided.
`n.iter`	number of iterations of the Markov chain to be performed (default value: 10000).
`cov.prop.const.ini`	scaled covariance matrix of the proposal distribution for the Metropolis step of constant parameters. The proposal distribution of the Metropolis step is a normal distribution centered at the last point of the chain with a covariance matrix equal to `scale.prop.const^2 * cov.prop.const`. Note that if `param.log` is `TRUE` for a parameter, then the proposal is evaluated at the log scale of the parameter. During the adaptation phase, the covariance matrix is periodically adapted to the covariance matrix of the current sample and the scale to get a reasonable acceptance rate. After the adaptation phase, both variables are kept constant to guarantee convergence.
`cov.prop.ou.ini`	list of scaled covariance matrices of the proposal distributions for the Metropolis step of the parameters of the Ornstein-Uhlenbeck processes of time-dependent parameters. The proposal distribution of the Metropolis step for the process parameters of the time-dependent parameter i is a normal distribution centered at the last point of the chain with a covariance matrix equal to `cov.prop.ou[[i]] * scale.prop.ou[i]^2`. Note that if `param.log` is `TRUE` for a parameter, then the proposal for the mean is evaluated at the log scale of the parameter. This is anyway the case for the standard deviation and the rate parameter of the Ornstein-Uhlenbeck process. During the adaptation phase, the covariance matrices are periodically adapted to the covariance matrix of the current sample and the scale to get a reasonable acceptance rate. After the adaptation phase, both variables are kept constant to guarantee convergence.
`scale.prop.const.ini`	scale factor for the covariance matrix of the Metropolis step for constant parameters with a proposal distribution equal to a normal distribution centered at the previous point of the chain and a covariance matrix equal to `scale.prop.const^2 * cov.prop.const`. During the adaptation phase, the covariance matrix is periodically adapted to the covariance matrix of the current sample and the scale to get a reasonable acceptance rate. After the adaptation phase, both variables are kept constant to guarantee convergence.
`scale.prop.ou.ini`	vector of scale factors for the covariance matrices of the Metropolis step for parameters of Ornstein-Uhlenbeck processes with a proposal distribution equal to a normal distribution centered at the previous point of the chain and a covariance matrix for the time dependent parameter i equal to `cov.prop.ou[[i]] * scale.prop.ou[i]^2`. During the adaptation phase, the covariance matrix is periodically adapted to the covariance matrix of the current sample and the scale to get a reasonable acceptance rate. After the adaptation phase, both variables are kept constant to guarantee convergence.
`control`	list of control parameters of the algorithm: `n.interval`: number of sub-intervals into which the time domain is splitted to infer the time-dependent parameters; either scalar for universal choice for all parameters or named vector for parameter-specific choices (default value: 50; this number must be increased if the acceptance rates of the time-dependent parameters are very low, it can be decreased if they are high); `min.internal`: minimum number of internal points in an interval (default value: 1; may be increased if time resolution is high). `splitmethod`: method used for random splitting of time domain into sub-intervals. Possible values: `"modunif"`: modification of uniform intervals; `"random"`: random split (higher variability in inverval lengths); `"weighted"`: weighted random split leading to shorter intervals where the acceptance frequency is low; `"autoweights"`: use weighted random split but adjusts weights adaptively. (default value: `"modunif"`). `interval.weights`: numerical vector or named list of numerical vectors (by time-dependent parameter) of weights for sampling interval boundaries (the length(s) of the vector(s) must be equal to the time series length in the parameter specification). The weight vectors do not have to be normalized. The weights are used if the parameter `splitmethod` is equal to `"weighted"` or as optional initial weights if `splitmethod` is equal to `"autoweights"`. `n.autoweighting`: number of past iterations to consider for weight calculation for `splitmethod` `"autoweights"` (default value: 1000). Note that the calculation of weights only starts after `n.autoweights` iterations and that only stored points are considerd so that the number of points considered is equal to `n.autoweighting`/`thin`. `offset.weighting`: offset used to caluclate weights from apparent acceptance frequencies for `splitmethod` `"autoweights"` (default value: 0.05). `n.widening`: number grid points used to widen areas of high weight for `splitmethod` `"autoweights"` (default value: 10). `n.timedep.perstep`: number of updates of the time-dependent parameter(s) before updating the constant parameters (default value: 1). `n.const.perstep`: number of Markov chain steps for the constant parameters to be performed between updating the time-dependent parameters (default value: 1). `n.adapt`: number of iterations of the Markov chain during which adaptation is made (default value: 2000; only during this phase, the covariance matrix and the scaling factors are adapted). `n.adapt.scale`: number of iterations after which the acceptance rate is checked for potentially adapting the scaling factor (default value: 30). `n.adapt.cov`: number of iterations of the Markov chain, after which the covariance matrix of the proposal distribution is adapted (default value: 900; 0 means no adaptation of the covariance matrix; note that after `control$n.adapt` iterations adaptation is turned off; for this reason, after the last multiple of `n.adapt.cov` below `n.adapt` there should be sufficient iterations left to adapt the scaling factors). `f.reduce.cor`: factor by which sample correlations are reduced when constructing the covariance matrix of the proposal distribution (default value: 0.90). `f.accept.decscale`: acceptance rate below which the proposal scaling factor is decreased during the adaptation phase (default value: 0.05). `f.accept.incscale`: acceptance rate above which the proposal scaling factor is increased during the adaptation phase (default value: 0.30). `f.max.scalechange`: max. factor for changing proposal distribution scale from reference (default value: 10; reference is either initial value or modified value when the covariance matrix was adapted). `f.sample.cov.restart`: fraction of previous samples to be used to calculate the covariance matrix of proposal distribution when restarting inference (default value: 0.3; the last part of the samples is used). `thin`: thinning for storing Markov chain results (default value: 1). `n.save`: number of iterations after which the results are (periodically) saved (default value: 1000). `save.diag`: save diagnostic information about acceptance ratio, acceptance, and interval lengths for inference of the time-dependent parameters.
`res.infer.timedeppar`	results of a previous call to this function. These results are ignored if the argument `task` is equal to `"start"`, but it is needed for the tasks `"continue"` and `"restart"`.
`verbose`	integer parameter indicating the level of progress reporting: 0: no reporting; 1: reporting of thinned and accepted Markov Chain steps and of adapted proposal covariance matrices; 2: reporting of proposals and accepted steps before thinning.
`file.save`	if non-empty string, the intermediate results are saved to this file as variable `res` in a workspace after every `control$n.save` iterations (the extension `.RData` will be appended to the file name).
`...`	additional parameters passed to the function `loglikeli`.

Value

class of type timedeppar with the following elements:

package: package timedeppar: version and date,
func: function called (infer.timedeppar),
date: date of call,
dot.args: arguments passed to the likelihood function (included for reproducibility of results),
task: task that was performed (start, restart or continue),
file: name of file to which output was written,
param.ini: initial values of likelihood parameters (constant and time-dependent),
param.ou.ini: initial values of Ornstein-Uhlenbeck process parameters that are estimated,
param.ou.fixed: values of Ornstein-Uhlenbeck process parameters that are not estimated,
loglikeli: function that was passed to calculate the log likelihood of the observations,
loglikeli.keepstate: boolean indicating whether or not the state from the previous run should be kept (this allows only partial time evaluation when only part of the input was replaced),
param.logprior: function that was passed to calculate the joint log prior of the constant likelihood parameters,
param.ou.logprior: function that was passed to calculate the joint log prior of the estimated Ornstein-Uhlenbeck process parameters (in case of multiple Ornstein-Uhlenbeck processes the function has to return the prior for the correct process; this can be identified by the names of the argument),
param.range: parameter ranges,
param.log: named logical vector of indicators for log inference,
control: named list of control parameters as passed to the call (or read from a previous call),
n.iter: number of iterations peformed (note that the size of the sample will be n.iter/control$thin),
sample.diag: list of samples of proposals, log acceptance ratios, and interval lengths of time-dependent parameters (only available if the control variable save.diag is set to TRUE),
sample.param.timedep: list of samples of time dependent parameters (first row contains time points),
sample.param.ou: sample of Ornstein-Uhlenbeck process parameters,
sample.param.const: sample of constant parameters,
sample.logpdf: sample of prior, Ornstein-Uhlenbeck and posterior pdf,
acceptfreq.constpar: acceptance frequency of constant parameters after adaptation phase,
acceptfreq.oupar: acceptance frequencies of Ornstein-Uhlenbeck process parameters after adaptation phase,
acceptfreq.timedeppar: acceptance frequencies of time-depenent parameters,
param.maxpost: parameters at the maximum posterior (constant and time-dependent parameters),
param.ou.maxpost: Ornstein-Uhlenbeck process parameters at the maximum posterior,
cov.prop.const: final covariance matrix used for proposal distribution of constant parameters,
cov.prop.ou: list of final covariance matrices used for proposal distribution of Ornstein-Uhlenbeck process paramemters,
scale.prop.const: final scale of proposal distribution of constant parameters,
scale.prop.ou: final scale of proposal distribution of Ornstein-Uhlenbeck process parameters,
sys.time: run time used for the previous inference job.

References

Reichert, P. timedeppar: An R package for inferring stochastic, time-dependent model parameters in preparation, 2020.

Reichert, P., Ammann, L. and Fenicia, F. Potential and challenges of investigating intrinsic uncertainty of hydrological models with time-dependent, stochastic parameters. Water Resources Research 57(8), e2020WR028311, 2021. doi:10.1029/2020WR028311

Reichert, P. and Mieleitner, J. Analyzing input and structural uncertainty of nonlinear dynamic models with stochastic, time-dependent parameters. Water Resources Research, 45, W10402, 2009. doi:10.1029/2009WR007814

Tomassini, L., Reichert, P., Kuensch, H.-R. Buser, C., Knutti, R. and Borsuk, M.E. A smoothing algorithm for estimating stochastic, continuous-time model parameters and its application to a simple climate model. Journal of the Royal Statistical Society: Series C (Applied Statistics) 58, 679-704, 2009. doi:10.1111/j.1467-9876.2009.00678.x

Examples

# Simple example for re-inferring parameters of an Ornstein-Uhlenbeck process
# with observational noise from synthetically generated data
# ---------------------------------------------------------------------------

# load package:
if ( !require("timedeppar") ) { install.packages("timedeppar"); library(timedeppar) }

# choose model parameters:
y_mean      <-  0
y_sd        <-  1
y_gamma     <- 10
obs_sd      <-  0.2

# choose control parameters of numerical algorithm:
n.iter      <-   100   # this is just to demonstrate how it works and is compatible
# with the computation time requirements for examples in CRAN
# n.iter      <- 50000   # please go for a sample size like this
#                        # for getting a reasonable sample
n.interval  <-    25   # increase if rejection frequency of stoch. par. too high
fract.adapt <- 0.4
n.adapt     <- floor(fract.adapt*n.iter)

# synthetically generate data:
set.seed(123)
data <- randOU(mean=y_mean,sd=y_sd,gamma=y_gamma,t=seq(from=0,to=2,length.out=101))
data$yobs <- data$y + rnorm(nrow(data),mean=0,sd=obs_sd)

# define observational likelihood:
loglikeli <- function(param,data)
{
 # get parameter y at time points of observations::
 y <- param$y
 if ( is.matrix(y) | is.data.frame(y) ) y <- approx(x=y[,1],y=y[,2],xout=data[,1])$y
 # calculate likelihood:
 loglikeli <- sum(dnorm(data[,"yobs"],mean=y,sd=param$obs_sd,log=TRUE))
 # return result:
 return(loglikeli)
}

# sample from the posterior of y, mu_y, sd_y and sd_obs assuming a uniform prior:
res <- infer.timedeppar(loglikeli      = loglikeli,
                       param.ini      = list(y=randOU(mean=y_mean,sd=y_sd,gamma=y_gamma,
                                                      t=seq(from=0,to=2,length.out=501)),
                                             obs_sd=obs_sd),
                       param.log      = c(y=FALSE,obs_sd=TRUE),
                       param.ou.ini   = c(y_mean=0,y_sd=1),
                       param.ou.fixed = c(y_gamma=10),
                       n.iter         = n.iter,
                       control        = list(n.interval = n.interval,
                                             n.adapt    = n.adapt),
                       data = data)

# plot results using pre-defined options:
# pdf(paste0("infer_OU_",n.iter,"_",n.adapt,"_original.pdf"),width=8,height=12)
plot(res,
    labels=expression(y       = y,
                      y_mean  = mu[y],
                      y_sd    = sigma[y],
                      y_gamma = gamma[y]))
# dev.off()

# plot time series and data:
# pdf(paste0("infer_OU_",n.iter,"_",n.adapt,"_comparison.pdf"),width=8,height=6)
t <- res$sample.param.timedep$y[1,]
sample <- res$sample.param.timedep$y[(1+n.adapt+1):nrow(res$sample.param.timedep$y),]
q <- apply(sample,2,quantile,probs=c(0.025,0.5,0.975))
plot(numeric(0),numeric(0),type="n",xaxs="i",yaxs="i",
    xlim=range(t),ylim=2.5*c(-1,1),xlab="t",ylab="y")
polygon(c(t,rev(t),t[1]),c(q[1,],rev(q[3,]),q[1,1]),
       col="grey80",border=NA)
lines(t,q[2,])
lines(data$t,data$y,col="red")
points(data$t,data$yobs,pch=19,cex=0.8)
legend("bottomright",
      legend=c("original process","noisy data","inferred median","inferred 95% range"),
      lwd=c(1,NA,1,5),lty=c("solid",NA,"solid","solid"),col=c("red","black","black","grey80"),
      pch=c(NA,19,NA,NA),cex=0.8)
# dev.off()

[Package timedeppar version 1.0.3 Index]