R: Random sampling from extreme value posterior distributions

rpost {revdbayes}

R Documentation

Random sampling from extreme value posterior distributions

Description

Uses the ru function in the rust package to simulate from the posterior distribution of an extreme value model.

Usage

rpost(
  n,
  model = c("gev", "gp", "bingp", "pp", "os"),
  data,
  prior,
  ...,
  nrep = NULL,
  thresh = NULL,
  noy = NULL,
  use_noy = TRUE,
  npy = NULL,
  ros = NULL,
  bin_prior = structure(list(prior = "bin_beta", ab = c(1/2, 1/2), class = "binprior")),
  bin_param = "logit",
  init_ests = NULL,
  mult = 2,
  use_phi_map = FALSE,
  weights = NULL
)

Arguments

`n`	A numeric scalar. The size of posterior sample required.
`model`	A character string. Specifies the extreme value model.
`data`	Sample data, of a format appropriate to the value of `model`. `"gp"`. A numeric vector of threshold excesses or raw data. `"bingp"`. A numeric vector of raw data. `"gev"`. A numeric vector of block maxima. `"pp"`. A numeric vector of raw data. `"os"`. A numeric matrix or data frame. Each row should contain the largest order statistics for a block of data. These need not be ordered: they are sorted inside `rpost`. If a block contains fewer than `dim(as.matrix(data))[2]` order statistics then the corresponding row should be padded by `NA`s. If `ros` is supplied then only the largest `ros` values in each row are used. If a vector is supplied then this is converted to a matrix with one column. This is equivalent to using `model = "gev"`.
`prior`	A list specifying the prior for the parameters of the extreme value model, created by `set_prior`.
`...`	Further arguments to be passed to `ru`. Most importantly `trans` and `rotate` (see Details), and perhaps `r`, `ep`, `a_algor`, `b_algor`, `a_method`, `b_method`, `a_control`, `b_control`. May also be used to pass the arguments `n_grid` and/or `ep_bc` to `find_lambda`.
`nrep`	A numeric scalar. If `nrep` is not `NULL` then `nrep` gives the number of replications of the original dataset simulated from the posterior predictive distribution. Each replication is based on one of the samples from the posterior distribution. Therefore, `nrep` must not be greater than `n`. In that event `nrep` is set equal to `n`. Currently only implemented if `model = "gev"` or `"gp"` or `"bingp"` or `"pp"`, i.e. not implemented if `model = "os"`.
`thresh`	A numeric scalar. Extreme value threshold applied to data. Only relevant when `model = "gp"`, `"pp"` or `"bingp"`. Must be supplied when `model = "pp"` or `"bingp"`. If `model = "gp"` and `thresh` is not supplied then `thresh = 0` is used and `data` should contain threshold excesses.
`noy`	A numeric scalar. The number of blocks of observations, excluding any missing values. A block is often a year. Only relevant, and must be supplied, if `model = "pp"`.
`use_noy`	A logical scalar. Only relevant if model is "pp". If `use_noy = FALSE` then sampling is based on a likelihood in which the number of blocks (years) is set equal to the number of threshold excesses, to reduce posterior dependence between the parameters (Wadsworth et al., 2010). The sampled values are transformed back to the required parameterisation before returning them to the user. If `use_noy = TRUE` then the user's value of `noy` is used in the likelihood.
`npy`	A numeric scalar. The mean number of observations per year of data, after excluding any missing values, i.e. the number of non-missing observations divided by total number of years' worth of non-missing data. The value of `npy` does not affect any calculation in `rpost`, it only affects subsequent extreme value inferences using `predict.evpost`. However, setting `npy` in the call to `rpost` avoids the need to supply `npy` when calling `predict.evpost`. This is likely to be useful only when `model = bingp`. See the documentation of `predict.evpost` for further details.
`ros`	A numeric scalar. Only relevant when `model = "os"`. The largest `ros` values in each row of the matrix `data` are used in the analysis.
`bin_prior`	A list specifying the prior for a binomial probability `p`, created by `set_bin_prior`. Only relevant if `model = "bingp"`. If this is not supplied then the Jeffreys beta(1/2, 1/2) prior is used.
`bin_param`	A character scalar. The argument `param` passed to `binpost`. Only relevant if a user-supplied prior function is set using `set_bin_prior`.
`init_ests`	A numeric vector. Initial parameter estimates for search for the mode of the posterior distribution.
`mult`	A numeric scalar. The grid of values used to choose the Box-Cox transformation parameter lambda is based on the maximum a posteriori (MAP) estimate +/- mult x estimated posterior standard deviation.
`use_phi_map`	A logical scalar. If trans = "BC" then `use_phi_map` determines whether the grid of values for phi used to set lambda is centred on the maximum a posterior (MAP) estimate of phi (`use_phi_map = TRUE`), or on the initial estimate of phi (`use_phi_map = FALSE`).
`weights`	An optional numeric vector of weights by which to multiply the observations when constructing the log-likelihood. Currently only implemented for `model = "gp"` or `model = "bingp"`. In the latter case `bin_prior$prior` must be `"bin_beta"`. `weights` must have the same length as `data`.

Details

Generalised Pareto (GP): model = "gp". A model for threshold excesses. Required arguments: n, data and prior. If thresh is supplied then only the values in data that exceed thresh are used and the GP distribution is fitted to the amounts by which those values exceed thresh. If thresh is not supplied then the GP distribution is fitted to all values in data, in effect thresh = 0. See also gp.

Binomial-GP: model = "bingp". The GP model for threshold excesses supplemented by a binomial(length(data), p) model for the number of threshold excesses. See Northrop et al. (2017) for details. Currently, the GP and binomial parameters are assumed to be independent a priori.

Generalised extreme value (GEV) model: model = "gev". A model for block maxima. Required arguments: n, data, prior. See also gev.

Point process (PP) model: model = "pp". A model for occurrences of threshold exceedances and threshold excesses. Required arguments: n, data, prior, thresh and noy.

r-largest order statistics (OS) model: model = "os". A model for the largest order statistics within blocks of data. Required arguments: n, data, prior. All the values in data are used unless ros is supplied.

Parameter transformation. The scalar logical arguments (to the function ru) trans and rotate determine, respectively, whether or not Box-Cox transformation is used to reduce asymmetry in the posterior distribution and rotation of parameter axes is used to reduce posterior parameter dependence. The default is trans = "none" and rotate = TRUE.

See the Introducing revdbayes vignette for further details and examples.

Value

An object (list) of class "evpost", which has the same structure as an object of class "ru" returned from ru. In addition this list contains

`model:`	The argument `model` to `rpost` detailed above.
`data:`	The content depends on `model`: if `model = "gev"` then this is the argument `data` to `rpost` detailed above, with missing values removed; if `model = "gp"` then only the values that lie above the threshold are included; if `model = "bingp"` or `model = "pp"` then the input data are returned but any value lying below the threshold is set to `thresh`; if `model = "os"` then the order statistics used are returned as a single vector.
`prior:`	The argument `prior` to `rpost` detailed above.

If nrep is not NULL then this list also contains data_rep, a numerical matrix with nrep rows. Each row contains a replication of the original data data simulated from the posterior predictive distribution. If model = "bingp" or "pp" then the rate of threshold exceedance is part of the inference. Therefore, the number of values in data_rep that lie above the threshold varies between predictive replications (different rows of data_rep). Values below the threshold are left-censored at the threshold, i.e. they are set at the threshold.

If model == "pp" then this list also contains the argument noy to rpost detailed above. If the argument npy was supplied then this list also contains npy.

If model == "gp" or model == "bingp" then this list also contains the argument thresh to rpost detailed above.

If model == "bingp" then this list also contains

`bin_sim_vals:`	An `n` by 1 numeric matrix of values simulated from the posterior for the binomial probability `p`
`bin_logf:`	A function returning the log-posterior for `p`.
`bin_logf_args:`	A list of arguments to `bin_logf`.

References

Coles, S. G. and Powell, E. A. (1996) Bayesian methods in extreme value modelling: a review and new developments. Int. Statist. Rev., 64, 119-136.

Northrop, P. J., Attalides, N. and Jonathan, P. (2017) Cross-validatory extreme value threshold selection and uncertainty with application to ocean storm severity. Journal of the Royal Statistical Society Series C: Applied Statistics, 66(1), 93-120. doi:10.1111/rssc.12159

Stephenson, A. (2016) Bayesian Inference for Extreme Value Modelling. In Extreme Value Modeling and Risk Analysis: Methods and Applications, edited by D. K. Dey and J. Yan, 257-80. London: Chapman and Hall. doi:10.1201/b19721 value posterior using the evdbayes package.

Wadsworth, J. L., Tawn, J. A. and Jonathan, P. (2010) Accounting for choice of measurement scale in extreme value modeling. The Annals of Applied Statistics, 4(3), 1558-1578. doi:10.1214/10-AOAS333

Examples


# GP model
u <- quantile(gom, probs = 0.65)
fp <- set_prior(prior = "flat", model = "gp", min_xi = -1)
gpg <- rpost(n = 1000, model = "gp", prior = fp, thresh = u, data = gom)
plot(gpg)

# Binomial-GP model
u <- quantile(gom, probs = 0.65)
fp <- set_prior(prior = "flat", model = "gp", min_xi = -1)
bp <- set_bin_prior(prior = "jeffreys")
bgpg <- rpost(n = 1000, model = "bingp", prior = fp, thresh = u, data = gom,
              bin_prior = bp)
plot(bgpg, pu_only = TRUE)
plot(bgpg, add_pu = TRUE)

# Setting the same binomial (Jeffreys) prior by hand
beta_prior_fn <- function(p, ab) {
  return(stats::dbeta(p, shape1 = ab[1], shape2 = ab[2], log = TRUE))
}
jeffreys <- set_bin_prior(beta_prior_fn, ab = c(1 / 2, 1 / 2))
bgpg <- rpost(n = 1000, model = "bingp", prior = fp, thresh = u, data = gom,
              bin_prior = jeffreys)
plot(bgpg, pu_only = TRUE)
plot(bgpg, add_pu = TRUE)

# GEV model
mat <- diag(c(10000, 10000, 100))
pn <- set_prior(prior = "norm", model = "gev", mean = c(0, 0, 0), cov = mat)
gevp  <- rpost(n = 1000, model = "gev", prior = pn, data = portpirie)
plot(gevp)

# GEV model, informative prior constructed on the probability scale
pip  <- set_prior(quant = c(85, 88, 95), alpha = c(4, 2.5, 2.25, 0.25),
                  model = "gev", prior = "prob")
ox_post <- rpost(n = 1000, model = "gev", prior = pip, data = oxford)
plot(ox_post)

# PP model
pf <- set_prior(prior = "flat", model = "gev", min_xi = -1)
ppr <- rpost(n = 1000, model = "pp", prior = pf, data = rainfall,
             thresh = 40, noy = 54)
plot(ppr)

# PP model, informative prior constructed on the quantile scale
piq <- set_prior(prob = 10^-(1:3), shape = c(38.9, 7.1, 47),
                 scale = c(1.5, 6.3, 2.6), model = "gev", prior = "quant")
rn_post <- rpost(n = 1000, model = "pp", prior = piq, data = rainfall,
                 thresh = 40, noy = 54)
plot(rn_post)

# OS model
mat <- diag(c(10000, 10000, 100))
pv <- set_prior(prior = "norm", model = "gev", mean = c(0, 0, 0), cov = mat)
osv <- rpost(n = 1000, model = "os", prior = pv, data = venice)
plot(osv)

[Package revdbayes version 1.5.4 Index]