R: Selecting the Box-Cox parameter for general d using Rcpp

find_lambda_rcpp {rust}

R Documentation

Selecting the Box-Cox parameter for general d using Rcpp

Description

Finds a value of the Box-Cox transformation parameter lambda for which the (positive) random variable with log-density \log f has a density closer to that of a Gaussian random variable. In the following we use theta (\theta) to denote the argument of logf on the original scale and phi (\phi) on the Box-Cox transformed scale.

Usage

find_lambda_rcpp(
  logf,
  ...,
  d = 1,
  n_grid = NULL,
  ep_bc = 1e-04,
  min_phi = rep(ep_bc, d),
  max_phi = rep(10, d),
  which_lam = 1:d,
  lambda_range = c(-3, 3),
  init_lambda = NULL,
  phi_to_theta = NULL,
  log_j = NULL,
  user_args = list()
)

Arguments

`logf`	A pointer to a compiled C++ function returning the log of the target density `f`.
`...`	further arguments to be passed to `logf` and related functions.
`d`	A numeric scalar. Dimension of `f`.
`n_grid`	A numeric scalar. Number of ordinates for each variable in `phi`. If this is not supplied a default value of `ceiling(2501 ^ (1 / d))` is used.
`ep_bc`	A (positive) numeric scalar. Smallest possible value of `phi` to consider. Used to avoid negative values of `phi`.
`min_phi`, `max_phi`	Numeric vectors. Smallest and largest values of `phi` at which to evaluate `logf`, i.e., the range of values of `phi` over which to evaluate `logf`. Any components in `min_phi` that are not positive are set to `ep_bc`.
`which_lam`	A numeric vector. Contains the indices of the components of `phi` that ARE to be Box-Cox transformed.
`lambda_range`	A numeric vector of length 2. Range of lambda over which to optimise.
`init_lambda`	A numeric vector of length 1 or `d`. Initial value of lambda used in the search for the best lambda. If `init_lambda` is a scalar then `rep(init_lambda, d)` is used.
`phi_to_theta`	A pointer to a compiled C++ function returning (the inverse) of the transformation from `theta` to `phi` used to ensure positivity of `phi` prior to Box-Cox transformation. The argument is `phi` and the returned value is `theta`. If `phi_to_theta` is undefined at the input value then the function should return `NA`.
`log_j`	A pointer to a compiled C++ function returning the log of the Jacobian of the transformation from `theta` to `phi`, i.e., based on derivatives of `phi` with respect to `theta`. Takes `theta` as its argument.
`user_args`	A list of numeric components providing arguments to the user-supplied functions `phi_to_theta` and `log_j`.

Details

The general idea is to evaluate the density f on a d-dimensional grid, with n_grid ordinates for each of the d variables. We treat each combination of the variables in the grid as a data point and perform an estimation of the Box-Cox transformation parameter lambda, in which each data point is weighted by the density at that point. The vectors min_phi and max_phi define the limits of the grid and which_lam can be used to specify that only certain components of phi are to be transformed.

Value

A list containing the following components

`lambda`	A numeric vector. The value of `lambda`.
`gm`	A numeric vector. Box-cox scaling parameter, estimated by the geometric mean of the values of phi used in the optimisation to find the value of lambda, weighted by the values of f evaluated at phi.
`init_psi`	A numeric vector. An initial estimate of the mode of the Box-Cox transformed density
`sd_psi`	A numeric vector. Estimates of the marginal standard deviations of the Box-Cox transformed variables.
`phi_to_theta`	as detailed above (only if `phi_to_theta` is supplied)
`log_j`	as detailed above (only if `log_j` is supplied)
`user_args`	as detailed above (only if `user_args` is supplied)

References

Box, G. and Cox, D. R. (1964) An Analysis of Transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2), 211-252.

Andrews, D. F. and Gnanadesikan, R. and Warner, J. L. (1971) Transformations of Multivariate Data, Biometrics, 27(4).

Eddelbuettel, D. and Francois, R. (2011). Rcpp: Seamless R and C++ Integration. Journal of Statistical Software, 40(8), 1-18. doi:10.18637/jss.v040.i08

Eddelbuettel, D. (2013). Seamless R and C++ Integration with Rcpp, Springer, New York. ISBN 978-1-4614-6867-7.

Examples


# Log-normal density ===================
# Note: the default value max_phi = 10 is OK here but this will not always
# be the case
ptr_lnorm <- create_xptr("logdlnorm")
mu <- 0
sigma <- 1
lambda <- find_lambda_rcpp(logf = ptr_lnorm, mu = mu, sigma = sigma)
lambda
x <- ru_rcpp(logf = ptr_lnorm, mu = mu, sigma = sigma, d = 1, n = 1000,
             trans = "BC", lambda = lambda)

# Gamma density ===================
alpha <- 1
#  Choose a sensible value of max_phi
max_phi <- qgamma(0.999, shape = alpha)
# [Of course, typically the quantile function won't be available.  However,
# In practice the value of lambda chosen is quite insensitive to the choice
# of max_phi, provided that max_phi is not far too large or far too small.]

ptr_gam <- create_xptr("logdgamma")
lambda <- find_lambda_rcpp(logf = ptr_gam, alpha = alpha, max_phi = max_phi)
lambda
x <- ru_rcpp(logf = ptr_gam, alpha = alpha, d = 1, n = 1000, trans = "BC",
             lambda = lambda)


# Generalized Pareto posterior distribution ===================

n <- 1000
# Sample data from a GP(sigma, xi) distribution
gpd_data <- rgpd(m = 100, xi = -0.5, sigma = 1)
# Calculate summary statistics for use in the log-likelihood
ss <- gpd_sum_stats(gpd_data)
# Calculate an initial estimate
init <- c(mean(gpd_data), 0)

n <- 1000
# Sample on original scale, with no rotation ----------------
ptr_gp <- create_xptr("loggp")
for_ru_rcpp <- c(list(logf = ptr_gp, init = init, d = 2, n = n,
                     lower = c(0, -Inf)), ss, rotate = FALSE)
x1 <- do.call(ru_rcpp, for_ru_rcpp)
plot(x1, xlab = "sigma", ylab = "xi")
# Parameter constraint line xi > -sigma/max(data)
# [This may not appear if the sample is far from the constraint.]
abline(a = 0, b = -1 / ss$xm)
summary(x1)

# Sample on original scale, with rotation ----------------
for_ru_rcpp <- c(list(logf = ptr_gp, init = init, d = 2, n = n,
                      lower = c(0, -Inf)), ss)
x2 <- do.call(ru_rcpp, for_ru_rcpp)
plot(x2, xlab = "sigma", ylab = "xi")
abline(a = 0, b = -1 / ss$xm)
summary(x2)

# Sample on Box-Cox transformed scale ----------------

# Find initial estimates for phi = (phi1, phi2),
# where phi1 = sigma
#   and phi2 = xi + sigma / max(x),
# and ranges of phi1 and phi2 over over which to evaluate
# the posterior to find a suitable value of lambda.
temp <- do.call(gpd_init, ss)
min_phi <- pmax(0, temp$init_phi - 2 * temp$se_phi)
max_phi <- pmax(0, temp$init_phi + 2 * temp$se_phi)

# Set phi_to_theta() that ensures positivity of phi
# We use phi1 = sigma and phi2 = xi + sigma / max(data)

# Create an external pointer to this C++ function
ptr_phi_to_theta_gp <- create_phi_to_theta_xptr("gp")
# Note: log_j is set to zero by default inside find_lambda_rcpp()
lambda <- find_lambda_rcpp(logf = ptr_gp, ss = ss, d = 2, min_phi = min_phi,
                           max_phi = max_phi, user_args = list(xm = ss$xm),
                           phi_to_theta = ptr_phi_to_theta_gp)
lambda

# Sample on Box-Cox transformed, without rotation
x3 <- ru_rcpp(logf = ptr_gp, ss = ss, d = 2, n = n, trans = "BC",
              lambda = lambda, rotate = FALSE)
plot(x3, xlab = "sigma", ylab = "xi")
abline(a = 0, b = -1 / ss$xm)
summary(x3)

# Sample on Box-Cox transformed, with rotation
x4 <- ru_rcpp(logf = ptr_gp, ss = ss, d = 2, n = n, trans = "BC",
              lambda = lambda)
plot(x4, xlab = "sigma", ylab = "xi")
abline(a = 0, b = -1 / ss$xm)
summary(x4)

def_par <- graphics::par(no.readonly = TRUE)
par(mfrow = c(2,2), mar = c(4, 4, 1.5, 1))
plot(x1, xlab = "sigma", ylab = "xi", ru_scale = TRUE,
  main = "mode relocation")
plot(x2, xlab = "sigma", ylab = "xi", ru_scale = TRUE,
  main = "mode relocation and rotation")
plot(x3, xlab = "sigma", ylab = "xi", ru_scale = TRUE,
  main = "Box-Cox and mode relocation")
plot(x4, xlab = "sigma", ylab = "xi", ru_scale = TRUE,
  main = "Box-Cox, mode relocation and rotation")
graphics::par(def_par)

[Package rust version 1.4.2 Index]