R: Estimating potential regeneration densities by quantile...

quax {quaxnat}

R Documentation

Estimating potential regeneration densities by quantile regression

Description

quax estimates parameters of a spatial dispersal kernel that describes the regeneration potential as the \tau th quantile of the regeneration density. Here \tau is between 0 and 1, with typical values close to 1 representing the situation that the full regeneration potential is realized only at a small fraction of all sites.

Usage

quax(...)

## Default S3 method:
quax(
  ...,
  y,
  tau,
  fun = k_lognormal,
  weights = 1,
  dim = 2,
  par = c(log.a = 8, log.b = 1)
)

## S3 method for class 'formula'
quax(
  formula,
  data,
  tau,
  fun = k_lognormal,
  subset,
  weights,
  na.action,
  offset,
  ...
)

Arguments

`...`	Vector of positions `x_{1},...,x_{n}` or distances to the seed source as required by the specific dispersal kernel. Optionally, further arguments passed to `optim`, to the default method or to the kernel.
`y`	Vector of observed values `y_{1},...,y_{n}` of the regeneration density of the inventory plot.
`tau`	Numeric value between 0 and 1. Specifies the quantile `\tau` used in the regression.
`fun`	Function representing the dispersal kernel `k_{\theta }`, multiplied by `N`, that is assumed for the regeneration potential. Values allowed are `k_lognormal`, `k_t`, `k_power`, `k_weibull`, `k_exponential_power` or a custom function with nonnegative values whose parameters include, in addition to the arguments in `...` not consumed by `optim` or the default method, the scaling factor `N` and the spatial dimension `d` (see Examples). The default, `k_lognormal`, is to fit a model with log-normal distance distributions.
`weights`	Numeric vector of optional non-negative weights `w_i` of the observations in the estimation procedure. Default is 1.
`dim`	The spatial dimension, by default equal to 2.
`par`	Numeric vector of initial values for the parameter vector `\theta` .
`formula`	A formula of the form `y ~ x`.
`data`, `subset`, `na.action`, `offset`	For the formula interface: Further arguments passed to `model.frame` (along with `weights`).

Details

The function estimates the parameters N and \theta of the regeneration potential Nk_{\theta } by minimizing

\displaystyle \sum _{i=1}^{n}w_{i}\rho _{\tau }(y_{i}-Nk_{\theta } (x_{i})),

where \rho _{\tau }(u)=\int _{0}^{u}\tau -\mathbf{1}_{s<0}\;ds=\bigl\{ \begin{smallmatrix}u\tau & \text{if }u\geq 0\\u(\tau -1)&\text{if }u<0 \end{smallmatrix} (Koenker and Bassett 1978, Chapter 6.6 in Koenker 2005). The preceding line, after subtracting the same expression for N=0 and substituting s=y_{i}-tk_{\theta }(x_{i}) in the integral, becomes \int _{0}^{N}\sum _{i=1}^{n}w_{i}k_{\theta }(x_{i})( \mathbf{1}_{y_{i}<tk_{\theta }(x_{i})}-\tau )\;dt, and any N such that the last integrand is \leq 0 for t<N and \geq 0 for t>N, which can always be found as the integrand is increasing in t, minimizes this integral. The integrand being the difference of the sum of w_{i}k_{\theta }(x_{i}) over the i with y_{i}<tk_{\theta }(x_{i}) and \tau times the sum over all i, with relevant terms for nonzero k_{\theta }(x_{i}), this means that the estimate of N for a given vector \theta can be computed as a \tau th quantile. This is implemented as an inner, nested minimization, the result of which is minimized in \theta using optim.

This is a rather naive approach to quantile regression that appears to work reasonably well for scaled dispersal kernels Nk_{\theta } as considered here, see Appendix A in Axer et al. (2021). For general quantile regression problems the more sophisticated procedure nlrq in the package quantreg, based on Koenker and Park (1996), is expected to provide better results.

In particular, quax is subject to the usual numerical issues inherent in optimization: It can get stuck in a local minimum or altogether miss a minimum if the initial values (as specified by the argument par) are too far off or if the objective function exhibits bad behavior. Problems can further arise in the dispersal kernels if parameter values passed on a log scale become too large. It is therefore recommended to visually check the results (see Examples). Also, the optim arguments method and control can be added in ... to select and tune the optimization algorithm, but note that the objective function is usually not differentiable.

See Koenker (2005) for a detailed exposition of quantile regression.

Value

An objcet of class quax containing the estimated function, including an attribute o containing the results of optim. Generic functions with methods defined for quax objects invoke these methods; see summary.quax for an example.

References

Koenker, R., Bassett, G. (1978). Regression quantiles. Econometrica 46(1), 33–50. doi:10.2307/1913643

Axer, M., Schlicht, R., Wagner, S. (2021). Modelling potential density of natural regeneration of European oak species (Quercus robur L., Quercus petraea (Matt.) Liebl.) depending on the distance to the potential seed source: Methodological approach for modelling dispersal from inventory data at forest enterprise level. Forest Ecology and Management 482, 118802. doi:10.1016/j.foreco.2020.118802

Koenker, R., Park, B.J. (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics 71(1–2), 265–283. doi:10.1016/0304-4076(96)84507-6

Koenker, R. (2005). Quantile regression. Cambridge University Press. doi:10.1017/CBO9780511754098

Examples

## Prepare artificial data:
set.seed(0)
r <- rgamma(200, shape=2, scale=150)
simulated.data <- data.frame(distance = r, density =
  rpois(length(r), k_lognormal(r, par=c(6,0), N=1000000, d=2)))
plot(density ~ distance, simulated.data)

## Run quax function:
f1 <- quax(x = simulated.data$distance, y = simulated.data$density,
  tau = 0.9, fun = k_lognormal)
summary(f1)
curve(f1(x), add=TRUE)

## Do the same using formula interface:
f1 <- quax(density ~ distance, simulated.data,
  tau = 0.9, fun = k_lognormal)
summary(f1)
#quantreg::nlrq(density ~ k_lognormal(distance,c(log.a,log.b),N=N,d=2),
#  simulated.data, start = c(log.a=6,log.b=0,N=1e6), tau = 0.9)  # similar

## Use another quantile:
f2 <- quax(density ~ distance, simulated.data,
  tau = 0.99, fun = k_lognormal)
summary(f2)
curve(f2(x), add=TRUE, lwd=0)

## Show effect of weights:
f3 <- quax(density ~ distance, simulated.data,
  tau = 0.9, fun = k_lognormal, weights = distance)
summary(f3)
curve(f3(x), add=TRUE, lty=3)

## Compare various dispersal models:
fun <- c("k_lognormal","k_t","k_weibull","k_power","k_exponential_power")
for (i in seq_along(fun))
  curve(quax(density ~ distance, simulated.data,
    tau = 0.9, fun = get(fun[i]), weights = distance)(x),
    add=TRUE, col=i, lty=3)
legend("topright", fun, col=seq_along(fun), lty=3)

## Use positions in computation:
simulated.data$position <- r *
  (\(a) cbind(cos(a),sin(a))) (rnorm(length(r)))
f3 <- quax(density ~ position, simulated.data,
  tau = 0.9, fun = k_lognormal, weights = distance)
summary(f3)

## Show problems with bad initial values and try another parameterization:
curve(quax(density ~ distance, simulated.data, par = c(log.a=0,log.b=0),
  tau = 0.99, fun = k_lognormal)(x), add=TRUE, lty=2)
curve(quax(density ~ distance, simulated.data, par = c(a=1,b=1),
  tau = 0.99, fun = function(x,par,N,d) if (any(par<=0)) rep(NA,NROW(x))
    else k_lognormal(x,log(par),N,d))(x), add=TRUE, lty=2)

## Use custom variant of lognormal model that includes a shift:
plot(simulated.data$position)
f4 <- quax(density ~ position, simulated.data,
  tau = 0.9, par = c(8, 1, 0, 0),
  fun = function(x, par, N, d)
    k_lognormal(x - rep(par[-(1:2)],each=NROW(x)), par[1:2], N, d)
)
summary(f4)

[Package quaxnat version 1.0.0 Index]