R: Log-Laplace and Logit-Laplace Distribution Family Functions

loglaplace {VGAM}

R Documentation

Log-Laplace and Logit-Laplace Distribution Family Functions

Description

Maximum likelihood estimation of the 1-parameter log-Laplace and the 1-parameter logit-Laplace distributions. These may be used for quantile regression for counts and proportions respectively.

Usage

loglaplace1(tau = NULL, llocation = "loglink",
    ilocation = NULL, kappa = sqrt(tau/(1 - tau)), Scale.arg = 1,
    ishrinkage = 0.95, parallel.locat = FALSE, digt = 4,
    idf.mu = 3, rep0 = 0.5, minquantile = 0, maxquantile = Inf,
    imethod = 1, zero = NULL)
logitlaplace1(tau = NULL, llocation = "logitlink",
    ilocation = NULL, kappa = sqrt(tau/(1 - tau)),
    Scale.arg = 1, ishrinkage = 0.95, parallel.locat = FALSE,
    digt = 4, idf.mu = 3, rep01 = 0.5, imethod = 1, zero = NULL)

Arguments

`tau`, `kappa`	See `alaplace1`.
`llocation`	Character. Parameter link functions for location parameter `\xi`. See `Links` for more choices. However, this argument should be left unchanged with count data because it restricts the quantiles to be positive. With proportions data `llocation` can be assigned a link such as `logitlink`, `probitlink`, `clogloglink`, etc.
`ilocation`	Optional initial values. If given, it must be numeric and values are recycled to the appropriate length. The default is to choose the value internally.
`parallel.locat`	Logical. Should the quantiles be parallel on the transformed scale (argument `llocation`)? Assigning this argument to `TRUE` circumvents the seriously embarrassing quantile crossing problem.
`imethod`	Initialization method. Either the value 1, 2, or ....
`idf.mu`, `ishrinkage`, `Scale.arg`, `digt`, `zero`	See `alaplace1`. See `CommonVGAMffArguments` for information.
`rep0`, `rep01`	Numeric, positive. Replacement values for 0s and 1s respectively. For count data, values of the response whose value is 0 are replaced by `rep0`; it avoids computing `log(0)`. For proportions data values of the response whose value is 0 or 1 are replaced by `min(rangey01[1]/2, rep01/w[y< = 0])` and `max((1 + rangey01[2])/2, 1-rep01/w[y >= 1])` respectively; e.g., it avoids computing `logitlink(0)` or `logitlink(1)`. Here, `rangey01` is the 2-vector `range(y[(y > 0) & (y < 1)])` of the response.
`minquantile`, `maxquantile`	Numeric. The minimum and maximum values possible in the quantiles. These argument are effectively ignored by default since `loglink` keeps all quantiles positive. However, if `llocation = logofflink(offset = 1)` then it is possible that the fitted quantiles have value 0 because `minquantile = 0`.

Details

These VGAM family functions implement translations of the asymmetric Laplace distribution (ALD). The resulting variants may be suitable for quantile regression for count data or sample proportions. For example, a log link applied to count data is assumed to follow an ALD. Another example is a logit link applied to proportions data so as to follow an ALD. A positive random variable Y is said to have a log-Laplace distribution if Y = e^W where W has an ALD. There are many variants of ALDs and the one used here is described in alaplace1.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

In the extra slot of the fitted object are some list components which are useful. For example, the sample proportion of values which are less than the fitted quantile curves, which is sum(wprior[y <= location]) / sum(wprior) internally. Here, wprior are the prior weights (called ssize below), y is the response and location is a fitted quantile curve. This definition comes about naturally from the transformed ALD data.

Warning

The VGAM family function logitlaplace1 will not handle a vector of just 0s and 1s as the response; it will only work satisfactorily if the number of trials is large.

See alaplace1 for other warnings. Care is needed with tau values which are too small, e.g., for count data the sample proportion of zeros must be less than all values in tau. Similarly, this also holds with logitlaplace1, which also requires all tau values to be less than the sample proportion of ones.

Note

The form of input for logitlaplace1 as response is a vector of proportions (values in [0,1]) and the number of trials is entered into the weights argument of vglm/vgam. See Example 2 below. See alaplace1 for other notes in general.

Author(s)

Thomas W. Yee

References

Kotz, S., Kozubowski, T. J. and Podgorski, K. (2001). The Laplace distribution and generalizations: a revisit with applications to communications, economics, engineering, and finance, Boston: Birkhauser.

Kozubowski, T. J. and Podgorski, K. (2003). Log-Laplace distributions. International Mathematical Journal, 3, 467–495.

Yee, T. W. (2020). Quantile regression for counts and proportions. In preparation.

Examples

# Example 1: quantile regression of counts with regression splines
set.seed(123); my.k <- exp(0)
adata <- data.frame(x2 = sort(runif(n <- 500)))
mymu <- function(x) exp( 1 + 3*sin(2*x) / (x+0.5)^2)
adata <- transform(adata, y = rnbinom(n, mu = mymu(x2), my.k))
mytau <- c(0.1, 0.25, 0.5, 0.75, 0.9); mydof = 3
# halfstepping is usual:
fitp <- vglm(y ~ sm.bs(x2, df = mydof), data = adata, trace = TRUE,
            loglaplace1(tau = mytau, parallel.locat = TRUE))

## Not run:  par(las = 1)  # Plot on a log1p() scale
mylwd <- 1.5

plot(jitter(log1p(y), factor = 1.5) ~ x2, adata, col = "red",
     pch = "o", cex = 0.75,
     main = "Example 1; green=truth, blue=estimated")
with(adata, matlines(x2, log1p(fitted(fitp)), col = "blue",
                     lty = 1, lwd = mylwd))
finexgrid <- seq(0, 1, len = 201)
for (ii in 1:length(mytau))
  lines(finexgrid, col = "green", lwd = mylwd,
        log1p(qnbinom(mytau[ii], mu = mymu(finexgrid), my.k)))

## End(Not run)
fitp@extra  # Contains useful information


# Example 2: sample proportions
set.seed(123); nnn <- 1000; ssize <- 100  # ssize = 1 wont work!
adata <- data.frame(x2 = sort(runif(nnn)))
mymu <- function(x) logitlink( 1.0 + 4*x, inv = TRUE)
adata <- transform(adata, ssize = ssize,
                   y2 = rbinom(nnn, ssize, prob = mymu(x2)) / ssize)

mytau <- c(0.25, 0.50, 0.75)
fit1 <- vglm(y2 ~ sm.bs(x2, df = 3),
        logitlaplace1(tau = mytau, lloc = "clogloglink", paral = TRUE),
        data = adata, weights = ssize, trace = TRUE)

## Not run: 
# Check the solution.  Note: this is like comparing apples with oranges.
plotvgam(fit1, se = TRUE, scol = "red", lcol = "blue",
         main = "Truth = 'green'")
# Centered approximately !
linkFunctionChar <- as.character(fit1@misc$link)
adata <- transform(adata, trueFunction =
           theta2eta(theta = mymu(x2), link = linkFunctionChar))
with(adata, lines(x2, trueFunction - mean(trueFunction), col = "green"))

# Plot the data + fitted quantiles (on the original scale)
myylim <- with(adata, range(y2))
plot(y2 ~ x2, adata, col = "blue", ylim = myylim, las = 1,
     pch = ".", cex = 2.5)
with(adata, matplot(x2, fitted(fit1), add = TRUE, lwd = 3, type = "l"))
truecol <- rep(1:3, len = fit1@misc$M)  # Add the 'truth'
smallxgrid <- seq(0, 1, len = 501)
for (ii in 1:length(mytau))
  lines(smallxgrid, col = truecol[ii], lwd = 2,
        qbinom(mytau[ii], pr = mymu(smallxgrid), si = ssize) / ssize)

# Plot on the eta (== logitlink()/probitlink()/...) scale
  with(adata, matplot(x2, predict(fit1), lwd = 3, type = "l"))
# Add the 'truth'
for (ii in 1:length(mytau)) {
  true.quant <- qbinom(mytau[ii], prob = mymu(smallxgrid),
                       size = ssize) / ssize
  lines(smallxgrid, theta2eta(true.quant, link = linkFunctionChar),
        col = truecol[ii], lwd = 2)
} 
## End(Not run)

[Package VGAM version 1.1-11 Index]