R: Divergence based regression for compositional data

Divergence based regression for compositional data {Compositional}

R Documentation

Divergence based regression for compositional data

Description

Regression for compositional data based on the Kullback-Leibler divergence, the Jensen-Shannon divergence, the total variation divergence and the symmetric Kullback-Leibler divergence.

Usage

kl.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL, tol = 1e-07, maxiters = 50)
js.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)
tv.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)
symkl.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)

Arguments

`y`	A matrix with the compositional data (dependent variable). Zero values are allowed.
`x`	The predictor variable(s); they can be continuous, categorical, or both.
`con`	If this is TRUE (default) then the constant term is estimated, otherwise the model includes no constant term.
`B`	If B is greater than 1, bootstrap estimates of the standard errors are returned. If B = 1, no standard errors are returned.
`ncores`	If ncores is 2 or more, the bootstrap is run in parallel. This argument is ignored when B = 1.
`xnew`	New predictor values for which fitted values are returned; if you have no new data, leave it NULL.
`tol`	The tolerance value used to terminate the Newton-Raphson procedure (kl.compreg only).
`maxiters`	The maximum number of Newton-Raphson iterations (kl.compreg only).

Details

kl.compreg() uses the Kullback-Leibler divergence as the objective function. If convergence is problematic, the multinom() function from the nnet package is employed instead, which is slower. js.compreg() uses the Jensen-Shannon divergence, symkl.compreg() uses the symmetric Kullback-Leibler divergence and tv.compreg() uses the total variation divergence. There is no actual log-likelihood for these last three regression models.
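The four divergences named above can be illustrated for a single pair of compositions with a few lines of base R. This is a minimal sketch: the helper names (kl_div, symkl_div, js_div, tv_div) are hypothetical and not part of the package, and the scaling constants follow common textbook definitions, which may differ from the ones used internally by the regression functions.

```r
# Hypothetical helpers, not functions of the Compositional package.
# p and q are compositions: non-negative vectors that each sum to 1.

# Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0.
kl_div <- function(p, q) {
  keep <- p > 0
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Symmetric Kullback-Leibler divergence (assumed here to be the plain sum).
# It is infinite when one vector has a zero where the other does not.
symkl_div <- function(p, q) kl_div(p, q) + kl_div(q, p)

# Jensen-Shannon divergence: average KL divergence to the midpoint m.
js_div <- function(p, q) {
  m <- (p + q) / 2
  0.5 * kl_div(p, m) + 0.5 * kl_div(q, m)
}

# Total variation divergence (assumed here to carry the factor 1/2).
tv_div <- function(p, q) 0.5 * sum(abs(p - q))

p <- c(0.5, 0.3, 0.2)
q <- c(0.4, 0.4, 0.2)
c(KL = kl_div(p, q), symKL = symkl_div(p, q),
  JS = js_div(p, q), TV = tv_div(p, q))
```

With the convention that zero terms contribute nothing, kl_div(p, q) stays finite when p contains zeros, whereas the symmetric version does not.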

Value

A list including:

`runtime`	The time required by the regression.
`iters`	The number of iterations required by the Newton-Raphson in the kl.compreg function.
`loglik`	The log-likelihood. This is actually a quasi-multinomial regression, so the value is basically half the negative deviance, or `-\sum_{i=1}^n y_i \log(y_i / \hat{y}_i)`.
`be`	The beta coefficients.
`covbe`	The covariance matrix of the beta coefficients, if bootstrap is chosen, i.e. if B > 1.
`est`	The fitted values of xnew if xnew is not NULL.
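To show how the coefficients, the fitted values and the log-likelihood relate to each other, here is a minimal base-R sketch of a Kullback-Leibler regression on simulated data. It is an illustration under stated assumptions, not the package's implementation: it uses optim() with BFGS rather than Newton-Raphson, it assumes a multinomial logit link with the first component as reference, and the names softmax_ref, kl_obj and B_true are hypothetical.

```r
# Sketch only: softmax_ref, kl_obj and B_true are hypothetical names.
set.seed(1)
n <- 100                                  # number of observations
D <- 3                                    # number of components
x <- rnorm(n)
X <- cbind(1, x)                          # design matrix with a constant term

# Inverse multinomial logit link, first component taken as the reference.
softmax_ref <- function(X, B) {
  e <- exp(cbind(0, X %*% B))
  e / rowSums(e)
}

# Simulate strictly positive compositions scattered around a known mean.
B_true <- matrix(c(0.5, -1, -0.3, 0.8), nrow = 2)
y <- softmax_ref(X, B_true) * exp(matrix(rnorm(n * D, sd = 0.2), n, D))
y <- y / rowSums(y)

# Minimising KL(y || mu) over the coefficients is the same as minimising
# -sum(y * log(mu)), because sum(y * log(y)) does not depend on them.
kl_obj <- function(b) {
  mu <- softmax_ref(X, matrix(b, nrow = ncol(X)))
  -sum(y * log(mu))
}

fit <- optim(rep(0, ncol(X) * (D - 1)), kl_obj, method = "BFGS")

be <- matrix(fit$par, nrow = ncol(X))     # one column per non-reference component
est <- softmax_ref(X, be)                 # fitted compositions, rows sum to 1
loglik <- -sum(y * log(y / est))          # quasi log-likelihood, never positive
```

Because the Kullback-Leibler divergence is non-negative, loglik computed this way is at most zero, and it equals zero only when the fitted values reproduce the data exactly.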

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris <mtsagris@uoc.gr> and Giorgos Athineou <gioathineou@gmail.com>.

References

Murteira, J. M. R. and Ramalho, J. J. S. (2016). Regression analysis of multivariate fractional data. Econometric Reviews 35(4), 515-552.

Tsagris, M. (2015). A novel, divergence based, regression for compositional data. Proceedings of the 28th Panhellenic Statistics Conference, 15-18/4/2015, Athens, Greece. https://arxiv.org/pdf/1511.07600.pdf

Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions. IEEE Transactions on Information Theory 49, 1858-1860.

Osterreicher, F. and Vajda, I. (2003). A new class of metric divergences on probability spaces and its applicability in statistics. Annals of the Institute of Statistical Mathematics 55, 639-653.

Examples

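library(MASS)                    # provides the fgl (forensic glass) data
library(Compositional)
x <- as.vector(fgl[, 1])         # refractive index, used as the predictor
y <- as.matrix(fgl[, 2:9])       # the eight oxide measurements
y <- y / rowSums(y)              # rescale each row to sum to 1
mod1 <- kl.compreg(y, x, B = 1, ncores = 1)   # Kullback-Leibler regression
mod2 <- js.compreg(y, x, B = 1, ncores = 1)   # Jensen-Shannon regression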

[Package Compositional version 6.9 Index]