Regression with compositional data using the alpha-transformation {Compositional}R Documentation

Regression with compositional data using the \alpha-transformation

Description

Regression with compositional data using the \alpha-transformation.

Usage

alfa.reg(y, x, a, xnew = NULL, yb = NULL)
alfa.reg2(y, x, a, xnew = NULL)
alfa.reg3(y, x, a = c(-1, 1), xnew = NULL)

Arguments

y

A matrix with the compositional data.

x

A matrix with the continuous predictor variables or a data frame including categorical predictor variables.

a

The value of the power transformation; it has to be between -1 and 1. If zero values are present it has to be greater than 0. If \alpha=0 the isometric log-ratio transformation is applied and the solution exists in closed form, since it is the classical multivariate regression. For alfa.reg2() this should be a vector of \alpha values, and the function calls alfa.reg() repeatedly. For alfa.reg3() it should be a vector with two values, the endpoints of the interval of \alpha. This function uses the optimize function to search for the optimal value of \alpha that minimizes the sum of squares of the errors. This is an alternative to choosing the value of \alpha with alfareg.tune, which uses cross-validation.

xnew

If you have new data use it, otherwise leave it NULL.

yb

If you have already transformed the data using the \alpha-transformation with the same \alpha as given in the argument "a", put it here. Otherwise leave it NULL.

This is intended to be used in the function alfareg.tune in order to speed up the process. The time difference in that function is small for small samples, but with a few thousand observations and/or a few more components the differences will be bigger.

Details

The \alpha-transformation is applied to the compositional data first and then multivariate regression is applied. This involves numerical optimisation. The alfa.reg2() function accepts a vector with many values of \alpha, while the alfa.reg3() function searches for the value of \alpha that minimizes the Kullback-Leibler divergence between the observed and the fitted compositional values. The functions are highly optimized.
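To make the first step more concrete, the following base-R sketch computes the power-transformed part of the \alpha-transformation, following the definition in Tsagris et al. (2011) as understood here. The helper alpha_power() is hypothetical and is not a function of the package. The full transformation additionally multiplies by a Helmert sub-matrix to reduce the dimension by one; that step is omitted, so the result is an illustration rather than the package's exact output.

```r
# Illustrative sketch only: alpha_power() is a hypothetical helper,
# not part of the Compositional package.
alpha_power <- function(y, a) {
  y <- as.matrix(y)
  D <- ncol(y)                 # number of components
  if (a == 0) {
    ly <- log(y)
    return(ly - rowMeans(ly))  # centred log-ratio, the limit as a -> 0
  }
  ya <- y^a
  u <- ya / rowSums(ya)        # power-transformed composition, rows sum to 1
  (D * u - 1) / a              # centre and scale by alpha
}

# Two toy compositions (made-up values, each row sums to 1)
y <- rbind(c(0.2, 0.3, 0.5),
           c(0.1, 0.6, 0.3))

z  <- alpha_power(y, 0.5)      # a typical value of alpha
z0 <- alpha_power(y, 0)        # log-ratio limit
zs <- alpha_power(y, 1e-6)     # very small alpha, close to the limit

max(abs(rowSums(z)))           # rows sum to zero up to rounding
max(abs(zs - z0))              # small, since zs approaches z0 as alpha -> 0
```

The near agreement between a very small \alpha and the log-ratio case is what motivates treating \alpha=0 as the log-ratio regression with a closed-form solution.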

Value

For the alfa.reg() function a list including:

runtime

The time required by the regression.

be

The beta coefficients.

seb

The standard error of the beta coefficients.

est

The fitted values for xnew if xnew is not NULL.

For the alfa.reg2() function a list with as many sublists as the number of values of \alpha. Each element (sublist) of the list contains the above outcomes of the alfa.reg() function.

For the alfa.reg3() function a list with all previous elements plus an output "alfa", the optimal value of \alpha.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Tsagris M. (2015). Regression analysis with compositional data containing zero values. Chilean Journal of Statistics, 6(2): 47-57. https://arxiv.org/pdf/1508.01913v1.pdf

Tsagris M.T., Preston S. and Wood A.T.A. (2011). A data-based power transformation for compositional data. In Proceedings of the 4th Compositional Data Analysis Workshop, Girona, Spain. https://arxiv.org/pdf/1106.1451.pdf

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate analysis. Academic Press.

Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.

See Also

alfareg.tune, diri.reg, js.compreg, kl.compreg, ols.compreg, comp.reg

Examples

library(MASS)
x <- as.vector(fgl[1:40, 1])
y <- as.matrix(fgl[1:40, 2:9])
y <- y / rowSums(y)
mod <- alfa.reg(y, x, 0.2)
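
The example above can be extended to the other two functions and to the yb argument. This is a hedged sketch rather than an official example: the values in alphas and interval are illustrative choices, kept strictly positive because zero values in y require \alpha greater than 0. The call alfa(y, 0.2)$aff assumes that the package's alfa() function returns the transformed data in an element named "aff". The code runs only if both packages are installed.

```r
# Illustrative choices, not defaults of the package
alphas   <- c(0.2, 0.5, 1)   # vector of alpha values for alfa.reg2()
interval <- c(0.1, 1)        # search interval endpoints for alfa.reg3()

if (requireNamespace("Compositional", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  fgl <- MASS::fgl
  x <- as.vector(fgl[1:40, 1])
  y <- as.matrix(fgl[1:40, 2:9])
  y <- y / rowSums(y)

  # Single value of alpha: extract the documented elements
  mod <- Compositional::alfa.reg(y, x, 0.2)
  be  <- mod$be              # beta coefficients
  seb <- mod$seb             # standard errors of the coefficients

  # Several values of alpha: one sublist per value
  mods <- Compositional::alfa.reg2(y, x, alphas)

  # Search for alpha within the interval
  mod3 <- Compositional::alfa.reg3(y, x, a = interval)
  best_alpha <- mod3$alfa    # optimal value of alpha

  # Pre-transformed data passed through yb
  # (assumption: alfa() stores the transformed data in $aff)
  yb   <- Compositional::alfa(y, 0.2)$aff
  modb <- Compositional::alfa.reg(y, x, 0.2, yb = yb)
}
```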

[Package Compositional version 6.9 Index]