alfapcr.tune {Compositional} | R Documentation
Tuning the number of PCs in the PCR with compositional data using the α-transformation
Description
A cross-validation procedure for choosing the number of principal components in principal component regression when the predictor variables are compositional data transformed with the α-transformation.
Usage
alfapcr.tune(y, x, model = "gaussian", nfolds = 10, maxk = 50, a = seq(-1, 1, by = 0.1),
folds = NULL, ncores = 1, graph = TRUE, col.nu = 15, seed = NULL)
Arguments
y |
A vector with either continuous, binary or count data. |
x |
A matrix with the predictor variables, the compositional data. Zero values are allowed. |
model |
The type of regression model to fit. The possible values are "gaussian", "binomial" and "poisson". |
nfolds |
The number of folds for the K-fold cross validation, set to 10 by default. |
maxk |
The maximum number of principal components to check. |
a |
A vector with a grid of values of the power transformation parameter α; each value must lie between -1 and 1. If zero values are present in the data, all values must be greater than 0. If α = 0, the isometric log-ratio transformation is applied. |
folds |
A list with the folds, if you already have one. If left NULL, the folds are created internally. |
ncores |
The number of cores to use. For heavy computations, or to reduce waiting time, more than one core (if available) is suggested. Using several cores is advisable only with many observations and/or many variables; otherwise it will slow down the process. |
graph |
If graph is TRUE (default value) a filled contour plot will appear. |
col.nu |
A number parameter for the filled contour plot, taken into account only if graph is TRUE. |
seed |
You can specify your own seed number here or leave it NULL. |
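As a rough illustration of the folds argument, the base R sketch below builds a list of folds by hand. It assumes that each list element holds the row indices of one test fold; that layout is an assumption and is not stated on this page. make_folds is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper (not part of the Compositional package):
# split the row indices 1..n into nfolds random folds of nearly equal size.
make_folds <- function(n, nfolds = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Shuffle the indices, then deal them out to the folds in turn
  idx <- sample(n)
  fold_id <- rep(seq_len(nfolds), length.out = n)
  unname(split(idx, fold_id))
}

# 214 is the number of rows of the fgl data used in the Examples
folds <- make_folds(n = 214, nfolds = 10, seed = 1)
lengths(folds)  # sizes of the 10 folds
```

Passing the same seed reproduces the same folds, which makes it possible to compare several models on identical splits.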
Details
The α-transformation is first applied to the compositional data, and then the function "pcr.tune" or "glmpcr.tune" is called.
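To make the first step concrete, here is a minimal base R sketch of the α-transformation as defined in Tsagris, Preston and Wood (2011): each row x with D parts is mapped to H (D u - 1) / α, where u = x^α / sum(x^α) and H is the (D - 1) x D Helmert sub-matrix. For α = 0 the isometric log-ratio transformation is used instead. The names helmert_sub and alpha_transform are hypothetical helpers written for this illustration; the package's own implementation (see alfa) may differ in its details.

```r
# Hypothetical helper: Helmert sub-matrix of dimension (D - 1) x D,
# i.e. the Helmert matrix without its first (constant) row.
helmert_sub <- function(D) {
  h <- matrix(0, D - 1, D)
  for (i in seq_len(D - 1)) {
    h[i, seq_len(i)] <- 1 / sqrt(i * (i + 1))
    h[i, i + 1] <- -i / sqrt(i * (i + 1))
  }
  h
}

# Hypothetical helper: alpha-transformation of the rows of a
# compositional matrix x (each row sums to 1).
alpha_transform <- function(x, a) {
  D <- ncol(x)
  if (a != 0) {
    xa <- x^a
    u <- xa / rowSums(xa)     # power-transformed composition
    z <- (D * u - 1) / a      # rescale and centre
  } else {
    lx <- log(x)              # requires strictly positive parts
    z <- lx - rowMeans(lx)    # centred log-ratio
  }
  z %*% t(helmert_sub(D))     # project onto D - 1 coordinates
}

x <- matrix(c(0.2, 0.3, 0.5,
              0.1, 0.6, 0.3), ncol = 3, byrow = TRUE)
z <- alpha_transform(x, a = 0.5)
dim(z)  # 2 rows, D - 1 = 2 columns
```

In this sketch a zero part stays finite only when a > 0, since 0^a is infinite for negative a and log(0) is undefined, which matches the restriction on the argument a.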
Value
If graph is TRUE, a filled contour plot will appear. The function returns a list including:
mspe |
The MSPE, where rows correspond to the α values and columns to the number of principal components. |
best.par |
The best pair of α and number of principal components. |
performance |
The minimum mean squared error of prediction. |
runtime |
The time required by the cross-validation procedure. |
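The relation between mspe, best.par and performance can be pictured with a toy matrix. The numbers below are made up purely for illustration, and the row and column labels are an assumption about the layout; the object returned by alfapcr.tune may be labelled differently.

```r
# Toy MSPE matrix with made-up numbers, for illustration only:
# rows are alpha values, columns are numbers of principal components.
mspe <- matrix(c(0.90, 0.70, 0.65,
                 0.85, 0.60, 0.62,
                 0.88, 0.66, 0.64),
               nrow = 3, byrow = TRUE,
               dimnames = list(alpha = c("0.1", "0.5", "1"),
                               PC = c("1", "2", "3")))

# Locate the smallest entry and read off its row and column labels
pos <- which(mspe == min(mspe), arr.ind = TRUE)[1, ]
best_alpha <- as.numeric(rownames(mspe)[pos[1]])
best_k <- as.numeric(colnames(mspe)[pos[2]])

c(alpha = best_alpha, k = best_k)  # the best pair in this toy matrix
min(mspe)                          # the corresponding minimum MSPE
```

If several cells tie for the minimum, this sketch keeps the first one in column-major order.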
Author(s)
Michail Tsagris.
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.
References
Tsagris M. (2015). Regression analysis with compositional data containing zero values. Chilean Journal of Statistics, 6(2): 47-57. https://arxiv.org/pdf/1508.01913v1.pdf
Tsagris M.T., Preston S. and Wood A.T.A. (2011). A data-based power transformation for compositional data. In Proceedings of the 4th Compositional Data Analysis Workshop, Girona, Spain. https://arxiv.org/pdf/1106.1451.pdf
Jolliffe I.T. (2002). Principal Component Analysis.
See Also
alfa, profile, alfa.pcr, pcr.tune, glmpcr.tune, glm
Examples
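library(Compositional)
library(MASS)
# Response: refractive index (RI) from the forensic glass data
y <- as.vector(fgl[, 1])
# Predictors: the eight oxide percentages, closed so that each row sums to 1
x <- as.matrix(fgl[, 2:9])
x <- x / rowSums(x)
# Cross-validate over the grid of alpha values and the number of PCs
mod <- alfapcr.tune(y, x, nfolds = 10, maxk = 50, a = seq(-1, 1, by = 0.1))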