sCCA {sRDA} | R Documentation |
Sparse Canonical Correlation analysis
Description
Sparse canonical correlation analysis for high-dimensional (biomedical) data. The function takes two data sets and returns pairs of canonical variates: each variate is a linear combination of the variables in one data set, and the two variates in a pair are chosen to be maximally correlated. Elastic net penalization (with its variants UST, ridge and lasso penalization) is implemented for sparsity and smoothness, with a built-in cross-validation procedure to obtain the optimal penalization parameters. It is possible to obtain multiple canonical variate pairs that are orthogonal to each other.
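For intuition about what a canonical variate pair is, the following sketch runs classical (non-sparse) canonical correlation analysis with cancor() from base R's stats package. It is not the sCCA algorithm and applies no penalization; the simulated data and the object names (z, xi, eta, first_cor) are illustrative assumptions.

```r
# Classical, non-sparse CCA with base R; for intuition only, not sCCA itself.
set.seed(1)
n <- 100
z <- rnorm(n)  # shared latent signal driving one variable in each set

# X has 4 columns, Y has 3; only the first column of each carries the signal
X <- cbind(z + rnorm(n, sd = 0.5), matrix(rnorm(n * 3), n, 3))
Y <- cbind(z + rnorm(n, sd = 0.5), matrix(rnorm(n * 2), n, 2))

cc <- cancor(X, Y)

# Scores of the first canonical variate pair: centered data times weights
xi  <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef[, 1]
eta <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef[, 1]

# The correlation of the first pair equals the first canonical correlation
first_cor <- abs(cor(xi, eta))[1, 1]
all.equal(first_cor, cc$cor[1])
```

In this classical setting every variable receives a non-zero weight. The penalization options of sCCA are meant for the case where only a few weights should be non-zero.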
Usage
sCCA(predictor, predicted, penalization = "enet", ridge_penalty = 1,
nonzero = 1, max_iterations = 100, tolerance = 1 * 10^-20,
cross_validate = FALSE, parallel_CV = TRUE, nr_subsets = 10,
multiple_LV = FALSE, nr_LVs = 1)
Arguments
predictor |
The n*p matrix of the predictor data set |
predicted |
The n*q matrix of the predicted data set |
penalization |
The penalization method applied during the analysis (none, enet or ust) |
ridge_penalty |
The ridge penalty parameter of the predictor set's latent variable used for enet or ust (an integer if cross_validate = FALSE, a list otherwise) |
nonzero |
The number of non-zero weights of the predictor set's latent variable (an integer if cross_validate = FALSE, a list otherwise) |
max_iterations |
The maximum number of iterations of the algorithm |
tolerance |
Convergence criterion (tolerance) |
cross_validate |
K-fold cross-validation to find the optimal penalty parameters (TRUE or FALSE) |
parallel_CV |
Run the cross-validation in parallel (TRUE or FALSE) |
nr_subsets |
Number of subsets for k-fold cross validation |
multiple_LV |
Obtain multiple latent variable pairs (TRUE or FALSE) |
nr_LVs |
Number of latent variables to be obtained |
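The interplay of nonzero, ridge_penalty and nr_subsets under cross_validate = TRUE can be pictured with the base-R sketch below. It assumes that cross-validation evaluates every combination of the candidate values on k folds; the package's internal procedure is not shown on this page, so this is an illustration and not its implementation. The names grid and fold_id are hypothetical.

```r
# Candidate values, in the form used in the Examples section
nonzero_candidates <- c(5, 10, 15)
ridge_candidates   <- c(0.1, 1)

# Assumption: each (ridge_penalty, nonzero) combination is one candidate model
grid <- expand.grid(ridge_penalty = ridge_candidates,
                    nonzero = nonzero_candidates)

# Assign n observations at random to nr_subsets folds of near-equal size
set.seed(1)
n <- 250
nr_subsets <- 10
fold_id <- sample(rep(seq_len(nr_subsets), length.out = n))

nrow(grid)      # number of candidate parameter combinations
table(fold_id)  # number of observations per fold
```

Under this assumption the number of model fits grows as the number of combinations times nr_subsets, which is a reason to keep the candidate lists short or to leave parallel_CV = TRUE.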
Value
An object of class "sRDA".
XI |
Predictor set's latent variable scores |
ETA |
Predicted set's latent variable scores |
ALPHA |
Weights of the predictor set's latent variable |
BETA |
Weights of the predicted set's latent variable |
nr_iterations |
Number of iterations run before convergence (or the maximum number of iterations) |
SOLVE_XIXI |
Inverse of the predictor set's latent variable variance matrix |
iterations_crts |
The convergence criterion value (a small positive tolerance) |
sum_absolute_betas |
Sum of the absolute values of beta weights |
ridge_penalty |
The ridge penalty parameter used for the model |
nr_nonzeros |
The number of nonzero alpha weights in the model |
nr_latent_variables |
The number of latent variable pairs in the model |
CV_results |
The detailed results of cross validations (if cross_validate is TRUE) |
Author(s)
Attila Csala
Examples
# generate data with a few highly correlated variables
dataXY <- generate_data(nr_LVs = 2,
                        n = 250,
                        nr_correlated_Xs = c(5, 20),
                        nr_uncorrelated_Xs = 250,
                        mean_reg_weights_assoc_X = c(0.9, 0.5),
                        sd_reg_weights_assoc_X = c(0.05, 0.05),
                        Xnoise_min = -0.3,
                        Xnoise_max = 0.3,
                        nr_correlated_Ys = c(10, 15),
                        nr_uncorrelated_Ys = 350,
                        mean_reg_weights_assoc_Y = c(0.9, 0.6),
                        sd_reg_weights_assoc_Y = c(0.05, 0.05),
                        Ynoise_min = -0.3,
                        Ynoise_max = 0.3)
# separate predictor and predicted sets
X <- dataXY$X
Y <- dataXY$Y
# run sCCA
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = 5,
                ridge_penalty = 1, penalization = "ust")
# check first 10 weights of X
CCA.res$ALPHA[1:10]
## Not run:
# run sCCA with cross-validation to determine the best penalization parameters
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5, 10, 15),
                ridge_penalty = c(0.1, 1), penalization = "enet",
                cross_validate = TRUE, parallel_CV = TRUE)
# check first 10 weights of X
CCA.res$ALPHA[1:10]
CCA.res$ridge_penalty
CCA.res$nr_nonzeros
# obtain multiple latent variables
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5, 10, 15),
                ridge_penalty = c(0.1, 1), penalization = "enet",
                cross_validate = TRUE, parallel_CV = TRUE,
                multiple_LV = TRUE, nr_LVs = 2, max_iterations = 5)
# check first 10 weights of X in the first two components
CCA.res$ALPHA[[1]][1:10]
CCA.res$ALPHA[[2]][1:10]
# latent variables are orthogonal to each other
t(CCA.res$XI[[1]]) %*% CCA.res$XI[[2]]
## End(Not run)
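The orthogonality check at the end of the examples returns a 1 x 1 matrix that, in floating-point arithmetic, is rarely exactly zero. The sketch below shows one way to compare such an inner product against a small tolerance. It uses stand-in vectors built in base R, not actual sCCA output, and the names xi1, xi2 and the tolerance 1e-8 are assumptions chosen for illustration.

```r
# Stand-in score vectors; with real output these would be taken from
# CCA.res$XI[[1]] and CCA.res$XI[[2]].
set.seed(1)
n <- 250
xi1 <- rnorm(n)
raw <- rnorm(n)

# Remove the component of raw along xi1 (one Gram-Schmidt step),
# so that xi2 is orthogonal to xi1 up to rounding error
xi2 <- raw - xi1 * sum(xi1 * raw) / sum(xi1^2)

# Inner product as a plain number rather than a 1 x 1 matrix
inner <- drop(crossprod(xi1, xi2))

# Treat the vectors as orthogonal if the inner product is numerically zero
is_orthogonal <- abs(inner) < 1e-8
is_orthogonal
```

A suitable tolerance depends on the scale of the scores, so the threshold used here should be adapted to the data at hand.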