| sCCA {sRDA} | R Documentation | 
Sparse Canonical Correlation analysis
Description
Sparse canonical correlation analysis for high-dimensional (biomedical) data. The function takes two data sets and returns pairs of maximally correlated canonical variates, each a linear combination of the variables in its data set. Elastic net penalization (with its variants UST, ridge, and lasso) is implemented for sparsity and smoothness, with a built-in cross-validation procedure to obtain the optimal penalization parameters. It is possible to obtain multiple canonical variate pairs that are orthogonal to each other.
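To make "maximally correlated canonical variate pairs" concrete, here is a minimal sketch using only base R's unpenalized `cancor()` on simulated data. It does not call `sCCA`, and the simulated `X`, `Y`, and shared signal `z` are assumptions made purely for illustration.

```r
# Illustration with base R only; sCCA itself is not used here.
set.seed(1)
n <- 100
z <- rnorm(n)                                    # shared latent signal
X <- cbind(z + rnorm(n, sd = 0.3), matrix(rnorm(n * 3), n, 3))
Y <- cbind(z + rnorm(n, sd = 0.3), matrix(rnorm(n * 2), n, 2))

fit <- cancor(X, Y)                              # classical, unpenalized CCA

# canonical variates: centered data times the first coefficient vectors
xi  <- scale(X, center = fit$xcenter, scale = FALSE) %*% fit$xcoef[, 1]
eta <- scale(Y, center = fit$ycenter, scale = FALSE) %*% fit$ycoef[, 1]

# the correlation of the first variate pair is the first canonical correlation
first_cor <- abs(cor(xi, eta)[1, 1])
```

The sparse version documented here additionally penalizes the weights, so that only a subset of variables contributes to each variate.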
Usage
sCCA(predictor, predicted, penalization = "enet", ridge_penalty = 1,
  nonzero = 1, max_iterations = 100, tolerance = 1 * 10^-20,
  cross_validate = FALSE, parallel_CV = TRUE, nr_subsets = 10,
  multiple_LV = FALSE, nr_LVs = 1)
Arguments
| predictor | The n*p matrix of the predictor data set | 
| predicted | The n*q matrix of the predicted data set | 
| penalization | The penalization method applied during the analysis (none, enet or ust) | 
| ridge_penalty | The ridge penalty parameter of the predictor set's latent variable used for enet or ust (an integer if cross_validate = FALSE, a list otherwise) | 
| nonzero | The number of non-zero weights of the predictor set's latent variable (an integer if cross_validate = FALSE, a list otherwise) | 
| max_iterations | The maximum number of iterations of the algorithm | 
| tolerance | Convergence criterion | 
| cross_validate | K-fold cross validation to find the optimal penalty parameters (TRUE or FALSE) | 
| parallel_CV | Run the cross validation in parallel (TRUE or FALSE) | 
| nr_subsets | Number of subsets for k-fold cross validation | 
| multiple_LV | Obtain multiple latent variable pairs (TRUE or FALSE) | 
| nr_LVs | Number of latent variables to be obtained | 
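The role of `nonzero` can be pictured with a small soft-thresholding sketch. This assumes the penalization behaves like standard univariate soft thresholding; the helper `soft_threshold_k` is hypothetical and is not part of sRDA, so the package's internal procedure may differ.

```r
# Hypothetical helper, not part of sRDA: keep the k largest weights (in
# absolute value) by soft-thresholding at the (k + 1)-th largest magnitude.
soft_threshold_k <- function(w, k) {
  stopifnot(k >= 1, k <= length(w))
  mags <- sort(abs(w), decreasing = TRUE)
  lambda <- if (k < length(w)) mags[k + 1] else 0
  # shrink every weight towards zero by lambda; small weights become exactly 0
  sign(w) * pmax(abs(w) - lambda, 0)
}

w <- c(0.9, -0.1, 0.4, 0.05, -0.7)
w_sparse <- soft_threshold_k(w, k = 2)
```

With tied magnitudes this sketch can return fewer than `k` non-zero weights.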
Value
An object of class "sRDA".
| XI | Predictor set's latent variable scores | 
| ETA | Predicted set's latent variable scores | 
| ALPHA | Weights of the predictor set's latent variable | 
| BETA | Weights of the predicted set's latent variable | 
| nr_iterations | Number of iterations run before convergence (or the maximum number of iterations) | 
| SOLVE_XIXI | Inverse of the predictor set's latent variable variance matrix | 
| iterations_crts | The convergence criterion value (a small positive tolerance) | 
| sum_absolute_betas | Sum of the absolute values of beta weights | 
| ridge_penalty | The ridge penalty parameter used for the model | 
| nr_nonzeros | The number of nonzero alpha weights in the model | 
| nr_latent_variables | The number of latent variable pairs in the model | 
| CV_results | The detailed results of cross validations (if cross_validate is TRUE) | 
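As a rough picture of what a k-fold search over penalty parameters involves, the sketch below builds a candidate grid and a fold assignment in base R. It assumes that every combination of `ridge_penalty` and `nonzero` is a candidate and that observations are split into `nr_subsets` roughly equal folds; the actual layout of `CV_results` is not shown and may differ.

```r
# Illustrative only: candidate grid and fold assignment, not sRDA internals.
grid <- expand.grid(ridge_penalty = c(0.1, 1),
                    nonzero = c(5, 10, 15))      # 2 x 3 = 6 candidates

n <- 250                                         # number of observations
nr_subsets <- 10                                 # number of folds
set.seed(1)
# assign each observation to one of nr_subsets folds, in random order
fold_id <- sample(rep(seq_len(nr_subsets), length.out = n))

fold_sizes <- table(fold_id)
```

Each candidate would then be fitted on all folds but one and scored on the held-out fold, so the cost grows with the number of candidates times the number of folds.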
Author(s)
Attila Csala
Examples
# generate data with a few highly correlated variables
dataXY <- generate_data(nr_LVs = 2,
                           n = 250,
                           nr_correlated_Xs = c(5,20),
                           nr_uncorrelated_Xs = 250,
                           mean_reg_weights_assoc_X =
                             c(0.9,0.5),
                           sd_reg_weights_assoc_X =
                             c(0.05, 0.05),
                           Xnoise_min = -0.3,
                           Xnoise_max = 0.3,
                           nr_correlated_Ys = c(10,15),
                           nr_uncorrelated_Ys = 350,
                           mean_reg_weights_assoc_Y =
                             c(0.9,0.6),
                           sd_reg_weights_assoc_Y =
                             c(0.05, 0.05),
                           Ynoise_min = -0.3,
                           Ynoise_max = 0.3)
# separate predictor and predicted sets
X <- dataXY$X
Y <- dataXY$Y
# run sCCA
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = 5,
  ridge_penalty = 1, penalization = "ust")
# check first 10 weights of X
CCA.res$ALPHA[1:10]
## Not run: 
# run sCCA with cross-validation to determine the best penalization parameters
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5,10,15),
  ridge_penalty = c(0.1,1), penalization = "enet", cross_validate = TRUE,
  parallel_CV = TRUE)
# check first 10 weights of X
CCA.res$ALPHA[1:10]
CCA.res$ridge_penalty
CCA.res$nr_nonzeros
# obtain multiple latent variables
CCA.res <- sCCA(predictor = X, predicted = Y, nonzero = c(5,10,15),
  ridge_penalty = c(0.1,1), penalization = "enet", cross_validate = TRUE,
  parallel_CV = TRUE, multiple_LV = TRUE, nr_LVs = 2, max_iterations = 5)
# check first 10 weights of X in the first two components
CCA.res$ALPHA[[1]][1:10]
CCA.res$ALPHA[[2]][1:10]
# latent variables are orthogonal to each other
t(CCA.res$XI[[1]]) %*% CCA.res$XI[[2]]
## End(Not run)