frcc {FRCC} | R Documentation |
This function implements the Fast Regularized Canonical Correlation Analysis
Description
This function implements the Fast Regularized Canonical Correlation algorithm described in [Cruz-Cano et al., 2014].
The main idea of the algorithm is to use the minimum risk estimators of the correlation matrices described in [Schafer and Strimmer, 2005] when calculating the canonical correlation structure.
It can be considered an extension of the work for two sets of variables (blocks) mentioned in [Tenenhaus and Tenenhaus, 2011].
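The idea of plugging shrunken correlation matrices into the canonical correlation calculation can be illustrated with a minimal base R sketch. This is not the code used by frcc(): the helper names shrink_cor() and sketch_rcca() are hypothetical, and the single fixed shrinkage intensity lambda is an assumption made for illustration, whereas the algorithm described above estimates the shrinkage from the data.

```r
# Illustrative sketch only, NOT the FRCC implementation.
# shrink_cor(), sketch_rcca() and the fixed lambda are hypothetical.

# Shrink the off-diagonal correlations toward zero; keep the unit diagonal.
shrink_cor <- function(R, lambda) {
  R_shrunk <- (1 - lambda) * R
  diag(R_shrunk) <- 1
  R_shrunk
}

# Canonical correlations computed from shrunken correlation matrices.
sketch_rcca <- function(X, Y, lambda) {
  Rxx <- shrink_cor(cor(X), lambda)
  Ryy <- shrink_cor(cor(Y), lambda)
  Rxy <- (1 - lambda) * cor(X, Y)   # cross-correlations shrunk the same way
  Ux <- chol(Rxx)                   # Rxx = t(Ux) %*% Ux
  Uy <- chol(Ryy)                   # Ryy = t(Uy) %*% Uy
  # Whiten both blocks; the singular values of K are the canonical correlations.
  K <- solve(t(Ux)) %*% Rxy %*% solve(Uy)
  svd(K)$d
}

set.seed(1)
n <- 20; p <- 4; q <- 3
X <- matrix(rnorm(n * p), n, p)
Y <- matrix(rnorm(n * q), n, q)

can_cor <- sketch_rcca(X, Y, lambda = 0.3)
print(can_cor)
```

With lambda = 0 the sketch reduces to classical canonical correlation analysis, so its output can be compared against stats::cancor(X, Y)$cor as a sanity check.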
Usage
frcc(X, Y)
Arguments
X |
numeric matrix (n by p) which contains the observations on the X variables. |
Y |
numeric matrix (n by q) which contains the observations on the Y variables. |
Value
A list with the following components of the Canonical Structure:
cor |
Canonical correlations. |
p_values |
The corresponding p-values for each of the canonical correlations. |
canonical_weights_X |
The canonical weights for the variables of the dataset X. |
canonical_weights_Y |
The canonical weights for the variables of the dataset Y. |
canonical_factor_loadings_X |
The inter-set canonical factor loadings for the variables of the dataset X. |
canonical_factor_loadings_Y |
The inter-set canonical factor loadings for the variables of the dataset Y. |
Author(s)
Raul Cruz-Cano
References
Cruz-Cano, R.; Lee, M.L.T. (2014). Fast Regularized Canonical Correlation Analysis. Computational Statistics & Data Analysis 70, 88-100, ISSN 0167-9473, https://doi.org/10.1016/j.csda.2013.09.020.
Schafer, J.; Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology 4:14, Article 32.
Tenenhaus, A.; Tenenhaus, M. (2011). Regularized Generalized Canonical Correlation Analysis. Psychometrika 76:2, DOI: 10.1007/S11336-011-9206-8.
Examples
# Example #1 Multivariate Normal Data
p<-10
q<-10
n<-50
res<-generate_multivariate_normal_sample(p,q,n)
X<-res$X
Y<-res$Y
rownames(X)<-c(1:n)
colnames(X)<-c(1:p)
colnames(Y)<- c(1:q)
my_res<-frcc(X,Y)
print(my_res)
# Example #2 Soil Specification Data
data(soilspec)
list_of_units_to_be_used<-sample(1:nrow(soilspec),14)
X<- soilspec[list_of_units_to_be_used,2:9]
Y<- soilspec[list_of_units_to_be_used,10:15]
colnames(X)<-c("H. pubescens", "P. bertolonii", "T. pratense",
"P. sanguisorba", "R. squarrosus", "H. pilosella", "B. media","T. drucei")
colnames(Y)<- c("d","P","K","d x P", "d x K","P x K")
my_res<-frcc(X,Y)
grDevices::dev.new()
plot_variables(my_res,1,2)
# Example #3 NCI-60 microRNA Data
data("Topoisomerase_II_Inhibitors")
data("microRNA")
my_res <- frcc(t(microRNA),-1*t(Topoisomerase_II_Inhibitors))
# Keep only the first two characters of each column name
colnames(microRNA) <- substr(colnames(microRNA), 1, 2)