regcor {FMradio}    R Documentation
Regularized correlation matrix estimation
Description
regcor is a function that determines the optimal penalty value and, subsequently, the optimal Ledoit-Wolf-type regularized correlation matrix using K-fold cross-validation of the negative log-likelihood.
Usage
regcor(X, fold = 5, verbose = TRUE)
Arguments
X: A (possibly centered and scaled and possibly subsetted) data matrix.

fold: A numeric or integer indicating the number of folds used in the cross-validation.

verbose: A logical indicating if progress should be printed on-screen.
Details
This function estimates a Ledoit-Wolf-type (Ledoit & Wolf, 2004) regularized correlation matrix. The optimal penalty value is determined internally by K-fold cross-validation of the negative log-likelihood function: the penalty value at which the K-fold cross-validated negative log-likelihood score is minimized is deemed optimal. The procedure is efficient as it makes use of the Brent root-finding procedure (Brent, 1971), as implemented in the optim function. The function outputs the optimal value for the penalty parameter and the regularized correlation matrix under this optimal penalty value. See Peeters et al. (2019) for further details.
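For intuition only, the following is a minimal sketch of the idea described above, not the FMradio internals: the sample correlation matrix is shrunken towards the identity (a Ledoit-Wolf-type target) and the penalty is chosen by K-fold cross-validation of the Gaussian negative log-likelihood, minimized with the Brent method of optim. All names and implementation choices below (shrinkCor, cvNLL, the deterministic fold assignment, the identity target) are illustrative assumptions.

## Hypothetical sketch of penalty selection by K-fold cross-validation of the
## negative log-likelihood (illustration only; not the regcor implementation)
shrinkCor <- function(R, lambda) {
  ## convex combination of the sample correlation matrix and the identity
  (1 - lambda) * R + lambda * diag(nrow(R))
}
cvNLL <- function(lambda, X, fold = 5) {
  ## deterministic fold assignment keeps the objective smooth for Brent
  idx <- rep(seq_len(fold), length.out = nrow(X))
  mean(sapply(seq_len(fold), function(k) {
    Rtrain <- shrinkCor(cor(X[idx != k, , drop = FALSE]), lambda)
    Stest  <- cor(X[idx == k, , drop = FALSE])
    ## Gaussian negative log-likelihood: log-determinant plus trace term
    c(determinant(Rtrain, logarithm = TRUE)$modulus +
        sum(diag(Stest %*% solve(Rtrain))))
  }))
}
set.seed(1)
Xtoy <- matrix(rnorm(20 * 10), nrow = 20)   ## toy data: 20 observations, 10 features
optim(par = .5, fn = cvNLL, X = Xtoy, method = "Brent",
      lower = 1e-4, upper = 1)$par          ## cross-validated penalty (sketch only)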
The optimal penalty value can be used to assess the conditioning of the estimated regularized correlation matrix using, for example, a condition number plot (Peeters, van de Wiel, & van Wieringen, 2016).
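As a rough stand-in for such a plot, the spectral condition number of the returned matrix can be computed directly with base R's kappa; here RegR denotes a regcor output, as obtained in the Examples below.

## Crude conditioning check (not the full condition number plot);
## RegR is assumed to be a regcor() result, as in the Examples below
kappa(RegR$optCor, exact = TRUE)   ## ratio of largest to smallest eigenvalue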
The regularized correlation matrix under the optimal penalty can serve as the input to functions that assess factorability (SA), evaluate optimal choices of the latent common factor dimensionality (e.g., dimGB), and perform maximum likelihood factor analysis (mlFA).
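To make this hand-off concrete, a hedged sketch of that downstream workflow is given below; it assumes (check the respective help pages) that SA, dimGB, and mlFA each accept the correlation matrix as their first argument and that mlFA additionally takes the chosen latent dimension m.

## Hypothetical downstream workflow (argument forms are assumptions;
## consult the SA, dimGB, and mlFA help pages)
RegR <- regcor(DataSubset, fold = 5)   ## DataSubset as in the Examples below
SA(RegR$optCor)                        ## assess factorability
dimGB(RegR$optCor)                     ## assess latent factor dimensionality
fit <- mlFA(RegR$optCor, m = 2)        ## ML factor analysis with, e.g., 2 factors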
Value
The function returns an object of class list:

$optPen: A numeric giving the optimal value of the penalty parameter.

$optCor: A matrix giving the regularized correlation matrix under the optimal value of the penalty parameter.
Note
Note that, for argument X, the observations are expected to be in the rows and the features are expected to be in the columns.
Author(s)
Carel F.W. Peeters <cf.peeters@vumc.nl>
References
Brent, R.P. (1971). An Algorithm with Guaranteed Convergence for Finding a Zero of a Function. Computer Journal 14: 422–425.
Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88: 365–411.
Peeters, C.F.W. et al. (2019). Stable prediction with radiomics data. arXiv:1903.11696 [stat.ML].
Peeters, C.F.W., van de Wiel, M.A., & van Wieringen, W.N. (2016). The spectral condition number plot for regularization parameter determination. arXiv:1608.04123v1 [stat.CO].
See Also
RF, subSet, SA, dimGB, mlFA
Examples
library(FMradio)

## Generate some (high-dimensional) data
p <- 25
n <- 10
set.seed(333)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- letters[1:p]

## Get correlation matrix
R <- cor(X)
## Redundancy visualization, at threshold value .9
radioHeat(R, diag = FALSE, threshold = TRUE, threshvalue = .9)
## Redundancy-filtering of correlation matrix
Rfilter <- RF(R, t = .9)
dim(Rfilter)
## Subsetting data
DataSubset <- subSet(X, Rfilter)
dim(DataSubset)
## Obtain regularized correlation matrix
RegR <- regcor(DataSubset, fold = 5, verbose = TRUE)
RegR$optPen ## optimal penalty-value
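## A few additional, optional base-R checks on the output (illustrative only)
dim(RegR$optCor)           ## dimension matches the redundancy-filtered data
isSymmetric(RegR$optCor)   ## regularized correlation matrix is symmetric
min(eigen(RegR$optCor, symmetric = TRUE, only.values = TRUE)$values)
                           ## > 0: positive definite after regularization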