EsaBcv {esaBcv} | R Documentation |
Estimate Latent Factor Matrix
Description
Find the best number of factors using Bi-Cross-Validation (BCV) with Early-Stopping-Alternation (ESA), and then estimate the factor matrix.
Usage
EsaBcv(Y, X = NULL, r.limit = 20, niter = 3, nRepeat = 12, only.r = F,
svd.method = "fast", center = F)
Arguments
Y |
observed data matrix. p is the number of variables and
n is the sample size. Dimension is |
X |
the known predictors of size |
r.limit |
the maximum number of factors to try. Default is 20. Can be set to Inf. |
niter |
the number of iterations for ESA. Default is 3. |
nRepeat |
number of repeats of BCV. In other words, the random partition of |
only.r |
logical, whether to estimate and return only the number of factors. |
svd.method |
either "fast", "propack" or "standard".
"fast" is using the |
center |
logical, whether to add an intercept term in the model. Default is FALSE. |
Details
The model is
Y = 1 \mu' + X \beta + n^{1/2}U D V' + E \Sigma^{1/2}
where D and \Sigma are diagonal matrices, U and V are orthogonal, and \mu'
and V' represent _mu transposed_ and _V transposed_ respectively.
The entries of E are assumed to be i.i.d. standard Gaussian.
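As an illustration of the model, the following base R sketch simulates a data matrix with no known predictors and no intercept, so that Y = n^{1/2} U D V' + E \Sigma^{1/2}. It assumes Y has n rows (samples) and p columns (variables), as read off the model formula. The values of n, p, r, the factor strengths and the noise variances are arbitrary choices made for this example and do not come from the package.

```r
set.seed(1)
n <- 50   # sample size (rows of Y); arbitrary for this sketch
p <- 200  # number of variables (columns of Y); arbitrary
r <- 3    # true number of latent factors; arbitrary

# Matrices with orthonormal columns, obtained from the QR
# decomposition of Gaussian matrices
U <- qr.Q(qr(matrix(rnorm(n * r), n, r)))  # n x r
V <- qr.Q(qr(matrix(rnorm(p * r), p, r)))  # p x r
D <- diag(c(3, 2, 1))                      # r x r diagonal factor strengths

# Diagonal of Sigma: one noise variance per variable (heteroscedastic)
sigma2 <- runif(p, 0.5, 2)

E <- matrix(rnorm(n * p), n, p)            # i.i.d. standard Gaussian entries
signal <- sqrt(n) * U %*% D %*% t(V)       # n^{1/2} U D V'
noise <- sweep(E, 2, sqrt(sigma2), `*`)    # E Sigma^{1/2}: column j scaled by sqrt(sigma2[j])
Y <- signal + noise
```

A matrix simulated this way can be passed to EsaBcv to check whether the estimated number of factors is close to the r used in the simulation.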
The model assumes heteroscedastic noise and works especially well for
high-dimensional data. The method is based on Owen and Wang (2015). Note that
when a non-null X is given or centering the data is required (which is
essentially adding a known covariate of all 1s), identifiability requires that
<X, U> = 0 or <1, U> = 0, respectively. The method then first rotates
the data matrix to remove the known predictors or centers, and uses
the last n - k (or n - k - 1 if centering is required) samples to
estimate the latent factors. The rotation idea first appears in Sun et al. (2012).
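The rotation step can be sketched in base R with a QR decomposition. This is only an illustration of the idea under stated assumptions, not the package's internal code: X is taken to be an n x k matrix of full column rank, and Z is a hypothetical stand-in for the factor-plus-noise part of the model. After multiplying by the transpose of the full orthogonal matrix Q from the QR decomposition of X, the last n - k rows of the rotated data no longer depend on X \beta.

```r
set.seed(2)
n <- 30; p <- 8; k <- 2                      # arbitrary sizes for this sketch
X <- matrix(rnorm(n * k), n, k)              # known predictors (n x k)
beta <- matrix(rnorm(k * p), k, p)           # coefficients (k x p)
Z <- matrix(rnorm(n * p), n, p)              # stand-in for factors + noise
Y <- X %*% beta + Z

# Full n x n orthogonal matrix; its first k columns span the column space of X
Q <- qr.Q(qr(X), complete = TRUE)

Y.rot <- crossprod(Q, Y)                     # Q'Y
Z.rot <- crossprod(Q, Z)                     # Q'Z

# Q'X is zero in rows k+1..n, so these rows of Q'Y equal the same rows of Q'Z
Y.tail <- Y.rot[(k + 1):n, , drop = FALSE]   # (n - k) x p, free of X beta
```

When centering is also required, the same construction applied to `cbind(1, X)` leaves n - k - 1 rows.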
Value
EsaBcv returns an object of class "esabcv".
The function plot plots the cross-validation results and points out the
number of factors estimated.
An object of class "esabcv" is a list containing the following components:
best.r |
the estimated best number of factors |
estSigma |
the diagonal entries of estimated |
estU |
the estimated |
estD |
the estimated diagonal entries of |
estV |
the estimated |
beta |
the estimated |
estS |
the estimated signal (factor) matrix
|
mu |
the sample centers of each variable which is a vector of length
|
max.r |
the actual maximum number of factors used. For the details of how this is decided, please refer to Owen and Wang (2015) |
result.list |
a matrix with dimension |
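A minimal sketch of how the returned list might be inspected, using only the component names documented above and the toy input from the Examples section. The call is guarded so that the code does nothing when the esaBcv package is not installed; the variable names `has.esabcv` and `fit` are chosen for this example.

```r
# TRUE only if the esaBcv package is available in this R installation
has.esabcv <- requireNamespace("esaBcv", quietly = TRUE)

if (has.esabcv) {
  set.seed(3)
  Y <- matrix(rnorm(100), nrow = 10)  # same toy input as in the Examples section
  fit <- esaBcv::EsaBcv(Y)

  print(class(fit))   # class of the returned object
  print(fit$best.r)   # estimated best number of factors
  print(fit$max.r)    # actual maximum number of factors used
  plot(fit)           # cross-validation results
}
```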
References
Art B. Owen and Jingshu Wang (2015), Bi-cross-validation for factor analysis, http://arxiv.org/abs/1503.03515
Yunting Sun, Nancy R. Zhang and Art B. Owen (2012), Multiple hypothesis testing adjusted for latent variables, with an application to the AGEMAP gene expression data, The Annals of Applied Statistics, 6(4): 1664-1688
See Also
Examples
Y <- matrix(rnorm(100), nrow = 10)
EsaBcv(Y)