| flash {flashier} | R Documentation |
Empirical Bayes matrix factorization
Description
Fits an empirical Bayes matrix factorization (see Details for a
description of the model). The resulting fit is referred to as a "flash"
object (short for Factors and Loadings using Adaptive SHrinkage). Two
interfaces are provided. The flash function provides a simple
interface that allows a flash object to be fit in a single pass, while
flash_xxx functions are pipeable functions that allow for more
complex flash objects to be fit incrementally (available functions are
listed below under See Also). See the vignettes and
Examples for usage.
Usage
flash(
data,
S = NULL,
ebnm_fn = ebnm_point_normal,
var_type = 0L,
greedy_Kmax = 50L,
backfit = FALSE,
nullcheck = TRUE,
verbose = 1L
)
Arguments
data |
The observations. Usually a matrix, but can also be a sparse
matrix of class |
S |
The standard errors. Can be |
ebnm_fn |
The function or functions used to solve the empirical Bayes
normal means (EBNM) subproblems. Most importantly, these functions specify
the families of distributions Any EBNM function provided by package |
var_type |
Describes the structure of the estimated residual variance.
Can be Note that if any portion of the residual variance is to be estimated, then
it is usually faster to set |
greedy_Kmax |
The maximum number of factors to be added. This will not
necessarily be the total number of factors added by |
backfit |
A "greedy" fit is performed by adding up to
|
nullcheck |
If |
verbose |
When and how to display progress updates. Set to
|
Details
If Y is an n \times p data matrix, then the rank-one
empirical Bayes matrix factorization model is:
Y = \ell f' + E,
where \ell is an
n-vector of loadings, f is a
p-vector of factors, and E is an
n \times p matrix of residuals (or "errors").
Additionally:
e_{ij} \sim N(0, s_{ij}^2): i = 1, ..., n; j = 1, ..., p
\ell \sim g_\ell \in G_\ell
f \sim g_f \in G_f.
The residual variance parameters s_{ij}^2 are constrained to have
a simple structure and are fit via maximum likelihood. (For example, one
might assume that all standard errors are identical: s_{ij}^2 = s^2
for some s^2 and for all i, j).
The functions g_\ell and g_f are assumed to belong to
some families of priors G_\ell and G_f that are
specified in advance, and are estimated via variational approximation.
The general rank-K empirical Bayes matrix factorization model is:
Y = LF' + E
or
y_{ij} = \sum_k \ell_{ik} f_{jk} + e_{ij}: i = 1, ..., n; j = 1, ..., p,
where L is now a matrix of loadings and F is a matrix of
factors.
Separate priors g_\ell^{(k)} and g_f^{(k)} are estimated via
empirical Bayes, and different prior families may be used for different
values of k. In general, then:
e_{ij} \sim N(0, s_{ij}^2): i = 1, ..., n; j = 1, ..., p
\ell_{ik} \sim g_\ell^{(k)} \in G_\ell^{(k)}: i = 1, ..., n; k = 1, ..., K
f_{ik} \sim g_f^{(k)} \in G_f^{(k)}: j = 1, ..., p; k = 1, ..., K.
Typically, G_\ell^{(k)} and G_f^{(k)} will be closed under
scaling, in which case \ell_k and f_k are only identifiable
up to a scaling factor d_k. In other words, we can write:
Y = LDF' + E,
where D is a diagonal matrix with diagonal entries d_1, ..., d_K.
The model can then be made identifiable by constraining the scale of
\ell_k and f_k for k = 1, ..., K.
Value
A flash object. Contains elements:
n_factorsThe total number of factor/loadings pairs
Kin the fitted model.pveThe proportion of variance explained by each factor/loadings pair. Since factors and loadings are not required to be orthogonal, this should be interpreted loosely: for example, the total proportion of variance explained could be larger than 1.
elboThe variational lower bound achieved by the fitted model.
residuals_sdEstimated residual standard deviations (these include any variance component given as an argument to
S).L_pm, L_psd, L_lfsrPosterior means, standard deviations, and local false sign rates for loadings
L.F_pm, F_psd, F_lfsrPosterior means, standard deviations, and local false sign rates for factors
F.L_ghatThe fitted priors on loadings
\hat{g}_\ell^{(k)}.F_ghatThe fitted priors on factors
\hat{g}_f^{(k)}.samplerA function that takes a single argument
nsampand returnsnsampsamples from the posterior distributions for factorsFand loadingsL.flash_fitA
flash_fitobject. Used byflashwhen fitting is not performed all at once, but incrementally via calls to variousflash_xxxfunctions.
The following methods are available:
fitted.flashReturns the "fitted values"
E(LF') = E(L) E(F)'.residuals.flashReturns the expected residuals
Y - E(LF') = Y - E(L) E(F)'.ldf.flashReturns an
LDFdecomposition (see Details above), with columns ofLandFscaled as specified by the user.
References
Wei Wang and Matthew Stephens (2021). "Empirical Bayes matrix factorization." Journal of Machine Learning Research 22, 1–40.
See Also
flash_init, flash_greedy,
flash_backfit, and flash_nullcheck. For more
advanced functionality, see flash_factors_init,
flash_factors_fix, flash_factors_set_to_zero,
flash_factors_remove, flash_set_verbose, and
flash_set_conv_crit.
For extracting useful data from flash objects, see
fitted.flash, residuals.flash, and
ldf.flash.
Examples
data(gtex)
# Fit up to 3 factors and backfit.
fl <- flash(gtex, greedy_Kmax = 3L, backfit = TRUE)
# This is equivalent to the series of calls:
fl <- flash_init(gtex) %>%
flash_greedy(Kmax = 3L) %>%
flash_backfit() %>%
flash_nullcheck()
# Fit a unimodal distribution with mean zero to each set of loadings
# and a scale mixture of normals with mean zero to each factor.
fl <- flash(gtex,
ebnm_fn = c(ebnm_unimodal,
ebnm_normal_scale_mixture),
greedy_Kmax = 3)
# Fit point-laplace priors using a non-default optimization method.
fl <- flash(gtex,
ebnm_fn = flash_ebnm(prior_family = "point_laplace",
optmethod = "trust"),
greedy_Kmax = 3)
# Fit a "Kronecker" (rank-one) variance structure (this can be slow).
fl <- flash(gtex, var_type = c(1, 2), greedy_Kmax = 3L)