runUINMF {rliger} | R Documentation |
Perform Mosaic iNMF (UINMF) on scaled datasets with unshared features
Description
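Performs mosaic integrative non-negative matrix factorization (UINMF) (A.R.
Kriebel, 2022) using block coordinate descent (alternating non-negative
least squares, ANLS) to return factorized $H$, $W$, $V$ and $U$ matrices.
The objective function is stated as

$$\arg\min_{H\ge0,\,W\ge0,\,V\ge0,\,U\ge0}\sum_{i=1}^{d}\left\|\begin{bmatrix}E_i\\P_i\end{bmatrix}-\left(\begin{bmatrix}W\\0\end{bmatrix}+\begin{bmatrix}V_i\\U_i\end{bmatrix}\right)H_i\right\|_F^2+\lambda\sum_{i=1}^{d}\left\|\begin{bmatrix}V_i\\U_i\end{bmatrix}H_i\right\|_F^2$$

where $E_i$ is the input non-negative matrix of the $i$'th dataset,
$P_i$ is the input non-negative matrix for the unshared features, and
$d$ is the total number of datasets. $E_i$ is of size $m \times n_i$ for
$m$ shared features and $n_i$ cells, $P_i$ is of size $u_i \times n_i$ for
$u_i$ unshared features, $H_i$ is of size $k \times n_i$, $V_i$ is of size
$m \times k$, $W$ is of size $m \times k$ and $U_i$ is of size
$u_i \times k$.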
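As a rough illustration of the objective and the matrix sizes involved, the sketch below evaluates the UINMF loss on small random matrices in base R. It stacks the shared-feature data E_i on top of the unshared-feature data P_i, pads W with zero rows, and adds the lambda-weighted penalty. The helper names `rmat` and `uinmfObjective` and all toy sizes are assumptions made for this example; this is not the RcppPlanc implementation that runUINMF calls.

```r
# Toy evaluation of the UINMF objective; illustrative only, not rliger code.
set.seed(1)
m <- 6                 # shared features
k <- 2                 # factors
n <- c(4, 5)           # cells per dataset (n_i)
u <- c(3, 2)           # unshared features per dataset (u_i)
lambda <- 5
d <- length(n)

# Hypothetical helper: random non-negative matrix with nr rows and nc columns
rmat <- function(nr, nc) matrix(runif(nr * nc), nr, nc)

E <- lapply(seq_len(d), function(i) rmat(m, n[i]))     # m   x n_i
P <- lapply(seq_len(d), function(i) rmat(u[i], n[i]))  # u_i x n_i
W <- rmat(m, k)                                        # m   x k (shared)
V <- lapply(seq_len(d), function(i) rmat(m, k))        # m   x k
U <- lapply(seq_len(d), function(i) rmat(u[i], k))     # u_i x k
H <- lapply(seq_len(d), function(i) rmat(k, n[i]))     # k   x n_i

uinmfObjective <- function(E, P, W, V, U, H, lambda) {
  total <- 0
  for (i in seq_along(E)) {
    X  <- rbind(E[[i]], P[[i]])                        # stacked data
    W0 <- rbind(W, matrix(0, nrow(U[[i]]), ncol(W)))   # W padded with zero rows
    VU <- rbind(V[[i]], U[[i]])                        # dataset-specific part
    fit <- norm(X - (W0 + VU) %*% H[[i]], type = "F")^2
    pen <- norm(VU %*% H[[i]], type = "F")^2
    total <- total + fit + lambda * pen
  }
  total
}

obj <- uinmfObjective(E, P, W, V, U, H, lambda)
```

Setting `lambda = 0` in the call drops the penalty term, so the result can only stay the same or decrease.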
The factorization produces a shared $W$ matrix (genes by k). For each
dataset, it also produces an $H$ matrix (k by cells), a $V$ matrix
(genes by k) and a $U$ matrix (unshared genes by k). The $H$ matrices
represent the cell factor loadings. $W$ is held consistent among all
datasets, as it represents the shared components of the metagenes across
datasets. The $V$ matrices represent the dataset-specific components of the
metagenes, and the $U$ matrices are similar to the $V$ matrices but
represent the loadings contributed by unshared features.
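To make the roles of the four matrices concrete, the following base R sketch assembles the full metagene matrix for one dataset by adding the shared W to the dataset-specific V and stacking the unshared-feature loadings U underneath. The feature names (`gene1`, `peak1`, ...) and sizes are invented for illustration, and the matrices are random rather than the output of runUINMF.

```r
# Toy assembly of per-dataset metagenes; names and sizes are hypothetical.
set.seed(1)
m <- 5; u <- 3; k <- 2; n <- 4
shared   <- paste0("gene", seq_len(m))
unshared <- paste0("peak", seq_len(u))

W <- matrix(runif(m * k), m, k, dimnames = list(shared, NULL))    # shared
V <- matrix(runif(m * k), m, k, dimnames = list(shared, NULL))    # dataset-specific
U <- matrix(runif(u * k), u, k, dimnames = list(unshared, NULL))  # unshared features
H <- matrix(runif(k * n), k, n)                                   # cell factor loadings

# Full metagenes for this dataset: shared rows use W + V, unshared rows use U
metagenes <- rbind(W + V, U)          # (m + u) x k

# Low-rank approximation of the stacked (shared + unshared) data
approxData <- metagenes %*% H         # (m + u) x n
```

The first `m` rows of `approxData` approximate the shared-feature data and the remaining `u` rows approximate the unshared-feature data.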
This function uses a highly optimized, fast and memory-efficient
implementation extended from Planc (Kannan, 2016). The extension package
RcppPlanc must be installed beforehand. The underlying algorithm uses the
same ANLS strategy as optimizeALS(unshared = TRUE) in the old version of
LIGER.
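Because RcppPlanc must be present before the factorization can run, a script may want to check for it up front. The snippet below is a minimal sketch using base R's `requireNamespace`; the variable name `hasPlanc` and the message text are arbitrary choices for this example.

```r
# Check whether the RcppPlanc dependency is available before calling runUINMF().
hasPlanc <- requireNamespace("RcppPlanc", quietly = TRUE)
if (!hasPlanc) {
  message("RcppPlanc is not installed; install it before calling runUINMF().")
}
```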
Usage
runUINMF(object, k = 20, lambda = 5, ...)
## S3 method for class 'liger'
runUINMF(
  object,
  k = 20,
  lambda = 5,
  nIteration = 30,
  nRandomStarts = 1,
  seed = 1,
  nCores = 2L,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)
Arguments
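object |
liger object. Should run selectGenes with unshared datasets specified
(useUnsharedDatasets), followed by scaleNotCenter, in advance. |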
k |
Inner dimension of factorization (number of factors). Generally, a
higher k is needed for datasets with more substructure. Default 20. |
lambda |
Regularization parameter. Larger values penalize
dataset-specific effects more strongly (i.e. alignment should increase as
lambda increases). Default 5. |
... |
Arguments passed to other methods and wrapped functions. |
nIteration |
Total number of block coordinate descent iterations to
perform. Default 30. |
nRandomStarts |
Number of restarts to perform (the iNMF objective function
is non-convex, so taking the best objective from multiple successive
initializations is recommended). For easier reproducibility, this increments
the random seed by 1 for each consecutive restart, so a future factorization
of the same dataset can be run with a single restart if necessary. Default 1. |
seed |
Random seed to allow reproducible results. Default 1. |
nCores |
The number of parallel tasks to speed up the computation.
Default 2L. |
verbose |
Logical. Whether to show progress information. Default
getOption("ligerVerbose", TRUE). |
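The restart behaviour described for nRandomStarts and seed can be sketched as follows: each restart uses the base seed plus its zero-based index, the fit with the lowest objective is kept, and that fit can later be reproduced with a single start from the winning seed. `toyNMF` and `multiStart` are hypothetical stand-ins written for this example; `toyNMF` uses plain NMF with multiplicative updates rather than the ANLS solver behind runUINMF, so only the seeding scheme is meant to carry over.

```r
# Toy multi-start scheme; `toyNMF` is a hypothetical stand-in, not runUINMF().
toyNMF <- function(X, k, nIteration, seed) {
  set.seed(seed)
  W <- matrix(runif(nrow(X) * k), nrow(X), k)
  H <- matrix(runif(k * ncol(X)), k, ncol(X))
  eps <- 1e-10  # guards against division by zero
  for (it in seq_len(nIteration)) {
    # Multiplicative updates for plain NMF (used only for illustration)
    H <- H * (t(W) %*% X) / (t(W) %*% W %*% H + eps)
    W <- W * (X %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H, seed = seed,
       objective = norm(X - W %*% H, type = "F")^2)
}

multiStart <- function(X, k = 2, nIteration = 30, nRandomStarts = 3, seed = 1) {
  best <- NULL
  for (i in seq_len(nRandomStarts)) {
    # Restart i uses seed, seed + 1, seed + 2, ...
    fit <- toyNMF(X, k, nIteration, seed = seed + i - 1)
    if (is.null(best) || fit$objective < best$objective) best <- fit
  }
  best
}

set.seed(42)
X <- matrix(runif(8 * 6), 8, 6)
best <- multiStart(X, nRandomStarts = 3, seed = 1)

# A single start from the winning seed reproduces the same fit.
again <- multiStart(X, nRandomStarts = 1, seed = best$seed)
```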
Value
liger method - Returns updated input liger object.
A list of all $H$ matrices can be accessed with getMatrix(object, "H")
A list of all $V$ matrices can be accessed with getMatrix(object, "V")
The $W$ matrix can be accessed with getMatrix(object, "W")
A list of all $U$ matrices can be accessed with getMatrix(object, "U")
Note
Currently, a Seurat S3 method is not supported for UINMF because there is no simple way to organize the several miscellaneous matrices within a single Seurat object. We strongly recommend that users create a liger object, which has the required structure.
References
April R. Kriebel and Joshua D. Welch, UINMF performs mosaic integration of single-cell multi-omic datasets using nonnegative matrix factorization, Nat. Comm., 2022
Examples
pbmc <- normalize(pbmc)
pbmc <- selectGenes(pbmc, useUnsharedDatasets = c("ctrl", "stim"))
pbmc <- scaleNotCenter(pbmc)
if (!is.null(getMatrix(pbmc, "scaleUnsharedData", "ctrl")) &&
    !is.null(getMatrix(pbmc, "scaleUnsharedData", "stim"))) {
    # TODO: unshared variable features cannot be detected from this example
    pbmc <- runUINMF(pbmc)
}