KSoftImpute {SignacX} | R Documentation
KNN-based imputation
Description
KSoftImpute is an ultra-fast method for imputing missing gene expression values in single-cell data. It uses k-nearest neighbors to impute the expression of each gene in a cell as the weighted average of its value in that cell and in the cell's first-degree neighbors. Weights for imputation are determined by the number of detected genes. The method handles large data sets (>100,000 cells) in under a minute.
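The weighted neighbor average described above can be illustrated on a toy matrix. This is only a sketch under stated assumptions, not the SignacX implementation: the adjacency matrix `A`, the helper names `A_self`, `w` and `W`, and the choice to weight each neighbor by its own number of detected genes are all hypothetical and chosen for illustration.

```r
# Toy illustration only; this is NOT the code used inside KSoftImpute.
# E: genes x cells count matrix (2 genes, 3 cells).
E <- matrix(c(0, 2, 4,
              1, 0, 3),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("g1", "g2"), NULL))

# A: cells x cells adjacency, 1 marks a first-degree neighbor (hypothetical graph).
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0),
            nrow = 3, byrow = TRUE)

# Each cell is part of its own neighborhood.
A_self <- A + diag(ncol(E))

# Assumed weight per cell: number of detected genes (genes with nonzero counts).
w <- colSums(E > 0)

# Entry [i, j] of W is the weight of cell i when imputing cell j.
W <- A_self * w                      # scales row i by w[i]
W <- sweep(W, 2, colSums(W), "/")    # each column sums to 1

# Imputed matrix: each column is a weighted average over that cell's neighborhood.
Z <- E %*% W
print(Z)
```

In this sketch, zeros in `E` are replaced by values borrowed from neighboring cells, and cells with more detected genes contribute more to the average.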
Usage
KSoftImpute(E, dM = NULL, genes.to.use = NULL, verbose = FALSE)
Arguments
E | A gene-by-sample count matrix (sparse matrix or matrix) with genes identified by their HUGO symbols.
dM | A distance matrix; see ?CID.GetDistMat. Default is NULL.
genes.to.use | A character vector of genes to impute. Default is NULL.
verbose | If TRUE, the function reports its output as it runs. Default is FALSE.
Value
An expression matrix (sparse matrix) with imputed values.
See Also
Signac and SignacFast
Examples
## Not run:
# download single cell data for classification
file.dir <- "https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_1k_v3/"
file <- "pbmc_1k_v3_filtered_feature_bc_matrix.h5"
# mode = "wb" keeps the binary HDF5 file intact on Windows
download.file(paste0(file.dir, file), "Ex.h5", mode = "wb")
# load data, process with Seurat
library(Seurat)
library(SignacX)
E <- Read10X_h5(filename = "Ex.h5")
pbmc <- CreateSeuratObject(counts = E, project = "pbmc")
# run Seurat pipeline
pbmc <- SCTransform(pbmc, verbose = FALSE)
pbmc <- RunPCA(pbmc, verbose = FALSE)
pbmc <- RunUMAP(pbmc, dims = 1:30, verbose = FALSE)
pbmc <- FindNeighbors(pbmc, dims = 1:30, verbose = FALSE)
# get edges from default assay from Seurat object
default.assay <- Seurat::DefaultAssay(pbmc)
edges <- pbmc@graphs[[which(grepl(paste0(default.assay, "_nn"), names(pbmc@graphs)))]]
# get distance matrix
dM <- CID.GetDistMat(edges)
# run imputation
Z <- KSoftImpute(E = E, dM = dM, verbose = TRUE)
## End(Not run)
[Package SignacX version 2.2.5 Index]