scGate {scGate} | R Documentation |
Filter single-cell data by cell type
Description
Apply scGate to filter specific cell types in a query dataset
Usage
scGate(
data,
model,
pos.thr = 0.2,
neg.thr = 0.2,
assay = NULL,
slot = "data",
ncores = 1,
BPPARAM = NULL,
seed = 123,
keep.ranks = FALSE,
reduction = c("calculate", "pca", "umap", "harmony"),
min.cells = 30,
nfeatures = 2000,
pca.dim = 30,
param_decay = 0.25,
maxRank = 1500,
output.col.name = "is.pure",
k.param = 30,
smooth.decay = 0.1,
smooth.up.only = FALSE,
genes.blacklist = "default",
return.CellOntology = TRUE,
multi.asNA = FALSE,
additional.signatures = NULL,
save.levels = FALSE,
verbose = FALSE,
progressbar = T
)
Arguments
data |
Seurat object containing a query data set - filtering will be applied to this object |
model |
A single scGate model, or a list of scGate models. See Details for this format |
pos.thr |
Minimum UCell score value for positive signatures |
neg.thr |
Maximum UCell score value for negative signatures |
assay |
Seurat assay to use |
slot |
Data slot in Seurat object to calculate UCell scores |
ncores |
Number of processors for parallel processing |
BPPARAM |
A BiocParallel::bpparam() object that tells scGate how to parallelize. If provided, it overrides the 'ncores' parameter. |
seed |
Integer seed for random number generator |
keep.ranks |
Store UCell rankings in the Seurat object. This speeds up calculations if the same object is scored again with new signatures. |
reduction |
Dimensionality reduction to use for knn smoothing. By default, calculates a new reduction
based on the given |
min.cells |
Minimum number of cells to cluster or define cell types |
nfeatures |
Number of variable genes for dimensionality reduction |
pca.dim |
Number of principal components for dimensionality reduction |
param_decay |
Controls decrease in parameter complexity at each iteration, between 0 and 1. |
maxRank |
Maximum number of genes that UCell will rank per cell |
output.col.name |
Column name with 'pure/impure' annotation |
k.param |
Number of nearest neighbors for knn smoothing |
smooth.decay |
Decay parameter for knn weights: (1-decay)^n |
smooth.up.only |
If TRUE, only let smoothing increase signature scores |
genes.blacklist |
Genes blacklisted from variable features. The default loads the list of genes in |
return.CellOntology |
If TRUE, the Cell Ontology name and ID are returned as additional metadata columns when running multiple models. |
multi.asNA |
How to label cells that are "Pure" for multiple annotations: "Multi" (FALSE) or NA (TRUE) |
additional.signatures |
A list of additional signatures, not included in the model, to be evaluated (e.g. a cycling signature). The scores for this list of signatures will be returned but not used for filtering. |
save.levels |
Whether to save in metadata the filtering output for each gating model level |
verbose |
Verbose output |
progressbar |
Whether to show a progressbar or not |
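As a rough illustration of the pos.thr, neg.thr and smooth.decay descriptions above, the following base R sketch applies the two thresholds to made-up scores and computes neighbor weights of the form (1-decay)^n. The helper names knn_weights and gate_cells, the score values, the inclusive comparisons, and the reading of n as the neighbor rank starting at 0 are all assumptions made for this example; it is not scGate's internal implementation.

```r
# Hypothetical helper: weights (1 - decay)^n for neighbors ranked n = 0, ..., k - 1
# (assumption: n is the neighbor rank, with the closest neighbor at n = 0)
knn_weights <- function(k, decay = 0.1) {
  (1 - decay)^(seq_len(k) - 1)
}

# Hypothetical helper: a cell passes when its positive-signature score reaches
# pos.thr and its negative-signature score stays at or below neg.thr
gate_cells <- function(pos.score, neg.score, pos.thr = 0.2, neg.thr = 0.2) {
  ifelse(pos.score >= pos.thr & neg.score <= neg.thr, "Pure", "Impure")
}

# Weights for three neighbors with the default decay
w <- knn_weights(k = 3, decay = 0.1)

# Made-up signature scores for four cells
pos.score <- c(0.45, 0.10, 0.30, 0.05)
neg.score <- c(0.05, 0.00, 0.35, 0.40)
calls <- gate_cells(pos.score, neg.score)
print(w)
print(calls)
```

In this toy input only the first cell passes: the second has too low a positive score, and the last two exceed the negative threshold.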
Details
Models for scGate are data frames where each line is a signature for a given filtering level.
A database of models can be downloaded using the function get_scGateDB.
You may directly use the models from the database, or edit one of these models to generate your own custom gating model.
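To make the "one signature per line, per filtering level" layout concrete, here is a small base R sketch of what such a data frame might look like. The column names (levels, use_as, name, signature), the ";" separator between genes, and the signatures other than MS4A1 are assumptions chosen for illustration; inspect a model returned by gating_model or get_scGateDB for the actual structure.

```r
# Illustrative layout only: column names and the ";" separator are assumptions
toy_model <- data.frame(
  levels    = c("level1", "level2", "level2"),
  use_as    = c("positive", "positive", "negative"),
  name      = c("Immune", "Bcell", "Tcell"),
  signature = c("PTPRC", "MS4A1;CD79A", "CD3E;CD3D"),
  stringsAsFactors = FALSE
)

# Each row is one signature; group signature names by their filtering level
sigs_by_level <- split(toy_model$name, toy_model$levels)

# Split each signature string into its individual genes
genes <- strsplit(toy_model$signature, ";", fixed = TRUE)

print(toy_model)
print(sigs_by_level)
```

Editing a model under this assumed layout amounts to ordinary data frame manipulation, such as adding a row for a new signature or changing the genes in an existing one.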
Multiple models can also be evaluated at once by running scGate with a list of models. Gating for each individual model is
returned as metadata, with a consensus annotation stored in the scGate_multi metadata field. This allows using scGate as a
multi-class classifier: cells that are "Pure" for a single model are assigned that model's label, cells that are "Pure" for
more than one gating model are labeled "Multi", and all other cells are annotated as NA.
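The consensus rule described here can be sketched in base R as follows. The helper consensus_label and the logical matrix of per-model results are hypothetical stand-ins for scGate's own metadata columns; only the labeling logic (single model, "Multi", NA, and the multi.asNA switch) is taken from this page.

```r
# Hypothetical per-model gating results: TRUE means the cell is "Pure" for that model
pure <- cbind(
  Bcell = c(TRUE,  FALSE, TRUE,  FALSE),
  Tcell = c(FALSE, TRUE,  TRUE,  FALSE)
)
rownames(pure) <- paste0("cell", 1:4)

# Hypothetical helper mimicking the consensus rule described in the text
consensus_label <- function(pure, multi.asNA = FALSE) {
  n.pure <- rowSums(pure)
  # For rows with exactly one TRUE, this product equals that column's index
  idx <- as.vector((pure * 1) %*% seq_len(ncol(pure)))
  label <- rep(NA_character_, nrow(pure))
  single <- n.pure == 1
  # Cells pure for exactly one model take that model's name
  label[single] <- colnames(pure)[idx[single]]
  # Cells pure for several models become "Multi", or stay NA when multi.asNA = TRUE
  if (!multi.asNA) label[n.pure > 1] <- "Multi"
  names(label) <- rownames(pure)
  label
}

labels <- consensus_label(pure)
labels.na <- consensus_label(pure, multi.asNA = TRUE)
print(labels)
print(labels.na)
```

With multi.asNA = TRUE, the third cell (pure for both models) is left as NA instead of being labeled "Multi", which makes ambiguous cells indistinguishable from unassigned ones in downstream tables.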
Value
A new metadata column is.pure (or the name given in output.col.name) is added to the query Seurat object, indicating which cells passed the scGate filter.
The active.ident is also set to this variable.
See Also
load_scGate_model
get_scGateDB
plot_tree
Examples
### Test using a small toy set
data(query.seurat)
# Define basic gating model for B cells
my_scGate_model <- gating_model(name = "Bcell", signature = c("MS4A1"))
query.seurat <- scGate(query.seurat, model = my_scGate_model, reduction="pca")
table(query.seurat$is.pure)
### Test with larger datasets
library(Seurat)
testing.datasets <- get_testing_data(version = 'hsa.latest')
seurat_object <- testing.datasets[["JerbyArnon"]]
# Download pre-defined models
models <- get_scGateDB()
seurat_object <- scGate(seurat_object, model=models$human$generic$PanBcell)
DimPlot(seurat_object)
seurat_object_filtered <- subset(seurat_object, subset=is.pure=="Pure")
### Run multiple models at once
models <- get_scGateDB()
model.list <- list("Bcell" = models$human$generic$Bcell,
                   "Tcell" = models$human$generic$Tcell)
seurat_object <- scGate(seurat_object, model=model.list)
DimPlot(seurat_object, group.by = "scGate_multi")