Signac {SignacX}    R Documentation

Classification of cellular phenotypes in single cell data

Description

Signac trains and then uses an ensemble of neural networks to classify cellular phenotypes using an expression matrix or Seurat object. The neural networks are trained with the HPCA training data using only features that are present in both the single cell and HPCA training data set. Signac returns annotations at each level of the classification hierarchy, which are then converted into cell type labels using GenerateLabels. For a faster alternative, try SignacFast, which uses pre-computed neural network models.
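The feature-matching step described above can be sketched in base R. This is only an illustration of restricting two matrices to their shared genes, not SignacX's internal code; the matrices `ref` and `query`, their values, and the gene lists are assumptions made up for the example.

```r
# Toy reference (stand-in for the HPCA training data) and query (single cell)
# matrices with genes in rows. All names and values are illustrative.
ref <- matrix(1:12, nrow = 4,
              dimnames = list(c("CD3E", "CD19", "CD14", "NKG7"),
                              paste0("ref", 1:3)))
query <- matrix(1:6, nrow = 3,
                dimnames = list(c("CD14", "CD3E", "MS4A1"),
                                paste0("cell", 1:2)))

# Keep only the genes present in both data sets, in the same row order
shared <- intersect(rownames(ref), rownames(query))
ref_shared <- ref[shared, , drop = FALSE]
query_shared <- query[shared, , drop = FALSE]

print(shared)
```

After this step both matrices have identical row names, so a model trained on `ref_shared` sees the same features it will later be given from `query_shared`.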

Usage

Signac(
  E,
  R = "default",
  spring.dir = NULL,
  N = 100,
  num.cores = 1,
  threshold = 0,
  smooth = TRUE,
  impute = TRUE,
  verbose = TRUE,
  do.normalize = TRUE,
  return.probability = FALSE,
  hidden = 1,
  set.seed = TRUE,
  seed = "42",
  graph.used = "nn"
)

Arguments

E

a sparse gene (rows) by cell (column) matrix, or a Seurat object. Rows are HUGO symbols.

R

Reference data. If 'default', R is set to GetTrainingData_HPCA().

spring.dir

If using SPRING, directory to categorical_coloring_data.json. Default is NULL.

N

Number of machine learning models to train (for nn and svm). Default is 100.

num.cores

Number of cores to use. Default is 1.

threshold

Probability threshold for assigning cells to "Unclassified". Default is 0.
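As a rough illustration of what a probability threshold does, the base R sketch below labels a cell "Unclassified" when its highest class probability falls below a cutoff. The probability matrix, the label names, the cutoff of 0.6, and the strict "below" comparison are all assumptions for the example; how SignacX applies the threshold internally may differ.

```r
# Toy class probabilities for three cells (rows) over two labels (columns).
# Values and label names are illustrative.
probs <- matrix(c(0.90, 0.10,
                  0.55, 0.45,
                  0.20, 0.80),
                ncol = 2, byrow = TRUE,
                dimnames = list(paste0("cell", 1:3), c("Immune", "NonImmune")))
threshold <- 0.6  # hypothetical cutoff; the documented default is 0

# Most probable label per cell and its probability
best <- colnames(probs)[max.col(probs, ties.method = "first")]
top_prob <- apply(probs, 1, max)

# Cells whose top probability is below the cutoff are left unclassified
labels <- ifelse(top_prob < threshold, "Unclassified", best)
print(labels)
```

Under this reading, a threshold of 0 never triggers the rule, because no probability is below 0.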

smooth

if TRUE, smooths the cell type classifications. Default is TRUE.

impute

if TRUE, gene expression values are imputed prior to cell type classification (see KSoftImpute). Default is TRUE.

verbose

if TRUE, progress messages are printed. Default is TRUE.

do.normalize

if TRUE, cells are normalized to the mean library size. Default is TRUE.
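Normalizing cells to the mean library size can be sketched in base R as follows. This is a minimal illustration on a made-up count matrix, assuming "library size" means the total counts per cell; it is not taken from the SignacX source.

```r
# Toy count matrix: genes (rows) by cells (columns); values are illustrative.
counts <- matrix(c(1, 3,
                   2, 6,
                   1, 3),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), c("cellA", "cellB")))

lib_size <- colSums(counts)   # total counts per cell
target <- mean(lib_size)      # mean library size across cells

# Rescale each column so that every cell sums to the mean library size
normalized <- sweep(counts, 2, target / lib_size, `*`)
print(colSums(normalized))
```

Each cell keeps its relative gene proportions, while differences in sequencing depth between cells are removed.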

return.probability

if TRUE, returns the probability associated with each cell type label. Default is FALSE.

hidden

Number of hidden layers in the neural network. Default is 1.

set.seed

If TRUE, the random seed is set so that results are reproducible. Default is TRUE.

seed

Seed value used when set.seed is TRUE. Default is "42".
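The effect of fixing a seed can be shown with base R's random number generator. The sketch below is generic R behaviour rather than SignacX code; the variable names are hypothetical.

```r
seed_value <- 42  # same value as the documented default seed

# Two draws made after setting the same seed are identical
set.seed(seed_value)
first_run <- runif(3)
set.seed(seed_value)
second_run <- runif(3)

# Without resetting the seed, the generator state has advanced,
# so a third draw differs from the first two
third_run <- runif(3)

print(identical(first_run, second_run))
```

Any step that relies on random initialization, such as neural network training, gives repeatable results only when the seed is fixed in this way before it runs.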

graph.used

Graph to use when E is a Seurat object. By default ("nn"), Signac uses the nearest neighbor graph in the graphs field of the Seurat object. Other options are "wnn" for weighted nearest neighbors and "snn" for shared nearest neighbors.

Value

A list of character vectors: cell type annotations (L1, L2, ...) at each level of the hierarchy as well as 'clusters' for the Louvain clustering results.

See Also

SignacFast, a faster alternative that only differs from Signac in nuanced T cell phenotypes.

Examples

## Not run: 
# download single cell data for classification
file.dir = "https://cf.10xgenomics.com/samples/cell-exp/3.0.0/pbmc_1k_v3/"
file = "pbmc_1k_v3_filtered_feature_bc_matrix.h5"
download.file(paste0(file.dir, file), "Ex.h5", mode = "wb")  # binary mode, needed on Windows

# load data, process with Seurat
library(Seurat)
E = Read10X_h5(filename = "Ex.h5")
pbmc <- CreateSeuratObject(counts = E, project = "pbmc")

# run Seurat pipeline
pbmc <- SCTransform(pbmc, verbose = FALSE)
pbmc <- RunPCA(pbmc, verbose = FALSE)
pbmc <- RunUMAP(pbmc, dims = 1:30, verbose = FALSE)
pbmc <- FindNeighbors(pbmc, dims = 1:30, verbose = FALSE)

# classify cells
labels = Signac(E = pbmc)
celltypes = GenerateLabels(labels, E = pbmc)

# add labels to Seurat object, visualize
pbmc <- Seurat::AddMetaData(pbmc, metadata = celltypes$CellTypes_novel, col.name = "immune")
pbmc <- Seurat::SetIdent(pbmc, value = "immune")
DimPlot(pbmc)

# save results
saveRDS(pbmc, "example_pbmcs.rds")

## End(Not run)

[Package SignacX version 2.2.5 Index]