selectTopFeatures {CytoSimplex}    R Documentation

Pick top differentially presented features for similarity calculation

Description

Performs a Wilcoxon rank-sum test on the input matrix. clusterVar and vertices together define the groups of cells that serve as terminals of the simplex, and this function tests each of these groups against the rest of the cells. For each feature and group, the U statistic (statistic), p-value (pval) and adjusted p-value (padj) are calculated, together with the average presence in group (avgExpr), log fold-change (logFC), AUC (auc), percentage in group (pct_in) and percentage out of group (pct_out). Set returnStats = TRUE to return the full statistics table.
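The one-group-against-the-rest test described above can be sketched in base R. This is only an illustration on toy data, not the package's internal implementation: the helper oneVsRest, the toy matrix, and the choice of Benjamini-Hochberg adjustment are all assumptions made for the example.

```r
# Toy data: 3 features x 8 cells in two clusters (all names hypothetical).
mat <- rbind(
  geneA = c(5, 6, 7, 8, 0, 1, 0, 1),  # higher in cluster "a"
  geneB = c(0, 1, 0, 1, 5, 6, 7, 8),  # higher in cluster "b"
  geneC = c(1, 2, 3, 4, 1, 2, 3, 4)   # same distribution in both
)
cluster <- factor(rep(c("a", "b"), each = 4))

# Hypothetical helper: test one group against all remaining cells.
oneVsRest <- function(mat, cluster, group) {
  inGroup <- cluster == group
  res <- lapply(rownames(mat), function(f) {
    x <- mat[f, inGroup]
    y <- mat[f, !inGroup]
    # Tied values trigger a normal-approximation warning; silence it here.
    wt <- suppressWarnings(wilcox.test(x, y))
    u <- unname(wt$statistic)
    data.frame(
      feature = f,
      group = group,
      statistic = u,
      pval = wt$p.value,
      auc = u / (length(x) * length(y)),  # U scaled to [0, 1]
      stringsAsFactors = FALSE
    )
  })
  res <- do.call(rbind, res)
  # Assumption: Benjamini-Hochberg; the package may adjust differently.
  res$padj <- p.adjust(res$pval, method = "BH")
  res
}

statsA <- oneVsRest(mat, cluster, "a")
```

Here the AUC is the U statistic divided by the number of in-group/out-group cell pairs, so a feature that is always higher in the group gets an AUC of 1 and a feature with no difference gets 0.5.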

Top features are selected by sorting primarily on adjusted p-value, and secondarily on log fold-change, after filtering for up-regulated features.
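The selection rule above can be illustrated on a hand-made statistics table. This is a rough sketch under stated assumptions: pickTop is a hypothetical helper, the feature and group columns are assumed for the example, "up-regulated" is taken to mean logFC strictly greater than lfcThresh, and ties on adjusted p-value are broken by larger log fold-change first.

```r
# Hypothetical statistics table; padj and logFC follow the names in the Description.
stats <- data.frame(
  feature = c("g1", "g2", "g3", "g4", "g1", "g2", "g3", "g4"),
  group   = rep(c("OS", "RE"), each = 4),
  logFC   = c(1.2, 0.8, -0.5, 0.05, -1.0, 0.3, 0.9, 0.6),
  padj    = c(0.01, 0.01, 0.001, 0.2, 0.01, 0.04, 0.02, 0.02),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the described filter-then-sort selection.
pickTop <- function(stats, nTop = 2, lfcThresh = 0.1) {
  # Keep up-regulated features only (assumed: strictly above the threshold).
  up <- stats[stats$logFC > lfcThresh, ]
  # Sort by adjusted p-value first, then by decreasing log fold-change.
  up <- up[order(up$group, up$padj, -up$logFC), ]
  # Take at most nTop features from each group.
  perGroup <- lapply(split(up, up$group), function(d) head(d$feature, nTop))
  unlist(perGroup, use.names = FALSE)
}

selected <- pickTop(stats)
```

Because filtering happens before the top nTop are taken, a group with few up-regulated features contributes fewer than nTop names, which is why the result length is an upper bound rather than a fixed size.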

Usage

selectTopFeatures(x, clusterVar, vertices, ...)

## Default S3 method:
selectTopFeatures(
  x,
  clusterVar,
  vertices,
  nTop = 30,
  processed = FALSE,
  lfcThresh = 0.1,
  returnStats = FALSE,
  ...
)

## S3 method for class 'Seurat'
selectTopFeatures(
  x,
  clusterVar = NULL,
  vertices,
  assay = NULL,
  layer = "counts",
  processed = FALSE,
  ...
)

## S3 method for class 'SingleCellExperiment'
selectTopFeatures(
  x,
  clusterVar = NULL,
  vertices,
  assay.type = "counts",
  processed = FALSE,
  ...
)

Arguments

x

Dense or sparse matrix, one observation per column. Preferably a raw count matrix. Alternatively, a Seurat object or a SingleCellExperiment object.

clusterVar

A vector/factor assigning the cluster variable to each column of the matrix object. For "Seurat" method, NULL (default) for Idents(x), or a variable name in meta.data slot. For "SingleCellExperiment" method, NULL (default) for colLabels(x), or a variable name in colData slot.

vertices

Vector of cluster names that will be used for plotting, or a named list in which each element groups several clusters into one terminal vertex. The groups must not overlap.

...

Arguments passed to methods.

nTop

Number of top differentially presented features per terminal. Default 30.

processed

Logical. Whether the input matrix is already processed. TRUE bypasses internal preprocessing, and the input matrix is used directly for the rank-sum calculation. Default FALSE; raw count input is recommended.

lfcThresh

Threshold on log fold-change to identify up-regulated features. Default 0.1.

returnStats

Logical. Whether to return the full statistics table rather than the selected genes. Default FALSE.

assay

Assay name of the Seurat object to be used. Default NULL.

layer

For "Seurat" method, which layer of the assay to be used. Default "counts".

assay.type

Assay name of the SingleCellExperiment object to be used. Default "counts".
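As a small illustration of the named-list form of vertices described above, the base R sketch below groups clusters into vertices, checks for overlap, and maps each cell to its vertex. The cluster and vertex names are hypothetical, and the lookup logic is an assumption about how such a grouping could be resolved, not the package's own code.

```r
# Hypothetical grouping: two clusters form one vertex, a third forms another.
vertices <- list(
  vertexA = c("cluster1", "cluster2"),
  vertexB = "cluster3"
)

# A cluster listed under more than one vertex would be an overlap.
members <- unlist(vertices, use.names = FALSE)
hasOverlap <- anyDuplicated(members) > 0

# Map each cell's cluster label to its vertex; cells in no vertex get NA.
cellCluster <- c("cluster1", "cluster3", "cluster2", "cluster4")
lookup <- setNames(rep(names(vertices), lengths(vertices)), members)
cellVertex <- unname(lookup[cellCluster])
```

Cells whose cluster is not listed under any vertex (cluster4 here) end up with NA, which corresponds to "the rest of the cells" that each vertex group is tested against.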

Value

When returnStats = FALSE (default), a character vector of at most length(unique(vertices))*nTop feature names. When returnStats = TRUE, a data.frame of Wilcoxon rank-sum test statistics.

Examples

selectTopFeatures(rnaRaw, rnaCluster, c("OS", "RE"))

# Seurat example
library(Seurat)
srt <- CreateSeuratObject(rnaRaw)
Idents(srt) <- rnaCluster
gene <- selectTopFeatures(srt, vertices = c("OS", "RE"))


# SingleCellExperiment example
library(SingleCellExperiment)
sce <- SingleCellExperiment(assays = list(counts = rnaRaw))
colLabels(sce) <- rnaCluster
gene <- selectTopFeatures(sce, vertices = c("OS", "RE"))


[Package CytoSimplex version 0.1.1 Index]