R: Find DEG between two groups

runPairwiseDEG {rliger}

R Documentation

Find DEG between two groups

Description

Find DEG between two groups. Two methods are supported: "wilcoxon" and "pseudoBulk". Wilcoxon rank sum test is performed on single-cell level, while pseudo-bulk method aggregates cells basing on biological replicates and calls bulk RNAseq DE methods, DESeq2 wald test. When real biological replicates are not available, pseudo replicates can be generated. Please see below for detailed scenario usage.

Usage

runPairwiseDEG(
  object,
  groupTest,
  groupCtrl,
  variable1 = NULL,
  variable2 = NULL,
  method = c("wilcoxon", "pseudoBulk"),
  usePeak = FALSE,
  useReplicate = NULL,
  nPsdRep = 5,
  minCellPerRep = 10,
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE)
)

runMarkerDEG(
  object,
  conditionBy = NULL,
  splitBy = NULL,
  method = c("wilcoxon", "pseudoBulk"),
  useDatasets = NULL,
  usePeak = FALSE,
  useReplicate = NULL,
  nPsdRep = 5,
  minCellPerRep = 10,
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE)
)

runWilcoxon(
  object,
  data.use = NULL,
  compare.method = c("clusters", "datasets")
)

Arguments

`object`	A liger object, with normalized data available
`groupTest`, `groupCtrl`, `variable1`, `variable2`	Condition specification. See `?runPairwiseDEG` section Pairwise DEG Scenarios for detail.
`method`	DEG test method to use. Choose from `"wilcoxon"` or `"pseudoBulk"`. Default `"wilcoxon"`
`usePeak`	Logical. Whether to use peak count instead of gene count. Only supported when ATAC datasets are involved. Default `FALSE`.
`useReplicate`	`cellMeta` variable of biological replicate annotation. Only used with `method = "pseudoBulk"`. Default `NULL` will create `nPsdRep` pseudo replicates per group.
`nPsdRep`	Number of pseudo replicates to create. Only used when `method = "pseudoBulk", useReplicate = NULL`. Default `5`.
`minCellPerRep`	Numeric, will not make pseudo-bulk for replicate with less than this number of cells. Default `10`.
`seed`	Random seed to use for pseudo-replicate generation. Default `1`.
`verbose`	Logical. Whether to show information of the progress. Default `getOption("ligerVerbose")` or `TRUE` if users have not set.
`conditionBy`	`cellMeta` variable(s). Marker detection will be performed for each level of this variable. Multiple variables will be combined. Default `NULL` uses default cluster.
`splitBy`	Split data by `cellMeta` variable(s) here and identify markers for `conditionBy` within each chunk. Default `NULL`.
`useDatasets`	Datasets to perform marker detection within. Default `NULL` will use all datasets.
`data.use`	Same as `useDatasets`.
`compare.method`	Choose from `"clusters"` (default) or `"datasets"`. `"clusters"` compares each cluster against all other cells, while `"datasets"` run within each cluster and compare each dataset against all other datasets.

Value

A data.frame with DEG information

Pairwise DEG Scenarios

Users can select classes of cells from a variable in cellMeta. variable1 and variable2 are used to specify a column in cellMeta, and groupTest and groupCtrl are used to specify existing classes from variable1 and variable2, respectively. When variable2 is missing, groupCtrl will be considered from variable1.

For example, when variable1 = "celltype" and variable2 = NULL, groupTest and groupCtrl should be valid cell types in object$celltype.

When variable1 is "celltype" and variable2 is "gender", groupTest should be a valid cell type from object$celltype and groupCtrl should be a valid class from object$gender.

When both variable1 and variable2 are missing, groupTest and groupCtrl should be valid index of cells in object.

Marker Detection Scenarios

Marker detection is generally performed in a one vs. rest manner. The grouping of such condition is specified by conditionBy, which should be a column name in cellMeta. When splitBy is specified as another variable name in cellMeta, the marker detection will be iteratively done for within each level of splitBy variable.

For example, when conditionBy = "celltype" and splitBy = NULL, marker detection will be performed by comparing all cells of "celltype_i" against all other cells, and etc.

When conditionBy = "celltype" and splitBy = "gender", marker detection will be performed by comparing "celltype_i" cells from "gender_j" against other cells from "gender_j", and etc.

Examples

# Compare between cluster "0" and cluster "1"
degStats <- runPairwiseDEG(pbmcPlot, groupTest = 0, groupCtrl = 1,
                           variable1 = "leiden_cluster")
# Compare between all cells from cluster "5" and
# all cells from dataset "stim"
degStats <- runPairwiseDEG(pbmcPlot, groupTest = "5", groupCtrl = "stim",
                           variable1 = "leiden_cluster",
                           variable2 = "dataset")
# Identify markers for each cluster. Equivalent to old version
# `runWilcoxon(method = "cluster")`
markerStats <- runMarkerDEG(pbmcPlot, conditionBy = "leiden_cluster")
# Identify dataset markers within each cluster. Equivalent to old version
# `runWilcoxon(method = "dataset")`.
markerStatsList <- runMarkerDEG(pbmcPlot, conditionBy = "dataset",
                                splitBy = "leiden_cluster")

[Package rliger version 2.0.1 Index]