CompareValues {MARVEL}    R Documentation

Differential splicing and gene expression analysis

Description

Performs differential splicing and gene expression analysis between 2 groups of cells. This is a wrapper function for CompareValues.PSI and CompareValues.Exp functions.

Usage

CompareValues(
  MarvelObject,
  cell.group.g1 = NULL,
  cell.group.g2 = NULL,
  downsample = FALSE,
  seed = 1,
  min.cells = 25,
  pct.cells = NULL,
  method = NULL,
  nboots = 1000,
  n.permutations = 1000,
  method.adjust = "fdr",
  level,
  event.type = NULL,
  show.progress = TRUE,
  annotate.outliers = TRUE,
  n.cells.outliers = 10,
  assign.modality = TRUE,
  custom.gene_ids = NULL,
  psi.method = NULL,
  psi.pval = NULL,
  psi.delta = NULL,
  method.de.gene = NULL,
  method.adjust.de.gene = NULL,
  mast.method = "bayesglm",
  mast.ebayes = TRUE
)

Arguments

MarvelObject

Marvel object. S3 object generated from TransformExpValues function.

cell.group.g1

Vector of character strings. Cell IDs corresponding to Group 1 (reference group).

cell.group.g2

Vector of character strings. Cell IDs corresponding to Group 2.

downsample

Logical value. If set to TRUE, the larger cell group will be downsampled to the size of the smaller cell group so that both cell groups have the same sample size prior to differential expression analysis. Default is FALSE.

seed

Numeric value. The seed for the random number generator, which ensures reproducibility during down-sampling of cells when downsample is set to TRUE, during permutation testing when method is set to "permutation", and during modality assignment, which is performed automatically.

min.cells

Numeric value. The minimum number of cells expressing the splicing event or gene for that event or gene to be included in the differential analysis.

pct.cells

Numeric value. The minimum percentage of cells expressing the splicing event or gene for that event or gene to be included in the differential analysis. If pct.cells is specified, it is used as the threshold instead of min.cells.

method

Character string. Statistical test to compare the 2 groups of cells. Options are "ks", "kuiper", "ad", "dts", "wilcox", and "t.test" for the Kolmogorov-Smirnov, Kuiper, Anderson-Darling, DTS, Wilcoxon, and t-test, respectively. An additional "mast" option is available for differential gene expression analysis. If "mast" is specified, the log2fc and p-values will be corrected using the gene detection rate as per the MAST package tutorial.

nboots

Numeric value. Only applicable when level set to "splicing". When method set to "dts", the number of bootstrap iterations for computing the p-value.

n.permutations

Numeric value. Only applicable when level set to "splicing". When method set to "permutation", this argument indicates the number of permutations to perform for generating the null distribution for subsequent p-value inference. Default is 1000 times.

method.adjust

Character string. Adjust p-values for multiple testing. Options available as per p.adjust function.

level

Character string. Indicate "splicing" or "gene" for differential splicing or gene expression analysis, respectively.

event.type

Character string. Only applicable when level set to "splicing". Indicate which splicing event type(s) to include for analysis. Can take value "SE", "MXE", "RI", "A5SS", or "A3SS", which represent skipped-exon (SE), mutually-exclusive exons (MXE), retained-intron (RI), alternative 5' splice site (A5SS), and alternative 3' splice site (A3SS), respectively. The example below additionally passes "AFE" and "ALE" (alternative first exon and alternative last exon).

show.progress

Logical value. If set to TRUE, progress bar will be displayed so that users can estimate the time needed for differential analysis. Default value is TRUE.

annotate.outliers

Logical value. Only applicable when level set to "splicing". When set to TRUE, statistical differences in PSI values between the two cell groups that are driven by outlier cells will be annotated.

n.cells.outliers

Numeric value. Only applicable when level set to "splicing". When method set to "dts", the minimum number of cells with non-1 or non-0 PSI values for included-to-included or excluded-to-excluded modality change, respectively. The p-values will be re-coded to 1 when both cell groups have fewer than this minimum number of cells. This is to avoid false positive results.

assign.modality

Logical value. Only applicable when level set to "splicing". If set to TRUE (default), modalities will be assigned to each cell group.

custom.gene_ids

Character string. Only applicable when level set to "gene". Instead of specifying the genes to include for DE analysis with min.cells, users may input a custom vector of gene IDs to include for DE analysis.

psi.method

Vector of character string(s). Only applicable when level set to "gene.spliced" and when CompareValues function has been run with level set to "splicing" earlier. Significant events from these method(s) will be included for differential gene expression analysis.

psi.pval

Vector of numeric value(s). Only applicable when level set to "gene.spliced" and when CompareValues function has been run with level set to "splicing" earlier. The adjusted p-value below which the splicing event is considered differentially spliced, and the corresponding genes will be included for differential gene expression analysis.

psi.delta

Numeric value. Only applicable when level set to "gene.spliced" and when CompareValues function has been run with level set to "splicing" earlier. The absolute difference in mean PSI values between cell.group.g1 and cell.group.g2 above which the splicing event is considered differentially spliced, and the corresponding genes will be included for differential gene expression analysis.

method.de.gene

Character string. Only applicable when level set to "gene.spliced" and when CompareValues function has been run with level set to "splicing" earlier. Same as method.

method.adjust.de.gene

Character string. Only applicable when level set to "gene.spliced" and when CompareValues function has been run with level set to "splicing" earlier. Same as method.adjust.

mast.method

Character string. Only applicable when level set to "gene" or "gene.spliced". As per the method option of the zlm function from the MAST package. Default is "bayesglm"; other options are "glm" and "glmer".

mast.ebayes

Logical value. Only applicable when level set to "gene" or "gene.spliced". As per the ebayes option of the zlm function from the MAST package. Default is TRUE.
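
The way downsample, seed, min.cells, and pct.cells interact, as described above, can be sketched in base R. This is a minimal illustration, not MARVEL's internal code: the helper names downsample_groups and passes_threshold are hypothetical, and the sketch assumes pct.cells is expressed on a 0-100 scale.

```r
# Hypothetical helper (not part of MARVEL): shrink both groups to the size
# of the smaller one, mimicking what downsample = TRUE is described as doing.
downsample_groups <- function(g1, g2, seed = 1) {
  set.seed(seed)                      # fixed seed makes the draw reproducible
  n <- min(length(g1), length(g2))    # size of the smaller group
  list(g1 = sample(g1, n), g2 = sample(g2, n))
}

# Hypothetical helper (not part of MARVEL): decide whether a feature has
# enough expressing cells. ASSUMPTION: pct.cells is on a 0-100 scale.
passes_threshold <- function(n.expressing, n.total,
                             min.cells = 25, pct.cells = NULL) {
  if (!is.null(pct.cells)) {
    # pct.cells, when supplied, takes precedence over min.cells
    return(100 * n.expressing / n.total >= pct.cells)
  }
  n.expressing >= min.cells
}

g1 <- paste0("cellA", 1:40)
g2 <- paste0("cellB", 1:15)
groups <- downsample_groups(g1, g2, seed = 1)
lengths(groups)                            # both groups now hold 15 cells

passes_threshold(30, 100)                  # TRUE: 30 cells meets min.cells = 25
passes_threshold(30, 100, pct.cells = 50)  # FALSE: 30% is below 50%
```

Calling downsample_groups twice with the same seed returns the same cells, which is the reproducibility that the seed argument is meant to provide.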

Value

An object of class S3 with new slot MarvelObject$DE$PSI$Table[["method"]] or MarvelObject$DE$Exp$Table when level option specified as "splicing" or "gene", respectively.
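
How method.adjust, psi.pval, and psi.delta combine to flag differentially spliced events can be sketched on a toy table. The column names below (event, p.val, mean.g1, mean.g2, p.val.adj, mean.diff) are illustrative assumptions and are not necessarily those found in MarvelObject$DE$PSI$Table; the PSI values and cutoffs are made up.

```r
# Toy results table; column names are hypothetical, for illustration only.
toy <- data.frame(
  event   = c("ev1", "ev2", "ev3", "ev4"),
  p.val   = c(0.01, 0.02, 0.03, 0.50),
  mean.g1 = c(0.10, 0.50, 0.80, 0.40),
  mean.g2 = c(0.60, 0.55, 0.20, 0.90),
  stringsAsFactors = FALSE
)

# Multiple-testing adjustment, as method.adjust = "fdr" would request.
# In p.adjust, "fdr" is an alias for the Benjamini-Hochberg method.
toy$p.val.adj <- p.adjust(toy$p.val, method = "fdr")

# Difference in mean PSI. ASSUMPTION: Group 2 minus Group 1; the sign does
# not affect the filter below because the absolute value is used.
toy$mean.diff <- toy$mean.g2 - toy$mean.g1

psi.pval  <- 0.10   # adjusted p-value cutoff
psi.delta <- 0.20   # absolute mean PSI difference cutoff

# Keep events passing both cutoffs.
sig <- toy[toy$p.val.adj < psi.pval & abs(toy$mean.diff) > psi.delta, ]
sig$event
```

Here ev2 is dropped because its PSI difference is too small, and ev4 is dropped because its adjusted p-value is too large.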

Examples

marvel.demo <- readRDS(system.file("extdata/data", "marvel.demo.rds", package="MARVEL"))

# Define cell groups for analysis
df.pheno <- marvel.demo$SplicePheno
cell.group.g1 <- df.pheno[which(df.pheno$cell.type=="iPSC"), "sample.id"]
cell.group.g2 <- df.pheno[which(df.pheno$cell.type=="Endoderm"), "sample.id"]

# DE
marvel.demo <- CompareValues(MarvelObject=marvel.demo,
                             cell.group.g1=cell.group.g1,
                             cell.group.g2=cell.group.g2,
                             min.cells=5,
                             method="t.test",
                             method.adjust="fdr",
                             level="splicing",
                             event.type=c("SE", "MXE", "RI", "A5SS", "A3SS", "AFE", "ALE"),
                             show.progress=FALSE
                             )

# Check output
head(marvel.demo$DE$PSI$Table[["t.test"]])
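
A gene-level run follows the same pattern with level set to "gene". The sketch below uses only arguments listed under Usage and the output slot named under Value; it assumes the bundled demo object is suitable for gene-level analysis, and it is skipped when MARVEL is not installed.

```r
# Runs only if MARVEL is installed; otherwise the block is skipped.
if (requireNamespace("MARVEL", quietly = TRUE)) {
  marvel.demo <- readRDS(system.file("extdata/data", "marvel.demo.rds",
                                     package = "MARVEL"))

  # Define cell groups as in the splicing example
  df.pheno <- marvel.demo$SplicePheno
  cell.group.g1 <- df.pheno[which(df.pheno$cell.type == "iPSC"), "sample.id"]
  cell.group.g2 <- df.pheno[which(df.pheno$cell.type == "Endoderm"), "sample.id"]

  # Differential gene expression (ASSUMPTION: the demo object carries the
  # transformed expression values that gene-level analysis needs)
  marvel.demo <- MARVEL::CompareValues(MarvelObject = marvel.demo,
                                       cell.group.g1 = cell.group.g1,
                                       cell.group.g2 = cell.group.g2,
                                       min.cells = 5,
                                       method = "t.test",
                                       method.adjust = "fdr",
                                       level = "gene",
                                       show.progress = FALSE)

  # Gene-level results are stored in a single table, not one per method
  print(head(marvel.demo$DE$Exp$Table))
}
```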

[Package MARVEL version 1.4.0 Index]