ssGSA {ssdGSA}

R Documentation

Function to Calculate Single Sample Gene Set Scores without Direction Matrix

Description

This function calculates traditional single sample gene set scores without considering the direction of each gene.

Usage

ssGSA(
  Data,
  Gene_sets,
  GSA_weight = "equal_weighted",
  GSA_weighted_by = "sum.ES",
  GSA_method = "gsva",
  min.sz = 1,
  max.sz = 2000,
  mx.diff = TRUE
)

Arguments

`Data`	Gene expression matrix with gene IDs as row names and columns corresponding to different samples.
`Gene_sets`	A list of gene sets with gene set names as component names; each component is a vector of gene IDs.
`GSA_weight`	Method to calculate weight in GSA. By default this is set to "equal_weighted". The other option is "group_weighted".
`GSA_weighted_by`	When "group_weighted" is chosen for GSA_weight, this argument specifies how group weights are calculated. By default this is set to "sum.ES" (sum of group ES). Other options are "avg.ES" (average of group ES) and "median.ES" (median of group ES).
`GSA_method`	Method to employ in the estimation of gene-set enrichment scores per sample. By default this is set to "gsva" (Hanzelmann et al, 2013). Other options are "ssgsea" (Barbie et al, 2009), "zscore" (Lee et al, 2008), "avg.exprs" (average value of gene expressions in the gene set), and "median.exprs" (median of gene expressions in the gene set).
`min.sz`	GSVA parameter to define the minimum size of the resulting gene sets. By default this is set to 1.
`max.sz`	GSVA parameter to define the maximum size of the resulting gene sets. By default this is set to 2000.
`mx.diff`	GSVA parameter that selects one of two approaches for calculating the enrichment statistic from the KS random walk statistic. mx.diff = FALSE: the enrichment statistic is calculated as the maximum distance of the random walk from 0. mx.diff = TRUE (default): the enrichment statistic is calculated as the magnitude difference between the largest positive and negative random walk deviations.
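
The two `mx.diff` settings can be illustrated on a toy random walk. The sketch below is base R only and does not call GSVA; the `walk` values are invented, and keeping the sign of the largest deviation in the `mx.diff = FALSE` case is an assumption about how "maximum distance from 0" is reported.

```r
# Toy running-sum ("random walk") values for one gene set in one sample.
# These numbers are invented for illustration only.
walk <- c(0.10, 0.35, 0.20, -0.05, -0.30, -0.15, 0.05)

max_pos <- max(0, walk)   # largest positive deviation (0 if none)
max_neg <- min(0, walk)   # largest negative deviation (0 if none)

# mx.diff = FALSE: the deviation farthest from zero
# (assumption: its sign is kept)
es_max_dev <- if (abs(max_pos) > abs(max_neg)) max_pos else max_neg

# mx.diff = TRUE: magnitude of the largest positive deviation minus
# magnitude of the largest negative deviation
es_diff <- max_pos - abs(max_neg)

print(c(max_deviation = es_max_dev, magnitude_difference = es_diff))
```

With `mx.diff = TRUE`, a walk that deviates strongly in both directions ends up with a score near zero, whereas the maximum-deviation statistic reports only the larger of the two excursions.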

Details

Single sample directional gene set analysis inherits the standard gene set variation analysis (GSVA) method, but also provides the option to use summary statistics from any analysis (disease vs healthy, LS vs NL, etc.) as input to define the direction of gene sets used for directional gene set score calculation for a given disease or directional function. However, when directionality information is missing for genes, gene set scores from traditional single sample gene set analysis are returned.
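
A call without any direction information might look like the sketch below. The argument names follow the Usage section; the simulated expression values, the gene IDs (`gene1`, ...), and the gene set names are hypothetical, and it is an assumption that the installed package accepts such toy identifiers. The call is skipped when ssdGSA is not installed.

```r
set.seed(1)

# Hypothetical input: 30 genes x 6 samples of simulated expression values
gene_ids <- paste0("gene", 1:30)
Data <- matrix(
  rnorm(30 * 6),
  nrow = 30,
  dimnames = list(gene_ids, paste0("sample", 1:6))
)

# Hypothetical gene sets: a named list of gene ID vectors
Gene_sets <- list(
  pathway_1 = gene_ids[1:10],
  pathway_2 = gene_ids[11:25]
)

# Run only if the package is available
if (requireNamespace("ssdGSA", quietly = TRUE)) {
  scores <- ssdGSA::ssGSA(
    Data = Data,
    Gene_sets = Gene_sets,
    GSA_weight = "equal_weighted",
    GSA_method = "gsva",
    min.sz = 1,
    max.sz = 2000,
    mx.diff = TRUE
  )
  print(dim(scores))
}
```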

Value

A matrix of gene set scores (computed without considering the directionality information of each gene), with rows corresponding to gene sets and columns corresponding to samples.
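
The layout of the returned matrix can be illustrated with a base R sketch of the idea behind the "avg.exprs" and "median.exprs" options (per-sample average or median expression of the genes in each set). This is not the package's implementation: `score_sets` is a hypothetical helper, the data are invented, and silently dropping gene IDs that are absent from `Data` is an assumption.

```r
# Toy expression matrix: gene IDs as row names, samples as columns
Data <- matrix(
  c(1, 2, 3,
    4, 5, 6,
    7, 8, 9,
    2, 4, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3", "g4"), c("s1", "s2", "s3"))
)

# Named list of gene sets; "g9" is deliberately absent from Data
Gene_sets <- list(set_A = c("g1", "g2"), set_B = c("g3", "g4", "g9"))

# Hypothetical helper: summarise each gene set per sample
score_sets <- function(Data, Gene_sets, summary_fun = mean) {
  per_set <- lapply(Gene_sets, function(ids) {
    present <- intersect(ids, rownames(Data))  # assumption: drop missing IDs
    apply(Data[present, , drop = FALSE], 2, summary_fun)
  })
  do.call(rbind, per_set)  # rows = gene sets, columns = samples
}

avg_scores    <- score_sets(Data, Gene_sets, mean)
median_scores <- score_sets(Data, Gene_sets, median)
print(avg_scores)
```

The result has one row per gene set and one column per sample, matching the orientation described above.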

References

Xingpeng Li, Qi Qian. ssdGSA - Single sample direction gene set analysis tool.

Barbie, D.A. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature, 462(5):108-112, 2009.

Hanzelmann, S., Castelo, R. and Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics, 14:7, 2013.

Lee, E. et al. Inferring pathway activity toward precise disease classification. PLoS Comp Biol, 4(11):e1000217, 2008.

Tomfohr, J. et al. Pathway level analysis of gene expression using singular value decomposition. BMC Bioinformatics, 6:225, 2005.