SubSEA {psSubpathway}    R Documentation

Subtype Set Enrichment Analysis (SubSEA)

Description

The SubSEA (Subtype Set Enrichment Analysis) method mines the subpathways that are specific to each sample subtype.

Usage

SubSEA(
  expr,
  input.cls = "",
  subpathwaylist = "Symbol",
  kcdf = "Gaussian",
  method = "gsva",
  min.sz = 1,
  max.sz = Inf,
  nperm = 100,
  fdr.th = 1,
  mx.diff = TRUE,
  parallel.sz = 0
)

Arguments

expr

Matrix of gene expression values (rows are genes, columns are samples).

input.cls

Input sample class vector (phenotype) file in CLS format.
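
For orientation, the sketch below writes a small class file in base R. It assumes the three-line CLS layout used by GSEA (a counts line, a '#' line naming the classes, and one label per sample); the labels and file name are made up for illustration, and this page does not confirm which variants of the layout SubSEA accepts.

```r
# Hypothetical labels for 6 samples in 3 subtypes (illustrative only).
labels <- c("Basal", "Basal", "Her2", "Her2", "LumA", "LumA")
classes <- unique(labels)

# Assumed GSEA-style CLS layout:
#   line 1: <number of samples> <number of classes> 1
#   line 2: # <class names>
#   line 3: <one label per sample, in the column order of expr>
cls_lines <- c(
  paste(length(labels), length(classes), 1),
  paste("#", paste(classes, collapse = " ")),
  paste(labels, collapse = " ")
)

# Write the file to a temporary location (hypothetical file name).
cls_file <- file.path(tempdir(), "example_labels.cls")
writeLines(cls_lines, cls_file)
```

The resulting path could then be passed as input.cls. The Examples section also passes a list in place of a file path.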

subpathwaylist

Character string denoting the gene label of the subpathway list: 'Symbol' (the default shown in Usage) or 'Entrezid'. Users can also supply their own subpathway list data. The gene labels in this list must be consistent with those in the input gene expression profile.

kcdf

Character string denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples when method="gsva". By default, 'kcdf="Gaussian"', which is suitable when input expression values are continuous, such as microarray fluorescent units in logarithmic scale, RNA-seq log-CPMs, log-RPKMs or log-TPMs. When input expression values are integer counts, such as those derived from RNA-seq experiments, this argument should be set to 'kcdf="Poisson"'.
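
As a rough illustration of this rule, the sketch below suggests a kcdf value from the kind of values in the matrix. The helper suggest_kcdf is hypothetical and not part of psSubpathway or GSVA; it only checks whether all values are non-negative integers.

```r
# Hypothetical helper (not part of psSubpathway): suggest a kcdf value
# from the type of values found in the expression matrix.
suggest_kcdf <- function(expr) {
  values <- expr[!is.na(expr)]
  # Treat the data as counts only if every value is a non-negative integer.
  is_count <- all(values >= 0) && all(values == round(values))
  if (is_count) "Poisson" else "Gaussian"
}

set.seed(1)
log_expr <- matrix(rnorm(20), nrow = 5)                  # continuous, log-scale-like values
count_expr <- matrix(rpois(20, lambda = 10), nrow = 5)   # integer read counts

suggest_kcdf(log_expr)
suggest_kcdf(count_expr)
```

The returned string could be passed as the kcdf argument; a heuristic like this is no substitute for knowing how the input data were processed.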

method

Method to employ in the estimation of subpathway enrichment scores per sample. By default, this is set to 'gsva' (Hänzelmann et al, 2013); the other option is 'ssgsea' (Barbie et al, 2009).

min.sz

Minimum size of the resulting subpathway.

max.sz

Maximum size of the resulting subpathway.

nperm

Number of random permutations (default: 100). Users can set this value as needed; in general, 1000 or more permutations may be appropriate.

fdr.th

Cutoff value for FDR. Only subpathways with an FDR lower than fdr.th are listed (default: 1).

mx.diff

Offers two approaches to calculate the sample enrichment score (SES) from the KS random walk statistic. 'mx.diff=FALSE': SES is calculated as the maximum distance of the random walk from 0. 'mx.diff=TRUE' (default): SES is calculated as the magnitude difference between the largest positive and negative random walk deviations.
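
The two options can be sketched on a toy random walk. The helper walk_score below is hypothetical and only mirrors the description above; in particular, keeping the sign of the largest deviation when mx.diff=FALSE is an assumption, not something this page states.

```r
# Hypothetical helper illustrating the two summaries of a random walk.
walk_score <- function(walk, mx.diff = TRUE) {
  max_pos <- max(c(0, walk))   # largest positive deviation (0 if none)
  max_neg <- min(c(0, walk))   # largest negative deviation (0 if none)
  if (mx.diff) {
    # Magnitude difference between the positive and negative extremes.
    max_pos - abs(max_neg)
  } else {
    # Deviation farthest from 0, keeping its sign (assumption).
    if (max_pos > abs(max_neg)) max_pos else max_neg
  }
}

walk <- c(0.1, 0.4, -0.2, 0.3, -0.5)
walk_score(walk, mx.diff = TRUE)    # 0.4 - 0.5
walk_score(walk, mx.diff = FALSE)   # the deviation with the largest magnitude
```

With mx.diff=TRUE, a walk that swings equally far in both directions scores near 0, whereas mx.diff=FALSE reports only the single largest excursion.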

parallel.sz

Number of processors to use when doing the calculations in parallel. With the default value (parallel.sz=0), all available cores are used; set this argument to a smaller number to limit the number of cores.

Details

SubSEA

This function calculates the subpathway activity profile from the gene expression profile and the subpathway list using 'gsva' or 'ssgsea'. It then calculates the sample enrichment score (SES) of each subpathway by Subtype Set Enrichment Analysis (SubSEA). The gene labels are permuted and the SES is recomputed for the permuted data, which generates a null distribution for the SES. The P-value and the FDR value are calculated from this permutation analysis.
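
The permutation step can be sketched with simulated numbers. This is a generic illustration, not the package's implementation: the scores are random, the P-value is a one-sided empirical estimate with an add-one correction, and the Benjamini-Hochberg adjustment via p.adjust is an assumption, since this page does not state how SubSEA derives its FDR.

```r
set.seed(42)
n_spw  <- 5     # number of subpathways (illustrative)
n_perm <- 200   # number of permutations (illustrative)

# Simulated observed scores and null scores
# (rows = subpathways, columns = permutations).
observed    <- c(2.5, 0.3, -0.1, 1.8, 0.9)
null_scores <- matrix(rnorm(n_spw * n_perm), nrow = n_spw)

# Empirical one-sided P-value: share of null scores at least as large as
# the observed score for the same subpathway, with an add-one correction.
p_value <- (rowSums(null_scores >= observed) + 1) / (n_perm + 1)

# Multiple-testing adjustment (Benjamini-Hochberg, assumed here).
fdr <- p.adjust(p_value, method = "BH")

# Keep subpathways whose FDR is below a chosen threshold.
fdr_threshold <- 0.25
keep <- which(fdr < fdr_threshold)
data.frame(observed, p_value, fdr)
```

With few permutations, the smallest attainable P-value is 1 / (n_perm + 1), which limits how small the P-values and the FDR can get.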

Value

A list containing the results of the SubSEA and the subpathway activity profile.

Author(s)

Xudong Han, Junwei Han, Qingfei Kong

Examples

# load the required packages.
require(GSVA)
require(parallel)
# get breast cancer disease subtype gene expression profile.
Bregenematrix<-get("Subgenematrix")
# get path of the sample disease subtype files.
Subtypelabels<- system.file("extdata", "Sublabels.cls", package = "psSubpathway")
# SubSEA(Bregenematrix,input.cls=Subtypelabels,nperm=50,fdr.th=0.01,parallel.sz=2)
# get the result of the SubSEA function
SubSEAresult<-get("Subspwresult")
str(SubSEAresult)
head(SubSEAresult$Basal)

# Simulated gene matrix
genematrix <- matrix(rnorm(500*40), nrow=500, dimnames=list(1:500, 1:40))
# Construct subpathway list data.
subpathwaylist <- as.list(sample(2:100, size=20, replace=TRUE))
subpathwaylist <- lapply(subpathwaylist, function(n) sample(1:500, size=n, replace=FALSE))
names(subpathwaylist)<-c(paste(rep("spw",20),c(1:20)))
# Construct sample labels data.
subtypelabel<-list(phen=c("subtype1","subtype2","subtype3","subtype4"),
                   class.labes=c(rep("subtype1",10),rep("subtype2",10),
                                 rep("subtype3",10),rep("subtype4",10)))
SubSEAcs<-SubSEA(genematrix,subtypelabel,subpathwaylist,nperm=0,parallel.sz=1)



[Package psSubpathway version 0.1.3 Index]