SEMgsa {SEMgraph}    R Documentation

SEM-based gene set analysis

Description

Gene Set Analysis (GSA) via a self-contained test for group effect on signaling (directed) pathways, based on SEM. The core of the methodology is implemented in the RICF algorithm of SEMrun(). From the RICF output, SEMgsa() recovers node-specific group effect p-values and Brown's combined permutation p-values of node activation and inhibition.

Usage

SEMgsa(g = list(), data, group, method = "BH", alpha = 0.05, n_rep = 1000, ...)

Arguments

g

A list of pathways to be tested.

data

A matrix or data.frame. Rows correspond to subjects, and columns to graph nodes (variables).

group

A binary vector. This vector must be as long as the number of subjects. Each vector element must be 1 for cases and 0 for control subjects.

method

Multiple testing correction method. One of the values available in p.adjust. By default, method is set to "BH" (i.e., Benjamini-Hochberg correction).

alpha

Gene set test significance level (default = 0.05).

n_rep

Number of randomization replicates (default = 1000).

...

Currently ignored.
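
The constraints on data and group described above can be checked before calling SEMgsa(). The following is a minimal base R sketch on simulated inputs. The helper check_gsa_inputs() and the toy objects are hypothetical names introduced for illustration and are not part of SEMgraph.

```r
# Hypothetical toy inputs; names, sizes, and values are illustrative only.
set.seed(1)
n_subjects <- 20
toy_data <- matrix(rnorm(n_subjects * 5), nrow = n_subjects,
                   dimnames = list(NULL, paste0("gene", 1:5)))
toy_group <- rep(c(0, 1), each = n_subjects / 2)  # 0 = control, 1 = case

# Hypothetical helper (not a SEMgraph function): stops with an error
# if the documented constraints on 'data' and 'group' are violated.
check_gsa_inputs <- function(data, group) {
  stopifnot(is.matrix(data) || is.data.frame(data))  # subjects in rows
  stopifnot(length(group) == nrow(data))             # one label per subject
  stopifnot(all(group %in% c(0, 1)))                 # binary coding only
  invisible(TRUE)
}

check_gsa_inputs(toy_data, toy_group)
```

A check of this kind catches a group vector whose length differs from the number of rows in data before any model fitting starts.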

Details

To gain more biological insight into the functional roles of pre-defined subsets of genes, the node perturbation obtained from RICF fitting is combined with the up- or down-regulation of genes from KEGG to obtain the overall pathway perturbation as follows:

Value

A list of 2 objects:

  1. "gsa", a data.frame reporting the following information for each pathway in the input list:

    • "No.nodes", pathway size (number of nodes);

    • "No.DEGs", number of differentially expressed genes (DEGs) within the pathway, after multiple testing correction with one of the methods available in p.adjust;

    • "pert", pathway perturbation status (see Details);

    • "pNA", Brown's combined P-value of pathway node activation;

    • "pNI", Brown's combined P-value of pathway node inhibition;

    • "PVAL", Bonferroni combined P-value of pNA and pNI; i.e., 2 * min(pNA, pNI);

    • "ADJP", Bonferroni adjusted P-value of pathway perturbation; i.e., min(No.pathways * PVAL, 1).

  2. "DEG", a list with the DEG names for each pathway.
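
The arithmetic behind the "No.DEGs", "PVAL", and "ADJP" columns can be sketched in base R. All numbers below are hypothetical and are not SEMgsa() output. Counting a node as a DEG when its adjusted p-value falls below alpha is an assumption about how method and alpha are applied, not a statement of the package internals.

```r
# Hypothetical node-level p-values for one pathway (not real results).
node_p <- c(0.001, 0.004, 0.04, 0.20, 0.65)
alpha <- 0.05

# Assumed DEG rule: adjusted p-value below alpha.
adj_p <- p.adjust(node_p, method = "BH")
n_degs <- sum(adj_p < alpha)

# Hypothetical Brown's combined p-values for three pathways.
gsa <- data.frame(pNA = c(0.004, 0.30, 0.60),
                  pNI = c(0.50, 0.01, 0.45))

# PVAL = 2 * min(pNA, pNI), computed row-wise with pmin().
gsa$PVAL <- 2 * pmin(gsa$pNA, gsa$pNI)

# ADJP = min(No.pathways * PVAL, 1), with No.pathways = nrow(gsa).
gsa$ADJP <- pmin(nrow(gsa) * gsa$PVAL, 1)

gsa
```

Using pmin() rather than min() keeps the computation element-wise, so each pathway gets its own PVAL and ADJP.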

Author(s)

Mario Grassi mario.grassi@unipv.it

References

Grassi, M., Tarantino, B. SEMgsa: topology-based pathway enrichment analysis with structural equation models. BMC Bioinformatics 23, 344 (2022). https://doi.org/10.1186/s12859-022-04884-8

Examples


## Not run: 

# Nonparanormal (npn) transformation
als.npn <- transformData(alsData$exprs)$data

# Selection of FTD-ALS pathways from kegg.pathways.Rdata

paths.name <- c("MAPK signaling pathway",
                "Protein processing in endoplasmic reticulum",
                "Endocytosis",
                "Wnt signaling pathway",
                "Neurotrophin signaling pathway",
                "Amyotrophic lateral sclerosis")

j <- which(names(kegg.pathways) %in% paths.name)

GSA <- SEMgsa(kegg.pathways[j], als.npn, alsData$group,
              method = "bonferroni", alpha = 0.05,
              n_rep = 1000)
GSA$gsa
GSA$DEG


## End(Not run)


[Package SEMgraph version 1.2.1 Index]