pathEnrich {diffEnrich} | R Documentation |
pathEnrich
Description
This function takes the list generated in get_kegg as well as a vector
of NCBI (ENTREZ) geneIDs, and identifies significantly enriched KEGG pathways using
Fisher's Exact Test. Both unadjusted p-values and FDR-corrected p-values are
calculated.
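As a rough illustration of the test described above, the sketch below runs Fisher's Exact Test for a single pathway and then corrects a set of p-values with base R. All counts and the extra p-values are hypothetical, and the one-sided alternative and the layout of the 2 x 2 table are assumptions; the source does not state how pathEnrich builds its contingency table internally.

```r
# Hypothetical counts for one pathway (illustrative only, not real KEGG data).
pathway_cnt      <- 100   # genes annotated to the pathway
pathway_in_list  <- 12    # genes from the list that fall in the pathway
database_cnt     <- 8000  # genes in the KEGG database for the species
database_in_list <- 300   # genes from the list found in the database

# 2 x 2 table: rows = in gene list (yes/no), columns = in pathway (yes/no).
# matrix() fills column by column.
tab <- matrix(
  c(pathway_in_list,
    pathway_cnt - pathway_in_list,
    database_in_list - pathway_in_list,
    database_cnt - pathway_cnt - (database_in_list - pathway_in_list)),
  nrow = 2,
  dimnames = list(in_list = c("yes", "no"), in_pathway = c("yes", "no"))
)

# Assumption: a one-sided test for over-representation.
enrich_p <- fisher.test(tab, alternative = "greater")$p.value

# Correct across pathways; the other three p-values are made up.
p_all <- c(enrich_p, 0.20, 0.03, 0.70)
p_adj <- p.adjust(p_all, method = "BH")
```

Any method accepted by `p.adjust` could be substituted for `"BH"` in this sketch.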
Usage
pathEnrich(gk_obj, gene_list, method = "BH", cutoff = 0.05, N = 2)
## S3 method for class 'pathEnrich'
print(x, ...)
## S3 method for class 'pathEnrich'
summary(object, ...)
Arguments
gk_obj |
list. Object generated from |
gene_list |
Vector. Vector of NCBI (ENTREZ) geneIDs. |
method |
Character. Character string telling |
cutoff |
Numeric. The p-value threshold used as the cutoff when determining statistical significance, and used to filter the list of significant pathways. |
N |
Numeric. The number of genes from the gene list that must be present in a KEGG pathway in order for that pathway to be retained and tested. |
x |
object of class |
... |
Unused |
object |
object of class |
Details
This function may not always use the complete list of genes provided by the user.
Specifically, it will only use the genes from the list provided that are also in
the most current species list pulled from the KEGG REST API, or in the older KEGG data
loaded by the user. The 'cutoff' only filters the pathways reported in the 'sig_paths'
list item; it is not used to filter the 'enrich_table' list item. S3 methods for print
and summary are provided. The print method prints the results table as a tibble, and the
summary method returns the number of pathways that reached statistical significance
and their descriptions, the number of genes used from the KEGG database, the KEGG species,
the method used for multiple testing correction, and the p-value cutoff required for reaching statistical significance.
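The roles of 'N' and 'cutoff' described above can be mimicked on a small, made-up table. This is only a sketch of the documented behavior, not the package's code: the pathway IDs and p-values are hypothetical, and the use of `>= N` and `< cutoff` as the comparison operators is an assumption.

```r
# Hypothetical per-pathway results (IDs and p-values are invented).
enrich_table <- data.frame(
  KEGG_PATHWAY_ID      = c("path_A", "path_B", "path_C", "path_D"),
  KEGG_PATHWAY_in_list = c(12, 5, 3, 1),
  enrich_p             = c(0.0004, 0.03, 0.40, 0.01),
  stringsAsFactors     = FALSE
)

N      <- 2     # minimum number of list genes per pathway
cutoff <- 0.05  # significance threshold

# Assumption: pathways with fewer than N list genes are dropped before testing.
tested <- enrich_table[enrich_table$KEGG_PATHWAY_in_list >= N, ]
tested$p_adj <- p.adjust(tested$enrich_p, method = "BH")

# 'cutoff' filters only the significant-pathway subset;
# the full table of tested pathways is left untouched.
sig_paths <- tested[tested$p_adj < cutoff, ]
```

In this sketch `tested` plays the part of 'enrich_table' and keeps every retained pathway, while `sig_paths` holds only the rows that pass the cutoff.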
Value
A list object of class pathEnrich that contains 6 items:
- species
The species used in enrichment
- padj
The method used to correct for multiple testing
- sig_paths
The KEGG pathways that reached statistical significance after multiple testing correction.
- cutoff
The p-value threshold used as the cutoff when determining statistical significance, and used to filter the list of significant pathways (sig_paths).
- N
The number of genes from the gene list that must be present in a KEGG pathway in order for that pathway to be retained and tested.
- enrich_table
A data frame that summarizes the results of the pathway analysis and contains the following variables:
- KEGG_PATHWAY_ID
KEGG Pathway Identifier
- KEGG_PATHWAY_description
Description of KEGG Pathway (provided by KEGG)
- KEGG_PATHWAY_cnt
Number of Genes in KEGG Pathway
- KEGG_PATHWAY_in_list
Number of Genes from gene list in KEGG Pathway
- KEGG_DATABASE_cnt
Number of Genes in KEGG Database
- KEGG_DATABASE_in_list
Number of Genes from gene list in KEGG Database
- expected
Expected number of genes from list to be in KEGG pathway by chance (i.e., not enriched)
- enrich_p
P-value for enrichment of list genes related to KEGG pathway
- p_adj
False Discovery Rate (Benjamini and Hochberg) to account for multiple testing across KEGG pathways
- fold_enrichment
KEGG_PATHWAY_in_list/expected
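To make the last two columns concrete, here is a small worked calculation with hypothetical counts. The ratio for fold_enrichment follows the definition above; the formula used for 'expected' (list genes in the database times the fraction of database genes in the pathway) is an assumption, since the source does not spell it out.

```r
# Hypothetical counts, named after the enrich_table columns.
KEGG_PATHWAY_cnt      <- 100
KEGG_PATHWAY_in_list  <- 12
KEGG_DATABASE_cnt     <- 8000
KEGG_DATABASE_in_list <- 300

# Assumed formula: list genes expected in the pathway by chance.
expected <- KEGG_DATABASE_in_list * KEGG_PATHWAY_cnt / KEGG_DATABASE_cnt

# As defined in the table above.
fold_enrichment <- KEGG_PATHWAY_in_list / expected

expected         # 3.75
fold_enrichment  # 3.2
```

Under these made-up numbers, the pathway contains a little over three times as many list genes as chance alone would suggest.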
Examples
list1_pe <- pathEnrich(gk_obj = kegg, gene_list = geneLists$list1)
## Not run:
list2_pe <- pathEnrich(gk_obj = kegg, gene_list = geneLists$list2, method = 'none', N = 4)
## End(Not run)