get_kegg {diffEnrich} R Documentation

get_kegg

Description

This function calls an internal helper function that connects to the KEGG API, then downloads and stores NCBI gene ID data, KEGG pathway descriptions, and species-specific data. Currently, this function supports human, mouse, and rat. Files are written to the working directory unless the user specifies another path.

Usage

get_kegg(species, read = FALSE, path = NULL, date, release)

Arguments

species

character. The KEGG code of the species to use in the KEGG data pull ("hsa", "mmu", or "rno").

read

logical. Should get_kegg read in files from a previous call? If TRUE, all three files generated by get_kegg must be in the same directory, and the user must provide a file path that points to that directory.

path

character. A character string describing the path to which the KEGG API data sets are written. If not provided, defaults to the current working directory.

date

character. A character string describing the date that was used to time-stamp the files from a previous call. Must be formatted as YYYY-MM-DD.

release

character. A character string describing the KEGG release that was used to version the files from a previous call (e.g. "90" or "92").
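
When read = TRUE, the path, date, and release arguments identify the files from a previous call. The following is a minimal sketch of how a caller might sanity-check these values before calling get_kegg. Note that check_read_args is a hypothetical helper written for illustration; it is not part of diffEnrich, and the package may perform its own, different checks.

```r
# Hypothetical helper (not part of diffEnrich): checks the arguments that
# describe files written by a previous get_kegg() call.
check_read_args <- function(path, date, release) {
  # TRUE only for a single, non-missing character string
  is_string <- function(x) is.character(x) && length(x) == 1 && !is.na(x)

  if (!is_string(path)) {
    stop("path must be a single character string")
  }
  # The date must match YYYY-MM-DD and also be a real calendar date
  if (!is_string(date) ||
      !grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", date) ||
      is.na(as.Date(date, format = "%Y-%m-%d"))) {
    stop("date must be a valid date formatted like YYYY-MM-DD")
  }
  # The release is passed as a string such as "92", not as a number
  if (!is_string(release)) {
    stop("release must be a single character string such as \"92\"")
  }
  invisible(TRUE)
}

# Values taken from the Examples section of this page
check_read_args(path = "usr/data/out/", date = "2019-09-30", release = "92")
```

This sketch only checks the form of the arguments; it does not check that the directory or the files actually exist.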

Details

The get_kegg function connects to the KEGG REST API and downloads the data sets required for downstream analysis. Currently, this function supports three species and recognizes the KEGG codes for Homo sapiens (‘hsa’), Mus musculus (‘mmu’), and Rattus norvegicus (‘rno’). For a given species, three data sets are generated: 1) a data set that maps NCBI/ENTREZ gene IDs to KEGG gene IDs, which is needed because the user must provide their own gene lists as ENTREZ gene IDs in downstream analysis, 2) a data set that maps KEGG gene IDs to their respective KEGG pathway IDs, and 3) a data set that maps KEGG pathway IDs to their respective pathway descriptions. This function allows the user to save versioned (based on KEGG release) and time-stamped text files of the three data sets described above. In addition to these flat files, get_kegg() also creates a named list containing the three KEGG data sets. The name of each list element describes the data set it holds.

Table 1. Description of get_kegg list object

get_kegg_list_object    Object_description
ncbi_to_kegg            NCBI gene ID    <-- mapped to --> KEGG gene ID
kegg_to_pathway         KEGG gene ID    <-- mapped to --> KEGG pathway ID
pathway_to_species      KEGG pathway ID <-- mapped to --> KEGG pathway species description
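
To illustrate how the three mappings chain together, the sketch below builds a small toy list with the same element names as Table 1 and joins the tables with base R's merge(). The column names (ncbi_id, kegg_gene_id, pathway_id, description) and the toy rows are assumptions made for illustration; the column names in the data frames actually returned by get_kegg may differ.

```r
# Toy stand-in for the list returned by get_kegg(); the element names follow
# Table 1, but the column names and rows are hypothetical.
kegg_toy <- list(
  ncbi_to_kegg = data.frame(
    ncbi_id      = c("7157", "672"),
    kegg_gene_id = c("hsa:7157", "hsa:672"),
    stringsAsFactors = FALSE
  ),
  kegg_to_pathway = data.frame(
    kegg_gene_id = c("hsa:7157", "hsa:672"),
    pathway_id   = c("path:hsa04115", "path:hsa03440"),
    stringsAsFactors = FALSE
  ),
  pathway_to_species = data.frame(
    pathway_id  = c("path:hsa04115", "path:hsa03440"),
    description = c("p53 signaling pathway - Homo sapiens (human)",
                    "Homologous recombination - Homo sapiens (human)"),
    stringsAsFactors = FALSE
  )
)

# Step 1: NCBI gene ID -> KEGG gene ID -> KEGG pathway ID
gene_to_pathway <- merge(kegg_toy$ncbi_to_kegg, kegg_toy$kegg_to_pathway,
                         by = "kegg_gene_id")

# Step 2: attach the pathway description to each gene/pathway pair
annotated <- merge(gene_to_pathway, kegg_toy$pathway_to_species,
                   by = "pathway_id")

print(annotated[, c("ncbi_id", "pathway_id", "description")])
```

Joining on the shared ID columns is one simple way to go from a user's ENTREZ gene IDs to pathway descriptions; it is not necessarily how the package's downstream functions use these data sets.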

Value

kegg_out: A named list of the data pulled from the KEGG API when the function was run. The contents may differ if the function is run at different times. For reproducible results, use the text files generated by the function, which include the date on which the data were pulled.

ncbi_to_kegg

ncbi_to_kegg mappings as class data.frame

kegg_to_pathway

kegg_to_pathway mappings as class data.frame

pathway_to_species

pathway_to_species mappings as class data.frame

Examples

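## Not run: 
# Download the rat ("rno") data sets and write them to the working directory
kegg <- get_kegg(species = "rno")

## End(Not run)
## Not run: 
# Download the mouse ("mmu") data sets and write them to the given path
kegg <- get_kegg(species = "mmu", path = "usr/data/out/")
# Read the files from a previous call, identified by date and KEGG release
kegg <- get_kegg(species = "mmu", path = "usr/data/out/",
                 read = TRUE,
                 date = "2019-09-30",
                 release = "92")

## End(Not run)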


[Package diffEnrich version 0.1.2 Index]