symbol_to_entrez {dnapath}    R Documentation

Obtain entrezgene IDs for gene symbols

Description

Uses biomaRt (Durinck et al. 2009) to map gene symbols to entrezgene IDs for a given species. The output of this function can be used in rename_genes.

Usage

symbol_to_entrez(
  x,
  species,
  symbol_name = NULL,
  dir_save = tempdir(),
  verbose = TRUE
)

Arguments

x

A vector of gene symbols.

species

The species used to obtain the entrezgene IDs. For example: "Homo sapiens", "m musculus", "C. elegans", or "S cerevisiae". "Human" and "mouse" can also be used and will be converted to the correct species name.

symbol_name

The type of gene symbol to use. If NULL, then "hgnc_symbol" (HGNC symbols) is used, unless species is "mmusculus", in which case "mgi_symbol" (MGI symbols) is used.

dir_save

The directory in which to store the annotation reference. Future calls to this function will use the stored annotations. This speeds up the operation and allows for reproducibility in the event that the biomaRt database is updated. Set to NULL to disable. By default, a temporary directory is used, so stored files last only for the current R session.

verbose

Set to FALSE to avoid messages.
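
As a minimal sketch of the dir_save argument described above: pointing it at a directory that outlives the R session lets later sessions reuse the stored annotations. The directory name "dnapath_annotations" is an arbitrary choice for illustration, and the call is guarded so that it only runs when dnapath is installed. It uses only the arguments listed under Usage.

```r
# Hypothetical cache location; any writable directory that persists
# across R sessions would serve the same purpose.
cache_dir <- file.path(getwd(), "dnapath_annotations")
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)

if (requireNamespace("dnapath", quietly = TRUE)) {
  # Per the dir_save description, the annotation reference is stored in
  # cache_dir and reused by future calls.
  gene_mat <- dnapath::symbol_to_entrez(c("SOX2", "UBB"),
                                        species = "human",
                                        dir_save = cache_dir,
                                        verbose = FALSE)
}
```

Passing dir_save = NULL instead disables storage altogether.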

Details

If gene symbols are used in a dnapath_list or dnapath object, or a pathway list, then get_genes can be used to extract them, and the result can be passed as the x argument here.

Value

A data frame with two columns: the first contains the original gene symbols, and the second contains a corresponding entrezgene ID. If a gene symbol is not mapped to an entrezgene ID, the entrezgene ID is set to -1.
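
Because unmapped symbols are flagged with -1 rather than dropped, downstream code may want to separate them out. The sketch below works on a mock data frame that only imitates the documented two-column shape. The column names, gene symbols, and ID values in it are placeholders, not real biomaRt output.

```r
# Mock of the documented return shape: symbols in column 1, IDs in column 2.
# Column names and values are assumptions made purely for illustration.
gene_mat <- data.frame(
  symbol = c("GENE_A", "GENE_B", "GENE_C"),
  entrezgene_id = c(101, 202, -1),
  stringsAsFactors = FALSE
)

# Index by position, since only the column order is documented.
is_mapped <- gene_mat[[2]] != -1

# Rows with a usable ID, and the symbols that had no mapping.
mapped <- gene_mat[is_mapped, , drop = FALSE]
unmapped_symbols <- gene_mat[[1]][!is_mapped]

print(mapped)
print(unmapped_symbols)
```

Indexing by position rather than by name keeps the sketch independent of whatever column names the function actually returns.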

Note

An internet connection is required to connect to biomaRt. If one is unavailable, the default biomart and default species contained in the package are used, but these may not match the desired species.

It is assumed that x contains MGI symbols when the biomart species is "Mus musculus" and HGNC symbols otherwise.

References

Durinck S, Spellman PT, Birney E, Huber W (2009). “Mapping Identifiers for the Integration of Genomic Datasets with the R/Bioconductor Package biomaRt.” Nature Protocols, 4, 1184–1191.

See Also

entrez_to_symbol, get_genes

Examples


# Convert a set of gene symbols to entrezgene IDs.
# Note that not all symbols may have a mapping (such as "MSX" in this example).
gene_mat <- symbol_to_entrez(c("SOX2", "SEMA3E", "COL11A1", "UBB", "MSX"),
                             species = "human")


[Package dnapath version 0.7.4 Index]