common_genes {tidyestimate} | R Documentation |
Genes shared between six expression platforms
Description
As the ESTIMATE model was trained on a specific set of genes,
only those within this dataset should be included before running
estimate_scores.
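As a rough illustration of that filtering step, the sketch below keeps only the rows of an expression table whose symbol appears in a gene list. It uses base R only. The objects toy_common and expr are hypothetical stand-ins invented for this example (toy_common mimics two columns of common_genes), and NOT_A_GENE is a made-up identifier.

```r
# Hypothetical stand-in for common_genes (illustrative rows only)
toy_common <- data.frame(
  entrezgene_id = c(7157L, 1956L),
  hgnc_symbol = c("TP53", "EGFR"),
  stringsAsFactors = FALSE
)

# Hypothetical expression table: one row per gene, one column per sample
expr <- data.frame(
  hgnc_symbol = c("TP53", "EGFR", "NOT_A_GENE"),
  sample_1 = c(5.1, 7.3, 2.2),
  sample_2 = c(4.8, 6.9, 2.5),
  stringsAsFactors = FALSE
)

# Keep only genes present in the reference list
keep <- expr$hgnc_symbol %in% toy_common$hgnc_symbol
expr_filtered <- expr[keep, , drop = FALSE]

# Record what was dropped so the loss is visible
dropped <- expr$hgnc_symbol[!keep]
```

With the real dataset, toy_common would be replaced by common_genes, assuming the expression table is keyed by HGNC symbol.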
These are the genes common to 6 platforms:
- Affymetrix HG-U133Plus2.0
- Affymetrix HT-HG-U133A
- Affymetrix Human X3P
- Agilent 4x44K (G4112F)
- Agilent G4502A
- Illumina HiSeq RNA sequence
The Entrez IDs for the original 10412 genes were matched to HGNC symbols
using biomaRt. Duplicates and blank entries were filtered. As some
have now been discovered to be pseudogenes or have been deprecated, 22
genes (at time of writing, June 2021) that were in the ESTIMATE package do
not exist here.
As one gene can have multiple synonyms/aliases, and there is only one alias per line, the number of rows in the data frame (26339) does not reflect the number of unique genes in the dataset (10391).
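The difference between row count and gene count can be checked directly. The following is a minimal sketch on a toy data frame with the same three columns as common_genes; the rows in toy_genes are illustrative only and are not claimed to be taken from the dataset.

```r
# Toy stand-in: one row per alias, so each gene appears several times
toy_genes <- data.frame(
  entrezgene_id = c(7157L, 7157L, 1956L, 1956L, 1956L),
  hgnc_symbol = c("TP53", "TP53", "EGFR", "EGFR", "EGFR"),
  external_synonym = c("p53", "LFS1", "ERBB", "ERBB1", "HER1"),
  stringsAsFactors = FALSE
)

# Number of rows (one per alias)
n_rows <- nrow(toy_genes)

# Number of distinct genes, counted by Entrez ID
n_unique <- length(unique(toy_genes$entrezgene_id))
```

Here n_rows is 5 while n_unique is 2. The same two expressions applied to common_genes should give the row and gene counts quoted above.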
Usage
common_genes
Format
A data frame with 26339 rows and 3 variables:
- entrezgene_id: Entrez ID of the gene
- hgnc_symbol: Human Genome Organisation (HUGO) Gene Nomenclature Committee symbol
- external_synonym: A synonym/alias a given gene may go by or previously went by
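One possible use of the external_synonym column is mapping an outdated or alternative name back to its current symbol. The sketch below assumes a data frame with the three columns described above; resolve_alias is a hypothetical helper written for this example, not a function from the package, and toy_genes holds illustrative rows only.

```r
# Toy stand-in with the same columns as common_genes (illustrative rows)
toy_genes <- data.frame(
  entrezgene_id = c(7157L, 7157L, 1956L, 1956L, 1956L),
  hgnc_symbol = c("TP53", "TP53", "EGFR", "EGFR", "EGFR"),
  external_synonym = c("p53", "LFS1", "ERBB", "ERBB1", "HER1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the HGNC symbol(s) listed for an alias.
# %in% is used instead of == so missing synonyms never yield NA matches.
resolve_alias <- function(alias, genes) {
  unique(genes$hgnc_symbol[genes$external_synonym %in% alias])
}

her1_symbol <- resolve_alias("HER1", toy_genes)
no_match <- resolve_alias("not_an_alias", toy_genes)
```

An alias with no match returns a zero-length character vector, which callers can test with length().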
Details
The ESTIMATE model was trained on a set of genes shared between six expression profiling platforms. Those genes are listed in this dataset.