PS {canprot}    R Documentation

Retrieve Phylostrata for Given UniProt IDs


Description

Retrieves the phylostrata of protein-coding genes as assigned by Liebeskind et al. (2016) or Trigos et al. (2017).


Usage

  PS(uniprot, source = "TPPG17")



Arguments

uniprot    character, UniProt accession numbers

source     character, source of the phylostrata: "TPPG17" or "LMM16"


Details

The phylostratum for each protein is found by matching its UniProt ID in one of these data files, located in extdata/phylostrata:

TPPG17.csv.xz
This file has columns GeneID (gene name), Entrez, Entry, and Phylostrata. Except for Entry, the values are from Dataset S1 of Trigos et al. (2017). The UniProt accession numbers in Entry were generated with the UniProt mapping tool, first from Entrez and then from GeneID for the genes that remained unmatched. Entry is NA for genes that could not be matched to any protein after both mapping steps.

LMM16.csv.xz
This file has columns UniProt, modeAge, and PS. The data are from the file main_HUMAN.csv in Gene-Ages v1.0 (Liebeskind et al., 2016). The modeAge values were converted to phylostrata 1-8 (PS column) in this order: Cellular_organisms, Euk_Archaea, Euk+Bac, Eukaryota, Opisthokonta, Eumetazoa, Vertebrata, Mammalia.
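The conversion and lookup described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the toy data frame, its accession numbers, and the helper lookup_PS are hypothetical, and returning NA for unmatched IDs is an assumption that follows from how match() behaves rather than documented behavior of PS().

```r
# Ordered modeAge categories as listed above, oldest first
age_levels <- c("Cellular_organisms", "Euk_Archaea", "Euk+Bac", "Eukaryota",
                "Opisthokonta", "Eumetazoa", "Vertebrata", "Mammalia")

# Toy stand-in for the LMM16 table; accessions and ages are made up
toy <- data.frame(
  UniProt = c("A00001", "A00002", "A00003"),
  modeAge = c("Eukaryota", "Mammalia", "Cellular_organisms"),
  stringsAsFactors = FALSE
)

# Phylostratum = position of the modeAge in the ordered list (1 to 8)
toy$PS <- match(toy$modeAge, age_levels)

# Hypothetical helper: look up phylostrata by accession number;
# IDs that are absent from the table come back as NA
lookup_PS <- function(uniprot, ps_table) {
  ps_table$PS[match(uniprot, ps_table$UniProt)]
}

result <- lookup_PS(c("A00002", "A00004", "A00001"), toy)
print(result)
```

Because match() returns the position of the first hit, the result keeps the order and length of the input IDs, which is convenient when the phylostrata are attached to a table of expression data.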


References

Trigos, A. S., Pearson, R. B., Papenfuss, A. T. and Goode, D. L. (2017) Altered interactions between unicellular and multicellular genes drive hallmarks of transformation in a diverse range of solid tumors. Proc. Natl. Acad. Sci. 114, 6406–6411. doi: 10.1073/pnas.1617743114

Liebeskind, B. J., McWhite, C. D. and Marcotte, E. M. (2016) Towards consensus gene ages. Genome Biol. Evol. 8, 1812–1823. doi: 10.1093/gbe/evw113

See Also

Call get_comptab with PS_TPPG17 or PS_LMM16 as a variable name to calculate mean differences of phylostrata for differentially expressed proteins.


Examples

# Get protein expression data for one dataset
pd <- pdat_colorectal("JKMF10")
# Stack the two plots vertically, saving the old graphical parameters
opar <- par(mfrow = c(2, 1))
# Plot nH2O vs phylostrata with Trigos et al. data
get_comptab(pd, "PS_TPPG17", plot.it = TRUE)
# Plot nH2O vs phylostrata with Liebeskind et al. data
get_comptab(pd, "PS_LMM16", plot.it = TRUE)
# Restore the previous plot layout
par(opar)

# Compare the two sources for the proteins present in both
PSdir <- system.file("extdata/phylostrata", package = "canprot")
TPPG17 <- read.csv(file.path(PSdir, "TPPG17.csv.xz"))
LMM16 <- read.csv(file.path(PSdir, "LMM16.csv.xz"))
IDs <- intersect(TPPG17$Entry, LMM16$UniProt)
PS_TPPG17 <- TPPG17$Phylostrata[match(IDs, TPPG17$Entry)]
PS_LMM16 <- LMM16$PS[match(IDs, LMM16$UniProt)]
plot(jitter(PS_TPPG17), jitter(PS_LMM16), pch = ".")

[Package canprot version 1.1.0 Index]