get_comptab {canprot}    R Documentation

Calculate Compositional Differences


Compute differences of carbon oxidation state, stoichiometric hydration state and other compositional metrics between groups of up- and down-regulated proteins.


  get_comptab(pdat, var1 = "ZC", var2 = "nH2O", plot.it = FALSE,
    mfun = "median", oldstyle = FALSE, basis = getOption("basis"))



pdat        list, data object generated by a pdat_ function

var1        character, the first variable

var2        character, the second variable

plot.it     logical, make a scatterplot?

mfun        character, either "median" or "mean"

oldstyle    logical, also calculate CLES and p-values?

basis       character, keyword for basis species to use


The available variables are:

ZC          average oxidation state of carbon (ZC; see ZCAA)
nH2O        stoichiometric hydration state per residue (nH2O; see H2OAA)
nO2         stoichiometric oxidation state per residue (nO2; see O2AA)
V0          standard molal volume per residue
nAA         protein length (number of amino acids)
GRAVY       grand average of hydropathicity (see GRAVY)
pI          isoelectric point (see pI)
PS_TPPG17   phylostratum (see PS)
PS_LMM16    phylostratum (see PS)
MW          molecular weight per residue

Differentially expressed proteins are identified by the value of pdat$up2 (TRUE for up-regulated proteins and FALSE for down-regulated proteins). The differences are calculated as (median for up-regulated proteins) - (median for down-regulated proteins); if mfun is mean, means of the groups are used instead. If oldstyle is TRUE, the function also calculates the common language effect size (CLES, in percent) and p-value for each variable.
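The calculation described above can be sketched in base R without canprot. This is only an illustration under stated assumptions: the values and the up2 vector are made-up numbers, the CLES is computed empirically as the percentage of (up, down) pairs in which the up-regulated value is larger, and the p-value comes from a Wilcoxon rank-sum test. The package may compute CLES and p-values differently.

```r
# Made-up per-protein values of one variable (e.g. ZC); not real data
values <- c(-0.20, -0.10, 0.00, -0.05, 0.05, 0.10)
# TRUE = up-regulated, FALSE = down-regulated (same role as pdat$up2)
up2 <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# Look up the summary function by name, mirroring the mfun argument
mfun <- "median"
summarize <- match.fun(mfun)

down_values <- values[!up2]
up_values <- values[up2]

# Difference = (summary for up-regulated) - (summary for down-regulated)
difference <- summarize(up_values) - summarize(down_values)

# Assumption: empirical CLES in percent, i.e. the share of (up, down)
# pairs in which the up-regulated value is the larger one
cles <- 100 * mean(outer(up_values, down_values, ">"))

# Assumption: Wilcoxon rank-sum test as the source of the p-value
p_value <- wilcox.test(up_values, down_values)$p.value

print(c(difference = difference, CLES = cles, p.value = p_value))
```

Changing mfun to "mean" in this sketch switches the summary to group means, as the text describes.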

The basis argument is used to select the basis species, which are used for the calculation of nH2O and nO2. The default for getOption("basis") is to use the QEC basis species (see metrics).
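The way a default such as basis = getOption("basis") picks up a global option can be illustrated with a small base R sketch. The function show_basis is hypothetical and exists only for this illustration, and the keyword string "QEC" is an assumption based on the name of the basis species given above; the option is set by hand here because canprot is not loaded.

```r
# Hypothetical stand-in for a function whose default reads a global option
show_basis <- function(basis = getOption("basis")) basis

# Assumption: "QEC" is the keyword for the QEC basis species
old <- options(basis = "QEC")

# With no argument, the value is taken from the option
from_option <- show_basis()
# An explicit argument is used as given, without consulting the option
from_argument <- show_basis("QEC")

# Restore the previous option settings
options(old)

print(from_option)
```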

Volume is calculated using amino acid group additivity as described by Dick et al. (2006).

Phylostrata are not compositional metrics, but are retrieved by matching UniProt accession numbers in a data file (see PS). Because phylostratum numbers are discrete values, mean values are calculated regardless of the value of mfun.
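The rule that phylostrata are always summarized by the mean can be sketched as follows. The helper choose_summary is hypothetical and is not the package's code; it assumes that phylostratum variables are recognized by the "PS" prefix seen in the variable names listed above.

```r
# Hypothetical helper: pick the summary function name for a variable
choose_summary <- function(var, mfun = "median") {
  # Assumption: phylostratum variables are identified by a "PS" prefix
  if (startsWith(var, "PS")) "mean" else mfun
}

# Made-up phylostratum numbers for illustration only
ps <- c(1, 1, 2, 6)

fun_name <- choose_summary("PS_TPPG17", mfun = "median")
ps_summary <- match.fun(fun_name)(ps)

print(fun_name)
print(ps_summary)
```

With these discrete values the median (1.5) and the mean (2.5) differ, which is why the choice of summary matters for phylostrata.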

Set plot.it to TRUE to make a scatterplot. Open red squares and filled blue circles stand for up-regulated and down-regulated proteins, respectively.
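A plot using the same symbol convention can be sketched with base R graphics. The data are made up, and the choice of axes (first variable on x, second on y) is an assumption of this sketch rather than a statement about how the package draws its plot.

```r
# Made-up values of two variables for six proteins; not real data
zc <- c(-0.20, -0.10, 0.00, -0.05, 0.05, 0.10)
nh2o <- c(-0.80, -0.75, -0.70, -0.78, -0.72, -0.69)
up2 <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# pch 0 is an open square, pch 19 is a filled circle
pch <- ifelse(up2, 0, 19)
col <- ifelse(up2, "red", "blue")

plot(zc, nh2o, pch = pch, col = col, xlab = "ZC", ylab = "nH2O")
legend("topleft", legend = c("up-regulated", "down-regulated"),
       pch = c(0, 19), col = c("red", "blue"))
```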


A data frame is returned invisibly containing the columns dataset, description, n1 (number of down-regulated proteins), n2 (number of up-regulated proteins), followed by two sets of columns for the variables. These are denoted generically as (var.mfun1, var.mfun2, var.diff, var.CLES, var.p.value), where var is replaced by the name of var1 or var2, and mfun is replaced by the value of mfun. For example, ZC.median1 and ZC.median2 are the median ZC of the down- and up-regulated proteins, respectively.
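The naming pattern for the per-variable columns can be reproduced with a short base R sketch. The helper column_names is hypothetical and only builds the names described above; it does not call the package.

```r
# Hypothetical helper that builds the generic column names for one variable
column_names <- function(var, mfun = "median") {
  c(paste0(var, ".", mfun, 1:2),
    paste0(var, ".", c("diff", "CLES", "p.value")))
}

zc_columns <- column_names("ZC", "median")
nh2o_columns <- column_names("nH2O", "mean")

print(zc_columns)
print(nh2o_columns)
```

Because the data frame is returned invisibly, assign the result of the call to a name in order to inspect these columns.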


Dick, J. M., LaRowe, D. E. and Helgeson, H. C. (2006) Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences 3, 311–336.


# load an example dataset
pd <- pdat_colorectal("JKMF10")
# default variables: ZC and nH2O
get_comptab(pd, plot.it = TRUE)
# protein length and per-residue volume
get_comptab(pd, "nAA", "V0", plot.it = TRUE)

[Package canprot version 1.1.0 Index]