isopattern {enviPat} | R Documentation |
Isotope pattern calculation
Description
The function calculates the isotopologues ("isotope fine structure") of a given chemical formula or a set of chemical formulas (batch calculation) with fast and memory efficient transition tree algorithms, which can handle relative pruning thresholds. Returns accurate masses, probabilities and isotopic compositions of individual isotopologues. The isotopes of elements can be defined by the user.
Usage
isopattern(isotopes, chemforms, threshold = 0.001, charge = FALSE,
emass = 0.00054857990924, plotit = FALSE, algo=1, rel_to = 0, verbose = TRUE,
return_iso_calc_amount = FALSE)
Arguments
isotopes |
Dataframe listing all relevant isotopes, such as |
chemforms |
Vector with character strings of chemical formulas, such as data set |
threshold |
Probability below which isotope peaks can be omitted, as specified by argument |
charge |
z in m/z. Either a single integer or a vector of integers with length equal to that of argument |
emass |
Electrone mass; only relevant if |
plotit |
Should results be plotted, |
algo |
Which algorithm to use? Type |
rel_to |
Probability definition, numeric |
verbose |
Verbose, |
return_iso_calc_amount |
Ignore; number of intermediate isotopologues. |
Details
Isotope pattern calculation can be done by chosing one of two algorithms, set by argument algo
. Both algorithms use
transition tree updates to derive the exact mass and probability of a new isotopologue from existing ones, by steps of single isotope replacements.
These transition tree approaches are memory-efficient and fast for a wide range of molecular formulas and are able to reproduce the isotope fine structure
of molecules. The latter must often be pruned during calculation, c.p. argument rel_to
.
algo=1
grows transition trees within element-wise sub-molecules, whereas algo==2
grows them in larger sub-molecules of two elements, if available.
The latter approach can be slightly more efficient for very large or very complex molecules. The sub-isotopologues within sub-molecules are finally combined to
the isotopologuees of the full molecule. In contrast, intermediate counts of sub-isotopologues instead of fine structures are returned for return_iso_calc_amount==TRUE
rel_to
offers 5 possibilities of how probabilities are defined and pruned, each affecting the threshold
argument differently.
Default option rel_to=0
prunes and returns probabilities relative to the most intense isotope peak;
threshold
states a percentage of the intensity of this latter peak.
Similarly, option rel_to=1
normalizes relative to the peak consisting of the most abundant isotopes for each element, which
is often the monoisotopic one.
Option rel_to=2
prunes and returns absolute probabilities ; threshold
is not a percentage but an abolute cutoff.
Options rel_to=3
and rel_to=4
prune relative to the most intense and "monoisotopic" peak, respectively.
Although threshold
is a percentage, both options return absolut probabilities .
Value
List with length equal to length of vector chemforms
; names of entries in list = chemical formula in chemform.
Each entry in that list contains information on individual isotopologues (rows) with columns:
m/z |
First column; m/z of an isotope peak. |
abundance |
Second column; abundance of an isotope peak. Probabilities are set relative to the most abundant peak of the isotope pattern. |
12C , 13C , 1H , 2H , ... |
Third to all other columns; atom counts of individual isotopes for an isotope peak. |
warning
Too low values for threshold
may lead to unnecessary calculation of low probable isotope peaks - to the extent that not enough memory is available
for either of the two algorithms.
Note
It is highly recommended to check argument chemforms
with check_chemform
prior to running
isopattern
; argument chemforms
must conform to chemical formulas as defined in check_chemform
.
Element names must be followed by numbers (atom counts of that element), i.e. C1H4 is a valid argument whereas CH4 is not.
Otherwise, numbers may only be used in square brackets to denote individual isotopes defined in the element name column of iso_list, such as [14]C or [18]O.
For example, [13]C2C35H67N1O13 is the molecular formula of erythromycin labeled at two C-positions with [13]C;
C37H67N1O13 is the molecular formula of the unlabeled compound.
For correct adduct isotope pattern calculations, please check adducts
.
Author(s)
Martin Loos, Christian Gerber
References
Loos, M., Gerber, C., Corona, F., Hollender, J., Singer, H. (2015). Accelerated isotope fine structure calculation using pruned transition trees, Analytical Chemistry 87(11), 5738-5744.
https://pubs.acs.org/doi/abs/10.1021/acs.analchem.5b00941
https://www.envipat.eawag.ch/index.php
See Also
isopattern
chemforms
check_chemform
getR
envelope
vdetect
check_several
Examples
############################
# batch of chemforms #######
data(isotopes)
data(chemforms)
pattern<-isopattern(
isotopes,
chemforms,
threshold=0.1,
plotit=TRUE,
charge=FALSE,
emass=0.00054858,
algo=1
)
############################
# Single chemical formula ##
data(isotopes)
pattern<-isopattern(
isotopes,
"C100H200S2Cl5",
threshold=0.1,
plotit=TRUE,
charge=FALSE,
emass=0.00054858,
algo=1
)
############################