Canonicalize1Del {ICAMS} | R Documentation |
Given a deletion and its sequence context, categorize it
Description
This function is primarily for internal use, but we export it to document the underlying logic.
Usage
Canonicalize1Del(context, del.seq, pos, trace = 0)
Arguments
context |
The deleted sequence plus ample surrounding
sequence on each side (at least as long as |
del.seq |
The deleted sequence in |
pos |
The position of |
trace |
If > 0, then generate messages tracing how the computation is carried out. |
Details
See https://github.com/steverozen/ICAMS/blob/master/data-raw/PCAWG7_indel_classification_2021_09_03.xlsx for additional information on deletion mutation classification.
This function first handles deletions in homopolymers, then
handles deletions in simple repeats with longer repeat units
(e.g. CACACACA; see FindMaxRepeatDel), and, if the deletion
is not in a simple repeat, looks for microhomology (see FindDelMH).
See the code for the unexported function CanonicalizeID
and the functions it calls for the handling of insertions.
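The repeat-counting idea behind the first two steps can be illustrated with a small, self-contained sketch. This is not the ICAMS implementation: count_flanking_repeats is a hypothetical helper written only for illustration, and it assumes that pos is the 1-based position of del.seq within context, as the examples on this page suggest. It counts the copies of del.seq that remain directly adjacent to the deletion, which appears to correspond to the final field of strings such as "DEL:T:1:2". It does not attempt microhomology detection or the handling of un-normalized repeat representations.

```r
# Hypothetical helper for illustration only; not part of ICAMS.
# Counts copies of del.seq that directly flank the deleted copy in context.
# Assumes pos is the 1-based start of del.seq within context.
count_flanking_repeats <- function(context, del.seq, pos) {
  n <- nchar(del.seq)
  # The deleted sequence must actually occur at pos.
  stopifnot(substr(context, pos, pos + n - 1) == del.seq)

  copies <- 0

  # Walk leftward one repeat unit at a time.
  start <- pos - n
  while (start >= 1 && substr(context, start, start + n - 1) == del.seq) {
    copies <- copies + 1
    start <- start - n
  }

  # Walk rightward one repeat unit at a time.
  start <- pos + n
  while (start + n - 1 <= nchar(context) &&
         substr(context, start, start + n - 1) == del.seq) {
    copies <- copies + 1
    start <- start + n
  }

  copies
}

count_flanking_repeats("xyAAAqr", del.seq = "A", pos = 3)      # 2
count_flanking_repeats("xyAAAqr", del.seq = "A", pos = 4)      # 2
count_flanking_repeats("xyAqr", del.seq = "A", pos = 3)        # 0
count_flanking_repeats("xyCACACAqr", del.seq = "CA", pos = 3)  # 2
```

Note that the count is the same for pos = 3 and pos = 4 in the homopolymer case, which is consistent with both calls in the Examples section yielding the same canonical string.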
Value
A string that is the canonical representation
of the given deletion type. Returns NA
and raises a warning if there is an un-normalized
representation of the deletion of a repeat unit;
see FindDelMH for details.
(This seems to be very rare.)
Examples
Canonicalize1Del("xyAAAqr", del.seq = "A", pos = 3) # "DEL:T:1:2"
Canonicalize1Del("xyAAAqr", del.seq = "A", pos = 4) # "DEL:T:1:2"
Canonicalize1Del("xyAqr", del.seq = "A", pos = 3) # "DEL:T:1:0"