substitute_letters {tidysq} | R Documentation |
Substitute letters in a sequence
Description
Replaces all occurrences of a letter with another.
Usage
substitute_letters(x, encoding, ...)
## S3 method for class 'sq'
substitute_letters(x, encoding, ..., NA_letter = getOption("tidysq_NA_letter"))
Arguments
x |
[ |
encoding |
[ |
... |
further arguments to be passed from or to other methods. |
NA_letter |
[ |
Details
substitute_letters replaces unwanted letters in any sequence with user-defined or IUPAC symbols. Letters can also be replaced with NA values, so that they can later be removed from the sequence with the remove_na function.
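The NA route described above can be sketched as follows. This is a minimal example, assuming the tidysq package is installed and that remove_na() accepts by_letter = TRUE to drop only the NA letters rather than whole sequences; the object names are illustrative.

```r
library(tidysq)

# Sequences with "X" marking unknown residues (extended amino acid alphabet)
sq_ami <- sq(c("NYERTGHLI", "MOYXXXIOLN"), alphabet = "ami_ext")

# Step 1: turn every "X" into NA
sq_marked <- substitute_letters(sq_ami, c(X = NA_character_))

# Step 2: strip the NA letters (assumption: by_letter = TRUE removes
# individual NA letters instead of discarding whole sequences)
sq_clean <- remove_na(sq_marked, by_letter = TRUE)

print(sq_clean)
```

Doing the substitution and the removal as two separate steps keeps the intermediate object available for inspection before any letters are dropped.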
Both the replaced letter and its replacement may consist of one or more characters. However, substitution always works letter by letter: a run of several letters cannot be replaced by a single letter, nor can a single letter be replaced by several letters.
Several different letters can be encoded to the same symbol, so c(A = "rep1", H = "rep1", G = "rep1") is allowed, but c(AHG = "rep1") is not (unless the alphabet contains a letter "AHG"). Merging letters this way discards the distinction between the original letters, so the original sequence cannot be recovered after this operation.
All encoding names must be letters contained within the alphabet, otherwise an error will be thrown.
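Both rules can be checked directly. The sketch below assumes tidysq is installed; it merges two letters into one symbol and then uses base R's tryCatch() to capture the error raised for a name that is not a letter of the alphabet. Variable names are illustrative.

```r
library(tidysq)

sq_dna <- sq(c("ATGCAGGA", "ACTNNAGCN"), alphabet = "dna_ext")

# Many-to-one: A and G are each mapped to the same symbol "rep1"
sq_merged <- substitute_letters(sq_dna, c(A = "rep1", G = "rep1"))
print(sq_merged)

# "AG" is not a letter of the dna_ext alphabet, so this call should fail;
# tryCatch() returns the condition object instead of stopping the script
result <- tryCatch(
  substitute_letters(sq_dna, c(AG = "rep1")),
  error = function(e) e
)
print(inherits(result, "error"))
```

Wrapping the call in tryCatch() is only a convenience for demonstration; in interactive use the error message itself reports the offending names.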
Value
An sq object of atp type with an updated alphabet.
See Also
Functions that manipulate type of sequences:
find_invalid_letters(), is.sq(), sq_type(), typify()
Examples
# Creating objects to work on:
sq_dna <- sq(c("ATGCAGGA", "GACCGAACGAN", "TGACGAGCTTA", "ACTNNAGCN"),
             alphabet = "dna_ext")
sq_ami <- sq(c("MIOONYTWIL", "TIOOLGNIIYROIE", "NYERTGHLI", "MOYXXXIOLN"),
             alphabet = "ami_ext")
sq_atp <- sq(c("mALPVQAmAmA", "mAmAPQ"), alphabet = c("mA", LETTERS))
# Not all letters must have their encoding specified:
substitute_letters(sq_dna, c(T = "t", A = "a", C = "c", G = "g"))
substitute_letters(sq_ami, c(M = "X"))
# Multiple character letters are supported in encodings:
substitute_letters(sq_atp, c(mA = "-"))
substitute_letters(sq_ami, c(I = "ough", O = "eau"))
# Numeric substitutions are allowed too, these are coerced to characters:
substitute_letters(sq_dna, c(N = 9, G = 7))
# It's possible to replace a letter with NA value:
substitute_letters(sq_ami, c(X = NA_character_))