R: Expansion of IUPAC nucleotide symbols

amb {seqinr}

R Documentation

Expansion of IUPAC nucleotide symbols

This function returns the nucleotides matching a given IUPAC nucleotide symbol, for instance c("c", "g") for "s".

amb(base, forceToLower = TRUE, checkBase = TRUE,
IUPAC = s2c("acgturymkswbdhvn"), u2t = TRUE)

`base`	an IUPAC symbol for a nucleotide as a single character
`forceToLower`	if TRUE the base is forced to lower case
`checkBase`	if TRUE the character is checked against the list of allowed IUPAC symbols
`IUPAC`	the list of allowed IUPAC symbols
`u2t`	if TRUE, "u" for uracil in RNA is changed into "t" for thymine in DNA

Non-ambiguous bases are returned unchanged (except for "u" when u2t is TRUE).

When base is missing, the vector of allowed IUPAC symbols is returned; otherwise, a character vector with the expanded symbols is returned.
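
The behaviour described above can be sketched as follows. This is a minimal illustration that assumes the seqinr package is installed; the variable names are only for this example, and the error on an unknown symbol is an expectation based on the checkBase argument rather than something stated on this page.

```r
library(seqinr)

# Ambiguous symbol: "s" stands for "c" or "g" (see the description above)
s_exp <- amb("s")

# Upper case input is accepted because forceToLower = TRUE by default
n_exp <- amb("N")

# Uracil is turned into thymine unless u2t = FALSE
u_dna <- amb("u")
u_rna <- amb("u", u2t = FALSE)

# A non-ambiguous base is returned unchanged
a_same <- amb("a")

# Assumption: with checkBase = TRUE, a symbol outside the IUPAC list is
# expected to raise an error; try() keeps the script running either way.
res <- try(amb("x"), silent = TRUE)
rejected <- inherits(res, "try-error")
```

Printing s_exp, u_dna, u_rna and a_same shows the expansions discussed above; rejected indicates whether the unknown symbol was refused.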

J.R. Lobry

The nomenclature for incompletely specified bases in nucleic acid sequences is available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC341218/

citation("seqinr")

#
# The list of IUPAC symbols:
#

amb()

#
# And their expansion:
#

sapply(amb(), amb)
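
As a further sketch, the per-symbol expansion can be combined to enumerate every concrete sequence matched by a short ambiguous oligonucleotide. The helper expand_oligo() below is hypothetical and is not part of seqinr; it relies only on amb() and s2c() from seqinr plus base R, and is meant for short inputs since the number of combinations grows quickly.

```r
library(seqinr)

# Hypothetical helper (not part of seqinr): enumerate all sequences
# compatible with an ambiguous oligonucleotide given as a single string.
expand_oligo <- function(oligo) {
  # One vector of candidate bases per position
  choices <- lapply(s2c(oligo), amb)
  # All combinations of candidates, one column per position
  grid <- expand.grid(choices, stringsAsFactors = FALSE)
  # Paste the columns together, row by row
  do.call(paste0, unname(as.list(grid)))
}

# "r" stands for a or g, "s" for c or g, so four sequences are expected
expand_oligo("ars")
```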

[Package seqinr version 4.2-36 Index]