remove_ambiguous {tidysq} | R Documentation |
Remove sequences that contain ambiguous elements
Description
This function replaces sequences with ambiguous elements by empty (NULL) sequences or removes ambiguous elements from sequences in an sq object.
Usage
remove_ambiguous(x, by_letter = FALSE, ...)
## S3 method for class 'sq'
remove_ambiguous(
x,
by_letter = FALSE,
...,
NA_letter = getOption("tidysq_NA_letter")
)
Arguments
x |
[ |
by_letter |
[ |
... |
further arguments to be passed from or to other methods. |
NA_letter |
[ |
Details
Biological sequences, whether of DNA, RNA or amino acid elements, are not always exactly determined. Sometimes the only information the user has about an element is that it is one of a given set of possible elements. In this case the element is described with one of the special letters, here called ambiguous.
The inclusion of these letters is the difference between extended and basic alphabets (and, consequently, types). For the amino acid alphabet these letters are: B, J, O, U, X, Z; whereas for DNA and RNA they are: W, S, M, K, R, Y, B, D, H, V, N.
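As a rough illustration of the letter sets listed above, the following base R sketch flags plain character strings that contain at least one ambiguous letter. It does not use tidysq; `ambiguous_letters` and `has_ambiguous()` are hypothetical names introduced here only for demonstration.

```r
# Ambiguous letters as listed above (hypothetical helper object, not part of tidysq).
# The "dna" entry applies to both DNA and RNA.
ambiguous_letters <- list(
  ami = c("B", "J", "O", "U", "X", "Z"),
  dna = c("W", "S", "M", "K", "R", "Y", "B", "D", "H", "V", "N")
)

# Return TRUE for each string that contains at least one ambiguous letter.
has_ambiguous <- function(x, type = c("ami", "dna")) {
  type <- match.arg(type)
  # Build a character class such as "[BJOUXZ]" from the chosen letter set.
  pattern <- paste0("[", paste(ambiguous_letters[[type]], collapse = ""), "]")
  grepl(pattern, x)
}

has_ambiguous(c("MIAANYTWIL", "MAYXXXIALN"), type = "ami")
has_ambiguous(c("ATGCAGGA", "ACTNNAGCN"), type = "dna")
```

In both calls only the second string contains an ambiguous letter (X and N, respectively).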
remove_ambiguous() is used to create sequences without any of the elements above. Depending on the value of the by_letter argument, the function either replaces "ambiguous" sequences with empty sequences (if by_letter is equal to FALSE, the default) or shortens the original sequences by retaining only unambiguous letters (if by_letter is equal to TRUE).
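The two modes can be mimicked on plain character vectors in base R: with by_letter = FALSE (the default) a whole sequence is blanked as soon as it contains an ambiguous letter, while with by_letter = TRUE only the ambiguous letters are dropped. This is only a sketch of that behaviour, not the tidysq implementation: `strip_ambiguous_sketch()` is a hypothetical helper, and an empty string stands in for an empty (NULL) sequence.

```r
# Hypothetical base R helper mimicking the two modes on character vectors.
strip_ambiguous_sketch <- function(x, ambiguous, by_letter = FALSE) {
  # Build a character class such as "[WSMKRYBDHVN]" from the ambiguous letters.
  pattern <- paste0("[", paste(ambiguous, collapse = ""), "]")
  if (by_letter) {
    # Drop only the ambiguous letters, keeping the rest of each sequence.
    gsub(pattern, "", x)
  } else {
    # Replace every sequence containing an ambiguous letter with "".
    ifelse(grepl(pattern, x), "", x)
  }
}

dna_ambiguous <- c("W", "S", "M", "K", "R", "Y", "B", "D", "H", "V", "N")
dna <- c("ATGCAGGA", "GACCGAACGAN", "ACTNNAGCN")

strip_ambiguous_sketch(dna, dna_ambiguous)
strip_ambiguous_sketch(dna, dna_ambiguous, by_letter = TRUE)
```

Both calls return a vector of the same length as the input, so positions in the result still correspond to the original sequences.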
Value
An sq object with the _bsc version of the input type.
See Also
Functions that clean sequences: is_empty_sq(), remove_na()
Examples
# Creating objects to work on:
sq_ami <- sq(c("MIAANYTWIL", "TIAALGNIIYRAIE", "NYERTGHLI", "MAYXXXIALN"),
             alphabet = "ami_ext")
sq_dna <- sq(c("ATGCAGGA", "GACCGAACGAN", "TGACGAGCTTA", "ACTNNAGCN"),
             alphabet = "dna_ext")
# Removing whole sequences with ambiguous elements:
remove_ambiguous(sq_ami)
remove_ambiguous(sq_dna)
# Removing ambiguous elements from sequences:
remove_ambiguous(sq_ami, by_letter = TRUE)
remove_ambiguous(sq_dna, by_letter = TRUE)
# Analysis of the result
sq_clean <- remove_ambiguous(sq_ami)
is_empty_sq(sq_clean)
sq_type(sq_clean)