| slideWindowDb {shazam} | R Documentation |
Sliding window approach towards filtering sequences in a data.frame
Description
slideWindowDb determines whether each input sequence in a data.frame
contains at least a given number of mutations within a given length of
consecutive nucleotides (a "window") when compared to its respective germline
sequence.
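The window check described above can be sketched in base R as follows. This is only an illustration under simplifying assumptions, not the shazam implementation: the helper name has_dense_window and its arguments inputSeq and germlineSeq are hypothetical, and every differing character is counted as a mutation, whereas the package may treat gaps and ambiguous characters differently.

```r
# Hypothetical helper, not part of shazam: flags a sequence if any window of
# windowSize consecutive positions holds at least mutThresh mismatches.
has_dense_window <- function(inputSeq, germlineSeq, mutThresh = 6, windowSize = 10) {
  s <- strsplit(inputSeq, "")[[1]]
  g <- strsplit(germlineSeq, "")[[1]]
  n <- min(length(s), length(g))
  # No complete window fits, so nothing can be flagged
  if (n < windowSize) return(FALSE)
  # Assumption: any differing character counts as one mutation
  mismatch <- as.integer(s[seq_len(n)] != g[seq_len(n)])
  # Prefix sums give the mismatch count of every window in one pass
  cs <- c(0L, cumsum(mismatch))
  counts <- cs[(windowSize + 1):(n + 1)] - cs[1:(n - windowSize + 1)]
  any(counts >= mutThresh)
}

# Six mismatches fall inside the first 10-nucleotide window
has_dense_window("TTTTTTAAAAAA", "AAAAAAAAAAAA")
```

The prefix-sum form avoids recounting overlapping windows; a plain loop over window start positions would give the same result.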
Usage
slideWindowDb(
  db,
  sequenceColumn = "sequence_alignment",
  germlineColumn = "germline_alignment_d_mask",
  mutThresh = 6,
  windowSize = 10,
  nproc = 1
)
Arguments
db |
data.frame containing sequence data. |
sequenceColumn |
name of the column containing IMGT-gapped sample sequences. |
germlineColumn |
name of the column containing IMGT-gapped germline sequences. |
mutThresh |
threshold on the number of mutations in windowSize consecutive nucleotides. |
windowSize |
length of consecutive nucleotides. Must be at least 2. |
nproc |
Number of cores to distribute the operation over. If the
|
Value
A logical vector whose length matches the number of input sequences in
db. Each entry indicates whether the corresponding input sequence should be
filtered (TRUE) or not (FALSE) based on the given parameters.
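A common follow-up is to drop the flagged rows. The sketch below assumes that TRUE marks a sequence to be filtered out; the toy data.frame and the hand-written vector filtered are hypothetical stand-ins for db and for the value returned by slideWindowDb, so that the snippet runs without the package installed.

```r
# Hypothetical toy input standing in for db
db <- data.frame(sequence_id = c("s1", "s2", "s3"), stringsAsFactors = FALSE)

# Stand-in for the slideWindowDb() result: one logical entry per row of db
filtered <- c(FALSE, TRUE, FALSE)

# Keep only the sequences that were not flagged
db_kept <- db[!filtered, , drop = FALSE]
```

With real data, filtered would instead come from a call such as slideWindowDb(db, mutThresh=6, windowSize=10).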
See Also
See slideWindowSeq for applying the sliding window approach on a single sequence.
See slideWindowTune for parameter tuning for mutThresh and windowSize.
Examples
# Load the example data
data(ExampleDb, package="alakazam")
# Apply the sliding window approach on a subset of ExampleDb
slideWindowDb(db=ExampleDb[1:10, ], sequenceColumn="sequence_alignment",
              germlineColumn="germline_alignment_d_mask",
              mutThresh=6, windowSize=10, nproc=1)