adjust {debar} | R Documentation |
Adjust the sequences based on the nt path outputs.
Description
Based on the PHMM path generated by the frame function, the sequence is then adjusted. Adjustments are limited to the 657bp region represented by the PHMM (and the part of the input sequence matching this region). Censorship can be applied around the corrections. This limits the number of indel errors missed by the PHMM correction algorithm, but comes at the cost of lost DNA sequence. The default of 7 is a conservative parameter meant to lead to the coverage of greater than 95
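The trade-off between censorship and lost sequence can be pictured with a small base R sketch. The helper `censor_around`, the placeholder character "N", and the choice to mask the corrected position itself are all assumptions made for illustration; this is not debar's internal implementation.

```r
# Hypothetical helper, not part of debar: mask `censor_length` positions on
# each side of every corrected position (and the position itself) with a
# placeholder character. The "N" placeholder is an assumption.
censor_around <- function(seq_vec, corrected_pos, censor_length = 7,
                          placeholder = "N") {
  n <- length(seq_vec)
  for (p in corrected_pos) {
    lo <- max(1, p - censor_length)   # clamp the window to the sequence start
    hi <- min(n, p + censor_length)   # clamp the window to the sequence end
    seq_vec[lo:hi] <- placeholder
  }
  seq_vec
}

# Toy 20 bp sequence with a single (made up) correction at position 10
toy <- strsplit("ACGTACGTACGTACGTACGT", "")[[1]]

# A window of 2 masks positions 8 through 12
paste(censor_around(toy, corrected_pos = 10, censor_length = 2), collapse = "")

# A wider window masks more bases: count the placeholders for a window of 7
sum(censor_around(toy, corrected_pos = 10, censor_length = 7) == "N")
```

A larger `censor_length` hides more potentially erroneous bases near each correction, but every masked position is sequence information that is no longer available downstream.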
Usage
adjust(x, ...)
## S3 method for class 'DNAseq'
adjust(x, ..., censor_length = 7, added_phred = "*")
Arguments
x |
a DNAseq class object. |
... |
additional arguments to be passed between methods. |
censor_length |
the number of base pairs in either direction of a PHMM correction to convert to placeholder characters. Default is 7. |
added_phred |
The phred character to use for characters inserted into the original sequence. Default is "*". |
Details
If the DNAseq object contains PHRED scores, the PHRED string will be adjusted along with the DNA sequence: the corresponding value is removed when a bp is removed. The 'added_phred' value indicates the phred character to be added to the PHRED string when a placeholder nucleotide is inserted into the sequence to account for a deletion. Default is "*", which indicates a score of 9 (a relatively low quality base).
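The link between the "*" character and a score of 9 can be checked in base R. The sketch below assumes the common Phred+33 encoding (score = ASCII code minus 33), which is consistent with the value stated above; the two helper functions are hypothetical and are not part of debar.

```r
# Hypothetical helpers, assuming Phred+33 encoding (score = ASCII code - 33)
phred_char_to_score <- function(ch) {
  utf8ToInt(ch) - 33L
}

phred_score_to_char <- function(q) {
  intToUtf8(q + 33L)
}

phred_char_to_score("*")   # ASCII 42 - 33 = 9
phred_score_to_char(9L)    # "*"
```

Under the same assumption, a different character can be chosen for 'added_phred' by converting the desired score with `phred_score_to_char`.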
Value
An object of class "ccs_reads".
See Also
Examples
#previously called
ex_data = DNAseq(example_nt_string_errors, name = 'error_adj_example')
ex_data = frame(ex_data)
#adjust the sequence with the default censor length of 7
ex_data = adjust(ex_data)
ex_data$adjusted_sequence #output is a vector, use outseq to build the string
#with a custom censorship size
ex_data = adjust(ex_data, censor_length = 5)
ex_data$adjusted_sequence #fewer flanking base pairs turned to placeholders
ex_data$adjustment_count #get a count of the number of adjustments applied