truncate_sequences.probability {LocaTT}    R Documentation

Truncate DNA Sequences at Specified Probability that All Bases were Called Correctly

Description

Calculates, along each DNA sequence, the cumulative probability that all bases up to and including each position were called correctly. Each sequence is then truncated immediately before the first position at which this probability is less than or equal to a specified threshold.
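The rule can be written out under the standard Phred relationship between a quality score and its error probability. This is an assumption about the calculation, not a statement of the package internals, and the symbols Q_j, P_i, t and k are introduced here for illustration only: Q_j is the decoded score at position j (ASCII code minus 33 in Sanger format), P_i is the cumulative probability through position i, and t is threshold.

```latex
P_i = \prod_{j=1}^{i} \left(1 - 10^{-Q_j/10}\right),
\qquad
k = \min\{\, i : P_i \le t \,\},
\qquad
\text{retained length} = k - 1 .
```

For instance, a read whose first four quality characters are "989!" decodes to Q = 24, 23, 24, 0. This gives P_3 ≈ 0.987 and P_4 = 0, so with t = 0.5 the first three bases would be retained. If no position satisfies P_i ≤ t, the sequence is left at full length.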

Usage

truncate_sequences.probability(sequences, quality_scores, threshold = 0.5)

Arguments

sequences

A character vector of DNA sequences to truncate.

quality_scores

A character vector of DNA sequence quality scores encoded in Sanger format (Phred+33), one string per element of sequences.

threshold

Numeric. The probability threshold used for truncation. The default is 0.5 (i.e., each truncated sequence has a greater than 50% probability that all of its bases were called correctly).

Value

A list containing two elements. The first element is a character vector of truncated DNA sequences. The second element is a character vector of quality scores, each truncated to the same length as its corresponding truncated DNA sequence.

See Also

truncate_sequences.length for truncating DNA sequences to a specified length.
truncate_sequences.quality_score for truncating DNA sequences by Phred quality score.

Examples

truncate_sequences.probability(sequences=c("ATATAGCGCG","TGCCGATATA","ATCTATCACCGC"),
                               quality_scores=c("989!.C;F@\"","A((#-#;,2F","HD8I/+67=1>?"),
                               threshold=0.5)
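The calculation described on this page can be sketched in base R for a single read. This is a minimal illustration, not the LocaTT implementation: truncate_one is a hypothetical helper, and the sketch assumes Sanger (Phred+33) encoding and the standard conversion from Phred score to error probability.

```r
# Hypothetical helper (not part of LocaTT): truncates one read and its
# quality string at the first position where the cumulative probability
# that all bases are correct is at or below the threshold.
truncate_one <- function(sequence, quality, threshold = 0.5) {
  # Assumed Sanger (Phred+33) encoding: Phred score = ASCII code - 33.
  phred <- utf8ToInt(quality) - 33
  # Per-base probability of a correct call (standard Phred conversion).
  p_correct <- 1 - 10^(-phred / 10)
  # Cumulative probability that every base so far was called correctly.
  p_all <- cumprod(p_correct)
  # Positions where the cumulative probability is at or below the threshold.
  hit <- which(p_all <= threshold)
  # Keep everything before the first such position, or the whole read.
  keep <- if (length(hit) > 0) hit[1] - 1 else nchar(sequence)
  list(sequence = substr(sequence, 1, keep),
       quality = substr(quality, 1, keep))
}

# First read from the example above: the fourth quality character "!"
# decodes to Phred 0, so the sketch keeps only the first three bases.
truncate_one("ATATAGCGCG", "989!.C;F@\"", threshold = 0.5)
```

Under these assumptions the sketch returns "ATA" with quality "989" for this read. The package function takes whole character vectors and returns a two-element list, so its output is organized differently from this per-read helper.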

[Package LocaTT version 1.1.1 Index]