R: Quality filtering for amplicon sequences.

qfilter {insect}

R Documentation

Quality filtering for amplicon sequences.

Description

This function applies several quality checks to sequences read from FASTQ files and removes any that fail to meet the specified thresholds. The checks cover average quality score, sequence length, minimum occurrence count (singletons are removed by default) and the number of ambiguous base calls.
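
As an illustration of the average quality score check, the sketch below decodes a FASTQ quality string into Phred scores and takes their mean. It assumes the common Phred+33 ASCII encoding. The helper `phred_mean` is hypothetical and is not part of the insect package, which may compute the average differently.

```r
# Hypothetical helper, not part of insect: mean Phred score of one read,
# assuming the quality string uses the Phred+33 ASCII offset.
phred_mean <- function(qual_string, offset = 33L) {
  # utf8ToInt gives the ASCII code of each character in the string
  scores <- utf8ToInt(qual_string) - offset
  mean(scores)
}

phred_mean("IIII")  # 'I' is ASCII 73, so every base scores 40
phred_mean("I5")    # scores 40 and 20, mean 30
```

Under the default `minqual = 30`, a read averaging below 30 on this scale would be expected to fail the filter.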

Usage

qfilter(
  x,
  minqual = 30,
  maxambigs = 0,
  mincount = 2,
  minlength = 50,
  maxlength = 500
)

Arguments

`x`	a vector of concatenated strings representing DNA sequences (in upper case) or a DNAbin list object with quality attributes. This argument will usually be produced by `readFASTQ`.
`minqual`	integer, the minimum average quality score for a sequence to pass the filter. Defaults to 30.
`maxambigs`	integer, the maximum number of ambiguities for a sequence to pass the filter. Defaults to 0.
`mincount`	integer, the minimum acceptable number of occurrences of a sequence for it to pass the filter. Defaults to 2 (removes singletons).
`minlength`	integer, the minimum acceptable sequence length. Defaults to 50.
`maxlength`	integer, the maximum acceptable sequence length. Defaults to 500.
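
The way these thresholds combine can be sketched in base R as follows. This is a minimal illustration only, not the insect implementation: the function `sketch_filter`, its `quals` argument (a list of per-base Phred scores) and the choice to count occurrences before any other filter is applied are all assumptions made for the example.

```r
# Hypothetical sketch, not the insect source: apply the five thresholds
# to a plain character vector of upper-case DNA sequences.
sketch_filter <- function(seqs, quals, minqual = 30, maxambigs = 0,
                          mincount = 2, minlength = 50, maxlength = 500) {
  # quals: list of numeric vectors of per-base Phred scores, one per sequence
  meanqual <- vapply(quals, mean, numeric(1))
  # any character other than A, C, G or T is treated as an ambiguity
  nambigs <- nchar(gsub("[ACGT]", "", seqs))
  len <- nchar(seqs)
  # number of times each sequence occurs in the input (assumed to be
  # counted before the other filters are applied)
  counts <- ave(seq_along(seqs), seqs, FUN = length)
  keep <- meanqual >= minqual & nambigs <= maxambigs &
    counts >= mincount & len >= minlength & len <= maxlength
  seqs[keep]
}

seqs <- c("ACGTACGT", "ACGTACGT",   # pass every check
          "ACGTNCGT", "ACGTNCGT",   # one ambiguous base each
          "ACG", "ACG",             # too short
          "TTTTTTTT",               # singleton
          "GGGGGGGG", "GGGGGGGG")   # low average quality
quals <- c(lapply(nchar(seqs[1:7]), function(n) rep(35, n)),
           list(rep(10, 8), rep(10, 8)))
sketch_filter(seqs, quals, minlength = 5, maxlength = 10)
```

In this toy input only the two copies of "ACGTACGT" satisfy all five criteria at once.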

Value

an object of the same type as the primary input argument (i.e. a "DNAbin" object if x is a "DNAbin" object, or a vector of concatenated character strings otherwise).

Author(s)

Shaun Wilkinson

Examples


  ## load the package that provides readFASTQ, trim and qfilter
  library(insect)
  ## download and extract example FASTQ file to temporary directory
  td <- tempdir()
  URL <- "https://www.dropbox.com/s/71ixehy8e51etdd/insect_tutorial1_files.zip?dl=1"
  dest <- file.path(td, "insect_tutorial1_files.zip")
  download.file(URL, destfile = dest, mode = "wb")
  unzip(dest, exdir = td)
  x <- readFASTQ(file.path(td, "COI_sample2.fastq"))
  ## trim primers from sequences
  mlCOIintF <- "GGWACWGGWTGAACWGTWTAYCCYCC"
  jgHCO2198 <- "TAIACYTCIGGRTGICCRAARAAYCA"
  x <- trim(x, up = mlCOIintF, down = jgHCO2198)
  ## filter sequences to remove singletons, low quality & short/long reads
  x <- qfilter(x, minlength = 250, maxlength = 350)

[Package insect version 1.4.2]