extractBarcodes {genBaRcode} | R Documentation |
Barcode extraction
Description
Extracts barcodes according to the given barcode design from a fastq file.
Usage
extractBarcodes(
dat,
label,
results_dir = "./",
mismatch = 0,
indels = FALSE,
bc_backbone,
full_output = FALSE,
cpus = 1,
strategy = "sequential",
wobble_extraction = TRUE,
dist_measure = "hamming"
)
Arguments
dat |
a ShortReadQ object. |
label |
a character string. |
results_dir |
a character string which contains the path to the results directory. |
mismatch |
a non-negative integer value, default is 0. Values greater than 0 indicate the number of mismatches allowed when identifying the barcode construct. |
indels |
under construction. |
bc_backbone |
a character string or character vector describing the barcode design. Variable positions have to be marked with the letter 'N'. |
full_output |
a logical value. If TRUE additional output files will be generated in order to identify errors. |
cpus |
an integer value, indicating the number of available cpus. |
strategy |
a character string. Since the future package is used for parallelisation, a strategy has to be stated. The default is "sequential", which corresponds to cpus = 1; "multiprocess" corresponds to cpus > 1. For further information, see the documentation of future::plan(). |
wobble_extraction |
a logical value. If TRUE, single reads will be stripped of the backbone and only the "wobble" positions will be left. |
dist_measure |
a character value. If bc_backbone = "none", single reads will be clustered based on a distance measure. Available distance methods are optimal string alignment ("osa"), Levenshtein ("lv"), Damerau-Levenshtein ("dl"), Hamming ("hamming"), longest common substring ("lcs"), q-gram ("qgram"), cosine ("cosine"), Jaccard ("jaccard"), Jaro-Winkler ("jw") and distance based on soundex encoding ("soundex"). For more detailed information, see the stringdist function of the stringdist package. |
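The practical difference between the Hamming distance (the default dist_measure) and an edit distance that allows insertions and deletions can be illustrated in base R. This is only a minimal sketch and not the package's implementation: hamming_dist is a hypothetical helper, and utils::adist() computes a generalized Levenshtein distance that merely stands in here for the "lv" method of stringdist.

```r
# Hypothetical helper (not part of genBaRcode or stringdist):
# Hamming distance = number of positions at which two
# equal-length strings differ.
hamming_dist <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

read_a <- "ACGT"
read_b <- "CGTA"   # read_a shifted by one position

# Position-wise comparison: every one of the four positions differs.
d_hamming <- hamming_dist(read_a, read_b)

# Edit distance from base R: one deletion plus one insertion.
d_edit <- as.integer(utils::adist(read_a, read_b))
```

Here d_hamming is 4 while d_edit is 2: a read shifted by a single base looks maximally different under the Hamming distance but close under an edit distance, which is worth keeping in mind when choosing dist_measure.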
Value
a frequency table of barcode sequences, or a list of such frequency tables.
Examples
## Not run:
bc_backbone <- "ACTNNCGANNCTTNNCGANNCTTNNGGANNCTANNACTNNCGANNCTTNNCGANNCTTNNGGANNCTANNACTNNCGANN"
source_dir <- system.file("extdata", package = "genBaRcode")
dat <- ShortRead::readFastq(dirPath = source_dir, pattern = "test_data.fastq.gz")
extractBarcodes(dat, label = "test", results_dir = getwd(), mismatch = 0,
indels = FALSE, bc_backbone)
## End(Not run)
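To illustrate what a backbone with 'N' positions, wobble extraction and the resulting frequency table look like, the following base R sketch may help. It is an illustration under stated assumptions rather than the code genBaRcode actually runs: backbone_to_regex and extract_wobble are hypothetical helpers, the backbone and reads are made up, and only exact backbone matches (the mismatch = 0 case) are handled.

```r
# Hypothetical helpers for illustration only; not part of genBaRcode.

# Turn a backbone into a regular expression: each 'N' becomes a
# capture group that matches exactly one nucleotide.
backbone_to_regex <- function(bc_backbone) {
  gsub("N", "([ACGT])", bc_backbone, fixed = TRUE)
}

# Return the concatenated wobble positions of each read,
# or NA if the read contains no exact match of the backbone.
extract_wobble <- function(reads, bc_backbone) {
  hits <- regmatches(reads, regexec(backbone_to_regex(bc_backbone), reads))
  vapply(hits, function(m) {
    # m[1] is the full match, m[-1] are the captured 'N' positions
    if (length(m) == 0) NA_character_ else paste(m[-1], collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

toy_backbone <- "ACTNNCGANN"        # made-up, shortened design
toy_reads <- c("TTACTGACGATTAA",    # contains the backbone, wobble = GATT
               "ACTGACGATTCC",      # same barcode, different flanks
               "ACTCCCGAAAGG",      # a second barcode, wobble = CCAA
               "GGGGGGGGGGGG")      # no backbone, yields NA

wobble <- extract_wobble(toy_reads, toy_backbone)

# Frequency table of the extracted barcodes; table() drops NA by default.
bc_freq <- sort(table(wobble), decreasing = TRUE)
```

The sketch keeps only the wobble positions, which mirrors the idea behind wobble_extraction = TRUE; tolerating mismatches would require an approximate matching step that is not shown here.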