VCFsToIDCatalogs {ICAMS}    R Documentation

Create ID (small insertion and deletion) catalog from ID VCFs

Description

Create ID (small insertion and deletion) catalog from ID VCFs

Usage

VCFsToIDCatalogs(
  list.of.vcfs,
  ref.genome,
  num.of.cores = 1,
  region = "unknown",
  flag.mismatches = 0,
  return.annotated.vcfs = FALSE,
  suppress.discarded.variants.warnings = TRUE
)

Arguments

list.of.vcfs

List of in-memory ID VCFs. The list names will be the sample ids in the output catalog.

ref.genome

A ref.genome argument as described in ICAMS.

num.of.cores

The number of cores to use. On Windows, only num.of.cores = 1 is supported.

region

A character string acting as a region identifier, such as "genome" or "exome". Default is "unknown".

flag.mismatches

Deprecated. If there are ID variants whose REF does not match the sequence extracted from ref.genome, the function automatically discards these variants, and an element discarded.variants appears in the return value. See AnnotateIDVCF for more details.

return.annotated.vcfs

Logical. Whether to return the annotated VCFs with additional columns showing mutation class for each variant. Default is FALSE.

suppress.discarded.variants.warnings

Logical. Whether to suppress warning messages showing information about the discarded variants. Default is TRUE.
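
As a hedged illustration of the arguments above, the following sketch calls the function with the optional arguments set explicitly and then lists the names of the returned elements. It reuses the example VCF and the reference genome package named in the Examples section. The variable name res and the choice to print names(res) are illustrative assumptions, and the call is skipped when the required packages are not installed.

```r
# res stays NULL when ICAMS or the reference genome package is unavailable
res <- NULL
if (requireNamespace("ICAMS", quietly = TRUE) &&
    requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5", quietly = TRUE)) {
  # Example Strelka ID VCF shipped with ICAMS
  file <- system.file("extdata/Strelka-ID-vcf",
                      "Strelka.ID.GRCh37.s1.vcf",
                      package = "ICAMS")
  list.of.ID.vcfs <- ICAMS::ReadStrelkaIDVCFs(file)
  res <- ICAMS::VCFsToIDCatalogs(
    list.of.ID.vcfs,
    ref.genome = "hg19",
    num.of.cores = 1,                             # portable, works on Windows too
    region = "genome",
    return.annotated.vcfs = TRUE,                 # also return the annotated VCFs
    suppress.discarded.variants.warnings = FALSE  # show warnings about discarded variants
  )
  # Inspect which elements the returned list contains
  print(names(res))
}
```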

Value

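A list of elements, including the ID catalog. An element discarded.variants is present when variants were discarded (see flag.mismatches), and the annotated VCFs are included when return.annotated.vcfs = TRUE.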

Note

In ID (small insertion and deletion) catalogs, deletion repeat sizes range from 0 to 5+, but for plotting and end-user documentation deletion repeat sizes range from 1 to 6+.
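
The relationship between the two ranges can be sketched as a simple relabelling. The helper below is hypothetical and not part of ICAMS. It assumes that the catalog value and the displayed value differ by a constant offset of one, with the capped catalog value 5 (meaning 5+) shown as "6+".

```r
# Hypothetical helper, not part of ICAMS: convert a deletion repeat size as
# stored in the catalog (0 to 5, where 5 means 5+) to the label used for
# plotting and end-user documentation (1 to 6+).
DisplayRepeatSize <- function(internal.size) {
  stopifnot(all(internal.size %in% 0:5))
  ifelse(internal.size == 5, "6+", as.character(internal.size + 1))
}

labels <- DisplayRepeatSize(0:5)
print(labels)
```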

ID classification

See https://github.com/steverozen/ICAMS/blob/master/data-raw/PCAWG7_indel_classification_2021_09_03.xlsx for additional information on ID (small insertion and deletion) mutation classification.

See the documentation for Canonicalize1Del, which first handles deletions in homopolymers, then handles deletions in simple repeats with longer repeat units (e.g. CACACACA, see FindMaxRepeatDel), and, if the deletion is not in a simple repeat, looks for microhomology (see FindDelMH).
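
The repeat-counting idea behind the homopolymer and simple-repeat cases can be illustrated with a minimal sketch. This is not the implementation used by Canonicalize1Del or FindMaxRepeatDel. The function name and arguments are hypothetical, and for brevity the sketch only inspects the sequence to the right of the deletion. A homopolymer is the special case where the deleted unit is a single base.

```r
# Hypothetical helper, not part of ICAMS: count how many consecutive copies
# of the deleted unit immediately follow the deletion, capped at `cap`.
CountRepeatsAfterDeletion <- function(deleted.unit, right.context, cap = 5) {
  unit.len <- nchar(deleted.unit)
  stopifnot(unit.len > 0)
  n <- 0
  while (n < cap) {
    start <- n * unit.len + 1
    # substr() returns a shorter (possibly empty) string past the end of
    # right.context, which cannot equal the unit, so the loop stops there
    piece <- substr(right.context, start, start + unit.len - 1)
    if (piece != deleted.unit) break
    n <- n + 1
  }
  n
}

homopolymer.count <- CountRepeatsAfterDeletion("A", "AAATGC")     # deleted A in a run of As
dinucleotide.count <- CountRepeatsAfterDeletion("CA", "CACACATT") # deleted CA in a CA repeat
no.repeat.count <- CountRepeatsAfterDeletion("GT", "ACCT")        # deletion not in a repeat
```

A count of zero corresponds to the case where the deletion is not in a repeat, which is where a microhomology search would take over.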

See the code for unexported function CanonicalizeID and the functions it calls for handling of insertions.

Examples

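# Locate the example Strelka ID VCF shipped with ICAMS
file <- c(system.file("extdata/Strelka-ID-vcf/",
                      "Strelka.ID.GRCh37.s1.vcf",
                      package = "ICAMS"))
# Read the VCF into memory as a list of ID VCFs
list.of.ID.vcfs <- ReadStrelkaIDVCFs(file)
# Build the catalog only if the reference genome package is installed
if (requireNamespace("BSgenome.Hsapiens.1000genomes.hs37d5",
                     quietly = TRUE)) {
  catID <- VCFsToIDCatalogs(list.of.ID.vcfs, ref.genome = "hg19",
                            region = "genome")
}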

[Package ICAMS version 2.3.12]