vcfcomp {vcfppR}    R Documentation

Compare two VCF/BCF files and report various statistics

Description

Compare two VCF/BCF files and report various statistics, such as genotype concordance and correlation.

Usage

vcfcomp(
  test,
  truth,
  stats = "all",
  formats = c("DS", "GT"),
  by.sample = FALSE,
  by.variant = FALSE,
  flip = FALSE,
  names = NULL,
  bins = NULL,
  af = NULL,
  out = NULL,
  ...
)

Arguments

test

path to the first VCF/BCF file, referred to as test.

truth

path to the second VCF/BCF file, referred to as truth, or to a saved RDS file.

stats

the statistics to be calculated. The default is "all". The following values are supported. "r2": squared Pearson correlation coefficient. "f1": F1-score, the harmonic mean of sensitivity and precision. "nrc": Non-Reference Concordance rate.

formats

character vector. the FORMAT tags to extract for the test and truth respectively. The default c("DS", "GT") extracts 'DS' from the test and 'GT' from the truth.

by.sample

logical. calculate concordance for each sample, then average by bins.

by.variant

logical. calculate concordance for each variant, then average by bins. If both by.sample and by.variant are TRUE, average over all samples first. If both by.sample and by.variant are FALSE, average over all samples and variants.

flip

logical. flip the REF and ALT alleles of the variants.

names

character vector. new sample names to assign to the samples in the test VCF.

bins

numeric vector. breakpoints used to stratify the statistics into allele frequency bins.

af

file path to an allele frequency text file or a saved RDS file.

out

output prefix for saving the resulting objects to an RDS file.

...

options passed to vcftable
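
The interplay of bins and by.sample can be illustrated with a small base R sketch. It does not call vcfcomp and is not the package's implementation. The simulated data, the object names (truth_gt, test_ds, bin_id) and the choice of squared correlation as the statistic are all assumptions made for illustration only.

```r
# Illustrative sketch only: simulated data, not vcfcomp internals.
set.seed(1)
n_var <- 200   # number of variants (rows)
n_smp <- 5     # number of samples (columns)

# Hypothetical allele frequencies, one per variant.
af <- runif(n_var, 0.1, 0.5)

# Simulated true genotypes (0/1/2) and noisy dosages for the test set.
truth_gt <- matrix(rbinom(n_var * n_smp, 2, rep(af, n_smp)), n_var, n_smp)
test_ds  <- truth_gt + matrix(rnorm(n_var * n_smp, sd = 0.3), n_var, n_smp)

# Stratify variants into allele frequency bins, as 'bins' is described to do.
bins   <- c(0.1, 0.25, 0.5)
bin_id <- cut(af, breaks = bins)
idx    <- split(seq_len(n_var), bin_id)

# Pooled: one statistic per bin over all samples and variants.
pooled <- sapply(idx, function(i) cor(c(test_ds[i, ]), c(truth_gt[i, ]))^2)

# Per sample: compute the statistic for each sample, then average within the bin.
per_sample <- sapply(idx, function(i) {
  mean(sapply(seq_len(n_smp), function(s) cor(test_ds[i, s], truth_gt[i, s])^2))
})

print(rbind(pooled, per_sample))
```

The two rows generally differ, because averaging per-sample statistics is not the same as computing one statistic on pooled data.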

Details

vcfcomp implements various statistics to compare two VCF/BCF files, e.g. genotype concordance and correlation, optionally stratified by allele frequency.
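
As a rough guide to what the three statistics measure, the following base R sketch computes them on toy vectors. It uses common textbook definitions, which are assumptions here and may differ in detail from what vcfcomp computes internally. In particular, the F1 definition below treats "carries at least one ALT allele" as the positive class, and the toy vectors truth_gt and test_ds are made up.

```r
# Illustrative sketch only: common definitions, not necessarily vcfcomp's.
# Toy data: true genotypes (ALT allele counts) and imputed dosages.
truth_gt <- c(0, 0, 1, 2, 1, 0, 2, 1)
test_ds  <- c(0.1, 0.0, 0.9, 1.8, 1.2, 0.6, 2.0, 0.4)

# r2: squared Pearson correlation between dosages and true genotypes.
r2 <- cor(test_ds, truth_gt)^2

# Hard-call the dosages for the genotype-based statistics.
test_gt <- round(test_ds)

# F1 (assumed definition): positive = at least one ALT allele.
tp <- sum(truth_gt > 0 & test_gt > 0)
fp <- sum(truth_gt == 0 & test_gt > 0)
fn <- sum(truth_gt > 0 & test_gt == 0)
f1 <- 2 * tp / (2 * tp + fp + fn)

# NRC (assumed definition): 1 - mismatches / (mismatches + non-reference matches),
# so matching homozygous-reference calls are ignored.
mismatch     <- sum(test_gt != truth_gt)
nonref_match <- sum(test_gt == truth_gt & truth_gt > 0)
nrc <- 1 - mismatch / (mismatch + nonref_match)

print(c(r2 = r2, f1 = f1, nrc = nrc))
```

Excluding homozygous-reference matches keeps NRC from being inflated at rare variants, where most genotypes are trivially reference.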

Value

a list of various statistics

Author(s)

Zilong Li zilong.dk@gmail.com

Examples

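library('vcfppR')
# For illustration, the same bundled file serves as both test and truth.
test <- system.file("extdata", "imputed.gt.vcf.gz", package="vcfppR")
truth <- system.file("extdata", "imputed.gt.vcf.gz", package="vcfppR")
# Comma-separated sample IDs, passed on to vcftable via '...'.
samples <- "HG00133,HG00143,HG00262"
# Compare 'GT' in both files and report the F1-score.
res <- vcfcomp(test, truth, stats="f1", formats=c('GT','GT'), samples=samples)
str(res)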

[Package vcfppR version 0.4.5 Index]