seqkat {SeqKat} | R Documentation |
SeqKat
Description
Kataegis detection from SNV BED files
Usage
seqkat(sigcutoff = 5, mutdistance = 3.2, segnum = 4, ref.dir = NULL,
bed.file = "./", output.dir = "./", chromosome = "all",
chromosome.length.file = NULL, trinucleotide.count.file = NULL)
Arguments
sigcutoff |
The minimum hypermutation score used to classify the windows in the sliding binomial test as significant windows. The score is calculated per window as follows: -log10(binomial test p-value). Recommended value: 5 |
mutdistance |
The maximum intermutational distance allowed for SNVs to be grouped in the same kataegic event. Recommended value: 3.2 |
segnum |
Minimum mutation count. The minimum number of mutations required within a cluster to be identified as kataegic. Recommended value: 4 |
ref.dir |
Path to a directory containing the reference genome. Each chromosome should have its own .fa file, with chromosomes X and Y named chr23 and chr24. The FASTA files should contain no header. |
bed.file |
Path to the SNV BED file. The BED file should contain the following information: Chromosome, Position, Reference allele, Alternate allele |
output.dir |
Path to a directory where output will be created. |
chromosome |
The chromosome to be analysed. This can be (1, 2, ..., 23, 24) or "all" to run sequentially on all chromosomes. |
chromosome.length.file |
A tab-separated file containing the lengths of all chromosomes in the reference genome. |
trinucleotide.count.file |
A tab-separated file containing a count of all trinucleotides present in the reference genome. This can be generated with the get.trinucleotide.counts() function in this package. |
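The hypermutation score described under sigcutoff, -log10(binomial test p-value), can be sketched in base R as follows. This is a minimal illustration only: the window size, mutation count, background rate, and the one-sided form of the test are assumptions chosen for the example, not values or settings confirmed from SeqKat itself.

```r
# Hypothetical window: 12 SNVs observed across 10,000 bases (assumed values).
window.size <- 10000;
observed.mutations <- 12;

# Assumed per-base background mutation rate, for illustration only.
background.rate <- 1e-4;

# One-sided binomial test: is the window more mutated than the background?
p.value <- binom.test(
    x = observed.mutations,
    n = window.size,
    p = background.rate,
    alternative = "greater"
    )$p.value;

# Score as defined in the sigcutoff description.
hypermutation.score <- -log10(p.value);

# A window counts as significant when its score reaches the cutoff.
sigcutoff <- 5;
is.significant <- hypermutation.score >= sigcutoff;
```

Raising sigcutoff makes the test stricter, since a window then needs a smaller p-value to be called significant.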
Details
The default parameters in SeqKat have been optimized using the dataset from Alexandrov et al., "Signatures of mutational processes in human cancer". SeqKat accepts a BED file and writes its results in TXT format. One file is generated per chromosome in which a kataegic event is detected; no file is generated for chromosomes without a detected event. SeqKat reports two scores per kataegic event: a hypermutation score and an APOBEC-mediated kataegic score.
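Because output is only written when a kataegic event is detected, it can be useful to check the output directory after a run. The sketch below assumes a dedicated, hypothetical output directory and assumes the result files carry a .txt extension; the exact file names SeqKat produces are not specified here.

```r
# Hypothetical dedicated output directory, so unrelated files are not picked up.
output.dir <- file.path(tempdir(), "seqkat-output");
dir.create(output.dir, showWarnings = FALSE, recursive = TRUE);

# ... run seqkat(..., output.dir = output.dir) here ...

# Assumption: results are written with a .txt extension.
result.files <- list.files(
    path = output.dir,
    pattern = "\\.txt$",
    full.names = TRUE,
    ignore.case = TRUE
    );

if (length(result.files) == 0) {
    # No file means no kataegic event was detected on the analysed chromosomes.
    message("No kataegic events detected.");
    } else {
    message("Result files: ", paste(basename(result.files), collapse = ", "));
    }
```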
Author(s)
Fouad Yousif
Fan Fan
Christopher Lalansingh
Examples
example.bed.file <- paste0(
    path.package("SeqKat"),
    "/extdata/test/PD4120a-chr4-1-2000000_test_snvs.bed"
    );
example.ref.dir <- paste0(
    path.package("SeqKat"),
    "/extdata/test/ref/"
    );
example.chromosome.length.file <- paste0(
    path.package("SeqKat"),
    "/extdata/test/length_hg19_chr_test.txt"
    );
seqkat(
    sigcutoff = 5,
    mutdistance = 3.2,
    segnum = 2,
    bed.file = example.bed.file,
    output.dir = tempdir(),
    chromosome = "4",
    ref.dir = example.ref.dir,
    chromosome.length.file = example.chromosome.length.file
    );
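As an alternative to building paths with path.package(), the bundled test files can be located with base R's system.file(), which returns an empty string when the package or file is not found. This is a sketch of that approach, assuming the same extdata/test layout as in the example above.

```r
# Locate the bundled test BED file; "" is returned if it cannot be found.
example.bed.file <- system.file(
    "extdata", "test", "PD4120a-chr4-1-2000000_test_snvs.bed",
    package = "SeqKat"
    );

# Guard against a missing package or file before calling seqkat().
if (!nzchar(example.bed.file)) {
    message("Test BED file not found; is SeqKat installed?");
    }
```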