bulkSample {SITH}    R Documentation

Simulate bulk sampling

Description

Simulate bulk sequencing data by taking a local sample from the tumor and computing the variant allele frequencies of the various mutations.

Usage

bulkSample(tumor, pos, cube.length = 5, threshold = 0.05, coverage = 0)

Arguments

tumor

A list which is the output of simulateTumor().

pos

The center point of the sample.

cube.length

The side length of the cube of cells to be sampled.

threshold

Only mutations with an allele frequency greater than the threshold will be included in the sample.

coverage

If nonzero, deep sequencing with the specified coverage is simulated. The default of 0 disables it.

Details

A local region of the tumor is sampled by constructing a cube with side length cube.length centered at pos. Every cell within the cube is sampled, and the reported quantity is the variant allele frequency (VAF), also called the mutation allele frequency (MAF). Lattice sites without cells are assumed to be normal tissue, so the reported VAF may be less than 1.0 even if the mutation is present in all cancerous cells.
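The cube sampling described above can be sketched on toy data. The data layout, the variable names, and the normalization by the number of lattice sites (cube_length^3) are assumptions made for illustration; they are not taken from the SITH source, and the package may normalize differently (for example, by accounting for ploidy).

```r
# Toy data (hypothetical layout, NOT the structure returned by simulateTumor()):
# one row per cell-mutation pair, with the lattice coordinates of the cell.
cells <- data.frame(
  x = c(0, 0, 1, 3),
  y = 0,
  z = 0,
  mutation = c("m1", "m2", "m1", "m1"),
  stringsAsFactors = FALSE
)

pos <- c(0, 0, 0)      # center of the sample
cube_length <- 3       # must be odd so that the cube has a center site
threshold <- 0.05

# Half-width of the cube measured from the center site
half <- (cube_length - 1) / 2

# Keep the rows whose cell lies inside the cube
inside <- abs(cells$x - pos[1]) <= half &
  abs(cells$y - pos[2]) <= half &
  abs(cells$z - pos[3]) <= half

# Count carriers of each mutation and divide by ALL lattice sites in the cube,
# so that empty sites (treated as normal tissue) dilute the frequency.
# This normalization is an assumption of the sketch.
counts <- table(cells$mutation[inside])
freq <- counts / cube_length^3

# Report only mutations above the threshold
kept <- freq[freq > threshold]
```

In this toy cube of 27 sites, m1 is carried by two sampled cells and m2 by one, so only m1 exceeds the 0.05 threshold.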

If coverage is nonzero, deep sequencing is simulated. For a chosen coverage C, the number of times a base is read is modeled as a Pois(C) random variable (see Illumina's website). Let d be the realized read depth drawn from this distribution, and let p be the true VAF of the mutation in the sample. The estimated VAF is then drawn from a Bin(d, p)/d distribution.
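The noise model above can be reproduced in a few lines of base R. This is a minimal sketch, not the package's implementation: noisy_vaf is a hypothetical helper, and returning NA when no reads cover a site is an assumption, since the handling of d = 0 is not specified here.

```r
# Hypothetical helper (not part of SITH): add sequencing noise to true VAFs.
noisy_vaf <- function(p, coverage) {
  # Realized read depth at each site, d ~ Pois(coverage)
  d <- rpois(length(p), lambda = coverage)
  # Number of reads carrying the variant, Bin(d, p)
  alt <- rbinom(length(p), size = d, prob = p)
  # Estimated VAF; sites with zero reads get NA (assumption of this sketch)
  ifelse(d > 0, alt / d, NA_real_)
}

set.seed(1)
true_vaf <- c(0.05, 0.25, 0.50)
est <- noisy_vaf(true_vaf, coverage = 100)
```

Higher coverage shrinks the spread of the estimates around the true values, while low coverage can push a mutation above or below the reporting threshold by chance.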

Note that cube.length is required to be an odd integer (in order to have a well-defined center point).

Value

A data frame with one row and one column per mutation. Each entry is the allele frequency of the corresponding mutation.

Author(s)

Phillip B. Nicol

References

K. Chkhaidze, T. Heide, B. Werner, M. Williams, W. Huang, G. Caravagna, T. Graham, and A. Sottoriva. Spatially constrained tumour growth affects the patterns of clonal selection and neutral drift in cancer genomic data. PLOS Computational Biology, 2019. https://doi.org/10.1371/journal.pcbi.1007243.

E. S. Lander and M. S. Waterman. Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics, 2(3): 231-239, 1988.

Examples

set.seed(116776544, kind = "Mersenne-Twister", normal.kind = "Inversion")
out <- simulateTumor(max_pop = 1000)
df <- bulkSample(tumor = out, pos = c(0,0,0))
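A further example, using only the arguments listed under Usage, compares an exact sample with a noisy one from the same region. The cube size and coverage values are arbitrary choices for illustration, and the requireNamespace() guard is added only so that the snippet is skipped when SITH is not installed.

```r
if (requireNamespace("SITH", quietly = TRUE)) {
  library(SITH)
  set.seed(116776544, kind = "Mersenne-Twister", normal.kind = "Inversion")
  out <- simulateTumor(max_pop = 1000)

  # Frequencies without sequencing noise from a 7 x 7 x 7 cube at the origin
  df_exact <- bulkSample(tumor = out, pos = c(0, 0, 0), cube.length = 7)

  # Same region with simulated deep sequencing at coverage 100
  df_deep <- bulkSample(tumor = out, pos = c(0, 0, 0), cube.length = 7,
                        coverage = 100)
}
```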


[Package SITH version 1.1.0 Index]