bulkSample {SITH} | R Documentation |
Simulate bulk sampling
Description
Simulate bulk sequencing data by taking a local sample from the tumor and computing the variant allele frequencies of the various mutations.
Usage
bulkSample(tumor, pos, cube.length = 5, threshold = 0.05, coverage = 0)
Arguments
tumor |
A list which is the output of |
pos |
The center point of the sample. |
cube.length |
The side length of the cube of cells to be sampled. |
threshold |
Only mutations with an allele frequency greater than the threshold will be included in the sample. |
coverage |
If nonzero then deep sequencing with specified coverage is performed. |
Details
A local region of the tumor is sampled by constructing a cube with side length cube.length around the center point pos. Each cell within the cube is sampled, and the reported quantity is the variant (or mutation) allele frequency. Lattice sites without cells are assumed to be normal tissue, so the reported allele frequency may be less than 1.0 even if the mutation is present in all cancerous cells.
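The effect of empty lattice sites can be illustrated with a small toy example. This is only a sketch and not the package's implementation: the lattice arrays, the helper cube_frequency, and the choice to report the fraction of lattice sites in the cube that carry the mutation (ignoring ploidy) are all assumptions made for illustration.

```r
# Hypothetical toy lattice: 7 x 7 x 7 sites, TRUE where a tumor cell is present.
set.seed(1)
n <- 7
occupied <- array(runif(n^3) < 0.8, dim = c(n, n, n))

# Hypothetical clonal mutation: carried by every tumor cell, absent elsewhere.
has_mut <- occupied

# Assumed definition: fraction of lattice sites in the cube carrying the mutation.
# Empty sites count as normal tissue, so they stay in the denominator.
cube_frequency <- function(has_mut, center, cube.length) {
  stopifnot(cube.length %% 2 == 1)
  h <- (cube.length - 1) / 2
  idx <- lapply(center, function(ctr) (ctr - h):(ctr + h))
  block <- has_mut[idx[[1]], idx[[2]], idx[[3]], drop = FALSE]
  sum(block) / cube.length^3
}

freq <- cube_frequency(has_mut, center = c(4, 4, 4), cube.length = 5)
full <- cube_frequency(array(TRUE, dim = c(n, n, n)), center = c(4, 4, 4), cube.length = 5)
```

Here freq falls below 1 whenever the cube contains an empty site, even though every tumor cell carries the mutation, while full equals 1 because every site is occupied.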
If coverage is non-zero, deep sequencing is simulated. For a chosen coverage C, the number of times a base is read is modeled as a Pois(C) distribution (see Illumina's website). Let d be the true coverage sampled from this distribution. Then the estimated VAF is drawn from a Bin(d,p)/d distribution, where p is the true VAF in the sample.
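The two-stage draw described above can be sketched in base R as follows. The helper noisy_vaf and the example frequencies are hypothetical and are not part of the package. Returning NA when the sampled depth is zero is also an assumption, since the text does not say how that case is handled.

```r
# Hypothetical helper: add sequencing noise to a vector of true VAFs.
noisy_vaf <- function(p, C) {
  # Read depth at each site, d ~ Pois(C).
  d <- rpois(length(p), lambda = C)
  # Number of reads supporting the variant, Bin(d, p).
  alt <- rbinom(length(p), size = d, prob = p)
  # Estimated VAF is alt / d; left undefined (NA) when no reads cover the site.
  ifelse(d > 0, alt / d, NA_real_)
}

set.seed(42)
true_vaf <- c(0.05, 0.25, 0.50)   # illustrative values only
est_vaf <- noisy_vaf(true_vaf, C = 100)
```

With a larger C the estimates concentrate around the true values, and with a small C they scatter more widely.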
Note that cube.length is required to be an odd integer (in order to have a well-defined center point).
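To see why an odd side length gives a well-defined center, consider the per-axis offsets from the center point. The helper cube_offsets below is hypothetical and only illustrates the idea. It is not how the package validates its input.

```r
# Hypothetical helper: offsets along one axis for a cube centered on a lattice site.
cube_offsets <- function(cube.length) {
  if (cube.length %% 2 != 1) stop("cube.length must be an odd integer")
  half <- (cube.length - 1) / 2
  # Symmetric range around 0, so the center site is included exactly once.
  -half:half
}

offs <- cube_offsets(5)
even_result <- try(cube_offsets(4), silent = TRUE)
```

For cube.length = 5 the offsets are symmetric around 0. An even length has no such symmetric set of integer offsets, so the sketch rejects it.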
Value
A data frame with 1 row and one column per mutation. Each entry is the allele frequency of the corresponding mutation.
Author(s)
Phillip B. Nicol
References
K. Chkhaidze, T. Heide, B. Werner, M. Williams, W. Huang, G. Caravagna, T. Graham, and A. Sottoriva. Spatially constrained tumour growth affects the patterns of clonal selection and neutral drift in cancer genomic data. PLOS Computational Biology, 2019. https://doi.org/10.1371/journal.pcbi.1007243.
E. S. Lander and M. S. Waterman. Genomic mapping by fingerprinting random clones: a mathematical analysis. Genomics 2(3): 231-239, 1988.
Examples
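# Fix the RNG state so the simulated tumor is reproducible.
set.seed(116776544, kind = "Mersenne-Twister", normal.kind = "Inversion")
out <- simulateTumor(max_pop = 1000)
# Bulk sample centered at the origin, using the default cube.length, threshold, and coverage.
df <- bulkSample(tumor = out, pos = c(0,0,0))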