synthesis {RScelestial} | R Documentation |
Synthesize single-cell data through tumor simulation
Description
This function simulates evolution in a tumor in two phases: 1) simulation of evolution, 2) sampling.
Usage
synthesis(
sample,
site,
evolution.step,
mutation.rate = 1,
advantage.increase.ratio = 1,
advantage.decrease.ratio = 10,
advantage.keep.ratio = 100,
advantage.increase.step = 0.01,
advantage.decrease.step = 0.01,
mv.rate = 0.5,
fp.rate = 0.2,
fn.rate = 0.1,
seed = -1
)
Arguments
sample |
Number of samples. |
site |
Number of sites (loci). |
evolution.step |
Number of evolutionary steps in the process of production of the evolutionary tree. |
mutation.rate |
The rate of mutation on each evolutionary step in evolutionary tree synthesis. |
advantage.increase.ratio , advantage.decrease.ratio , advantage.keep.ratio |
A child node in the evolutionary tree increases, decreases, or keeps its parent's advantage with probabilities proportional to these ratios, respectively. |
advantage.increase.step , advantage.decrease.step |
The amount by which the advantage of a cell is increased or decreased relative to its parent. |
mv.rate |
Rate of missing values added to the resulting sequences. |
fp.rate , fn.rate |
Rates of false positives (0 -> 1) and false negatives (1 -> 0) in the sequences. |
seed |
The seed for randomization. |
Details
The simulation of evolution starts with a single cell. Then, for evolution.step steps, a cell is selected for duplication on each step, and a new cell is added to the evolutionary tree as its child. Each node in the evolutionary tree is assigned an advantage representing its relative advantage in replication and in being sampled. The advantage of a node is calculated by increasing (decreasing) its parent's advantage by advantage.increase.step (advantage.decrease.step) with probability proportional to advantage.increase.ratio (advantage.decrease.ratio). With probability proportional to advantage.keep.ratio, the advantage of a node is equal to its parent's advantage.
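The advantage update described above can be sketched in base R. This is only an illustration of the description, not the package's implementation: `child_advantage` is a hypothetical helper, and the sketch assumes that the steps are added to or subtracted from the parent's advantage.

```r
# Hypothetical helper, not part of RScelestial: draws a child's advantage
# from its parent's advantage. Assumes additive steps.
child_advantage <- function(parent.advantage,
                            increase.ratio = 1,
                            decrease.ratio = 10,
                            keep.ratio = 100,
                            increase.step = 0.01,
                            decrease.step = 0.01) {
  ratios <- c(increase = increase.ratio,
              decrease = decrease.ratio,
              keep = keep.ratio)
  # sample() treats `prob` as weights, so the ratios need not sum to 1.
  action <- sample(names(ratios), size = 1, prob = ratios)
  switch(action,
         increase = parent.advantage + increase.step,
         decrease = parent.advantage - decrease.step,
         keep = parent.advantage)
}

# Probabilities implied by the default ratios 1 : 10 : 100
probs <- c(increase = 1, decrease = 10, keep = 100) / (1 + 10 + 100)
```

Under the default ratios, a child would keep its parent's advantage in roughly 90% of cases, decrease it in about 9%, and increase it in about 1%.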
The sequence of each node is built from its parent's sequence by adding some mutations. Mutations are added for each locus independently with rate mutation.rate.
In the sampling phase, sample cells are selected from the evolutionary tree nodes. The result of the sequencing process for a cell is determined by the sequence of its node in the evolutionary tree, with the addition of some random errors. The errors result from applying false positives with rate fp.rate, applying false negatives with rate fn.rate, and adding missing values with rate mv.rate.
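The error model of the sampling phase can be illustrated with a short base-R sketch. `add_noise` is a hypothetical helper, not a function of the package. The order in which the errors are applied is an assumption, and so is the missing-value code 3, which is only inferred from the example output on this page (3 appears in the observed sequence but never in the true sequence).

```r
# Hypothetical helper, not part of RScelestial: adds sequencing errors to a
# 0/1 vector. The order of the steps and the missing-value code are assumptions.
add_noise <- function(true.seq, fp.rate = 0.2, fn.rate = 0.1, mv.rate = 0.5,
                      missing.code = 3) {
  n <- length(true.seq)
  observed <- true.seq
  # False positives: a true 0 is read as 1 with probability fp.rate.
  flip.fp <- true.seq == 0 & runif(n) < fp.rate
  # False negatives: a true 1 is read as 0 with probability fn.rate.
  flip.fn <- true.seq == 1 & runif(n) < fn.rate
  observed[flip.fp] <- 1
  observed[flip.fn] <- 0
  # Missing values: any entry is masked with probability mv.rate.
  observed[runif(n) < mv.rate] <- missing.code
  observed
}

set.seed(7)
add_noise(c(0, 1, 0, 1, 1, 0, 0, 1))
```

With all three rates set to 0 the helper returns its input unchanged, which is a quick way to check that the noise is the only source of difference between the true and observed sequences in the sketch.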
Value
The function returns a list. The list consists of
- sequence: A data frame representing result of sequencing. The data frame has a row for each locus and a column for each sample.
- true.sequence: The actual sequence for the sample before adding errors and missing values.
- true.clone: A list that stores index of sampled cells for each node in the evolutionary tree.
- true.tree: The evolutionary tree that the samples are sampled from. It is a data frame with src, dest, and len columns representing source, destination and weight of edges of the tree, respectively.
Examples
## generating a data set with 10 samples and 5 loci through simulation of
## 20-step evolution.
synthesis(10, 5, 20, seed=7)
## The result is
# $sequence
# C1 C2 C3 C4 C5
# L1 1 1 1 1 1
# L2 3 1 3 3 0
# L3 3 1 3 3 1
# L4 3 0 1 0 0
# L5 1 3 0 3 3
# L6 3 1 3 1 0
# L7 3 3 1 0 3
# L8 3 1 1 3 3
# L9 3 3 1 3 1
# L10 0 3 0 3 0
#
# $true.sequence
# C1 C2 C3 C4 C5
# L1 0 1 1 1 1
# L2 0 1 0 0 1
# L3 0 1 0 0 1
# L4 0 1 1 1 1
# L5 1 1 0 1 0
# L6 0 1 0 1 0
# L7 0 1 0 0 1
# L8 0 1 1 1 1
# L9 0 1 1 1 1
# L10 0 0 0 0 0
#
# $true.clone
# $true.clone[[1]]
# [1] 4
#
# $true.clone[[2]]
# [1] 1
#
# $true.clone[[3]]
# [1] 6
#
# $true.clone[[4]]
# [1] 10
#
# $true.clone[[5]]
# [1] 2
#
# $true.clone[[6]]
# [1] 3
#
# $true.clone[[7]]
# [1] 8 9
#
# $true.clone[[8]]
# [1] 7
#
# $true.clone[[9]]
# [1] 5
#
#
# $true.tree
# src dest len
# 1 1 5 3
# 2 5 7 1
# 3 5 10 2
# 4 1 11 3
# 5 1 12 2
# 6 1 13 3
# 7 7 14 2
# 8 12 19 1
# 9 10 20 1
#
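As a sketch of how the true.tree edge list could be inspected, the following base-R snippet rebuilds the data frame printed above by hand (so it runs without the package) and derives the root, the leaves, and the children of each node. The variable names `root`, `leaves`, and `children` are illustrative only.

```r
# Edge list copied from the $true.tree output shown above.
true.tree <- data.frame(
  src  = c(1, 5, 5, 1, 1, 1, 7, 12, 10),
  dest = c(5, 7, 10, 11, 12, 13, 14, 19, 20),
  len  = c(3, 1, 2, 3, 2, 3, 2, 1, 1)
)

# The root is the only node that never appears as a destination.
root <- setdiff(true.tree$src, true.tree$dest)    # 1

# Leaves are nodes that never appear as a source.
leaves <- setdiff(true.tree$dest, true.tree$src)  # 11 13 14 19 20

# Children of each internal node, keyed by the source node.
children <- split(true.tree$dest, true.tree$src)
children[["1"]]                                   # 5 11 12 13

# Total weight of all edges in the tree.
sum(true.tree$len)                                # 18
```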