ts_f2 {slendr} | R Documentation |
Calculate the f2, f3, f4, and f4-ratio statistics
Description
These functions present an R interface to the corresponding f-statistics methods in tskit.
Usage
ts_f2(
ts,
A,
B,
mode = c("site", "branch", "node"),
span_normalise = TRUE,
windows = NULL
)
ts_f3(
ts,
A,
B,
C,
mode = c("site", "branch", "node"),
span_normalise = TRUE,
windows = NULL
)
ts_f4(
ts,
W,
X,
Y,
Z,
mode = c("site", "branch", "node"),
span_normalise = TRUE,
windows = NULL
)
ts_f4ratio(
ts,
X,
A,
B,
C,
O,
mode = c("site", "branch"),
span_normalise = TRUE
)
Arguments
ts |
Tree sequence object of the class |
mode |
The mode for the calculation ("sites" or "branch") |
span_normalise |
Divide the result by the span of the window? Default TRUE, see the tskit documentation for more detail. |
windows |
Coordinates of breakpoints between windows. The first
coordinate (0) and the last coordinate (equal to |
W , X , Y , Z , A , B , C , O |
Character vectors of individual names (largely following the nomenclature of Patterson 2021, but see crucial differences between tskit and ADMIXTOOLS in Details) |
Details
Note that the order of populations f3 statistic implemented in tskit (https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.f3) is different from what you might expect from ADMIXTOOLS, as defined in Patterson 2012 (see https://academic.oup.com/genetics/article/192/3/1065/5935193 under heading "The three-population test and introduction of f-statistics", as well as ADMIXTOOLS documentation at https://github.com/DReichLab/AdmixTools/blob/master/README.3PopTest#L5). Specifically, the widely used notation introduced by Patterson assumes the population triplet as f3(C; A, B), with C being the "focal" sample (i.e., either the outgroup or a sample tested for admixture). In contrast, tskit implements f3(A; B, C), with the "focal sample" being A.
Although this is likely to confuse many ADMIXTOOLS users, slendr does not have
much choice in this, because its ts_*()
functions are designed to be
broadly compatible with raw tskit methods.
Value
Data frame with statistics calculated for the given sets of individuals
Examples
init_env()
# load an example model with an already simulated tree sequence
slendr_ts <- system.file("extdata/models/introgression_slim.trees", package = "slendr")
model <- read_model(path = system.file("extdata/models/introgression", package = "slendr"))
# load the tree-sequence object from disk and add mutations to it
ts <- ts_load(slendr_ts, model) %>% ts_mutate(mutation_rate = 1e-8, random_seed = 42)
# calculate f2 for two individuals in a previously loaded tree sequence
ts_f2(ts, A = "AFR_1", B = "EUR_1")
# calculate f2 for two sets of individuals
ts_f2(ts, A = c("AFR_1", "AFR_2"), B = c("EUR_1", "EUR_3"))
# calculate f3 for two individuals in a previously loaded tree sequence
ts_f3(ts, A = "EUR_1", B = "AFR_1", C = "NEA_1")
# calculate f3 for two sets of individuals
ts_f3(ts, A = c("AFR_1", "AFR_2", "EUR_1", "EUR_2"),
B = c("NEA_1", "NEA_2"),
C = "CH_1")
# calculate f4 for single individuals
ts_f4(ts, W = "EUR_1", X = "AFR_1", Y = "NEA_1", Z = "CH_1")
# calculate f4 for sets of individuals
ts_f4(ts, W = c("EUR_1", "EUR_2"),
X = c("AFR_1", "AFR_2"),
Y = "NEA_1",
Z = "CH_1")
# calculate f4-ratio for a given set of target individuals X
ts_f4ratio(ts, X = c("EUR_1", "EUR_2", "EUR_4", "EUR_5"),
A = "NEA_1", B = "NEA_2", C = "AFR_1", O = "CH_1")