draw_all_admix {bnpsd} | R Documentation |
Simulate random allele frequencies and genotypes from the BN-PSD admixture model
Description
This function returns simulated ancestral, intermediate, and individual-specific allele frequencies and genotypes given the admixture structure, as determined by the admixture proportions and the vector or tree of intermediate subpopulation FST values.
The function is a wrapper around draw_p_anc(), draw_p_subpops()/draw_p_subpops_tree(), make_p_ind_admix(), and draw_genotypes_admix(), with additional features such as requiring polymorphic loci.
Importantly, by default fixed loci (those where all individuals are homozygous for the same allele) are re-drawn from the start, beginning with the ancestral allele frequencies. As a result, the output contains no fixed loci, and no biases are introduced by re-drawing genotypes conditional on any of the previous allele frequencies (ancestral, intermediate, or individual-specific).
Below, m_loci (also m) is the number of loci, n is the number of individuals, and k is the number of intermediate subpopulations.
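The sequence of draws described above can be sketched in base R. This is only an illustration of the model and of the re-draw rule for fixed loci, not the package's implementation: the helper `draw_once`, the uniform range used for ancestral frequencies, and the toy admixture matrix are all assumptions made for this example.

```r
# Base-R sketch of the BN-PSD draws (illustrative only, not the bnpsd code)
set.seed(1)
m_loci <- 100                      # number of loci
n <- 6                             # number of individuals
k <- 2                             # number of intermediate subpopulations
inbr_subpops <- c(0.1, 0.3)        # FST of each intermediate subpopulation

# hypothetical admixture proportions: n-by-k, rows sum to one
w <- seq(0, 1, length.out = n)
admix_proportions <- cbind(1 - w, w)

# hypothetical helper: draws one complete set of data for `m` loci
draw_once <- function(m) {
    # ancestral allele frequencies (this uniform range is an assumption)
    p_anc <- runif(m, 0.01, 0.5)
    # Balding-Nichols: Beta with mean p_anc and variance p_anc * (1 - p_anc) * FST
    p_subpops <- sapply(inbr_subpops, function(f) {
        nu <- (1 - f) / f
        rbeta(m, nu * p_anc, nu * (1 - p_anc))
    })
    p_subpops <- matrix(p_subpops, nrow = m)  # keep matrix shape when m = 1
    # individual-specific frequencies: admixture-weighted averages (m-by-n)
    p_ind <- p_subpops %*% t(admix_proportions)
    # genotypes: two independent allele draws per locus and individual
    X <- matrix(rbinom(m * n, 2, p_ind), nrow = m, ncol = n)
    list(X = X, p_anc = p_anc)
}

out <- draw_once(m_loci)
repeat {
    # a locus is fixed if every individual has the same homozygous genotype
    fixed <- rowSums(out$X) %in% c(0, 2 * n)
    if (!any(fixed)) break
    # re-draw fixed loci from the start, including new ancestral frequencies
    new <- draw_once(sum(fixed))
    out$X[fixed, ] <- new$X
    out$p_anc[fixed] <- new$p_anc
}
```

Re-drawing the ancestral frequency together with everything downstream is what keeps the retained loci from being conditioned on frequencies that had already produced a fixed locus.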
Usage
draw_all_admix(
admix_proportions,
inbr_subpops = NULL,
m_loci,
tree_subpops = NULL,
want_genotypes = TRUE,
want_p_ind = FALSE,
want_p_subpops = FALSE,
want_p_anc = TRUE,
verbose = FALSE,
require_polymorphic_loci = TRUE,
maf_min = 0,
beta = NA,
p_anc = NULL,
p_anc_distr = NULL
)
Arguments
admix_proportions |
The |
inbr_subpops |
The length- |
m_loci |
The number of loci to draw. |
tree_subpops |
The coancestry tree relating the |
want_genotypes |
If |
want_p_ind |
If |
want_p_subpops |
If |
want_p_anc |
If |
verbose |
If |
require_polymorphic_loci |
If |
maf_min |
The minimum minor allele frequency (default zero), to extend the working definition of "fixed" above to include rare variants.
This helps simulate a frequency-based locus ascertainment bias.
Loci with minor allele frequencies less than or equal to this value are treated as fixed (passed to |
beta |
Shape parameter for a symmetric Beta for ancestral allele frequencies |
p_anc |
If provided, it is used as the ancestral allele frequencies (instead of drawing random ones). Must either be a scalar or a length- |
p_anc_distr |
If provided, ancestral allele frequencies are drawn with replacement from this vector (which may have any length) or function, instead of from |
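To make the `maf_min` rule concrete, the following base-R sketch flags loci whose sample minor allele frequency is less than or equal to a threshold. The helper `is_fixed_locus` is hypothetical and written only for this illustration; the package may compute this differently (for example, in how it handles missing genotypes).

```r
# hypothetical helper: flags loci treated as "fixed" for a given maf_min
is_fixed_locus <- function(X, maf_min = 0) {
    # sample allele frequency per locus (genotypes count alleles, 0 to 2)
    p_hat <- rowMeans(X, na.rm = TRUE) / 2
    # minor allele frequency
    maf <- pmin(p_hat, 1 - p_hat)
    maf <= maf_min
}

# toy genotype matrix: 3 loci (rows) by 4 individuals (columns)
X <- rbind(
    c(0, 0, 0, 0),  # truly fixed: MAF = 0
    c(0, 1, 0, 0),  # rare variant: MAF = 0.125
    c(1, 2, 0, 1)   # common variant: MAF = 0.5
)

fixed_default <- is_fixed_locus(X)                 # TRUE FALSE FALSE
fixed_rare <- is_fixed_locus(X, maf_min = 0.125)   # TRUE  TRUE FALSE
```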
Details
As a precaution, the function stops if both the column names of admix_proportions and the names in inbr_subpops or tree_subpops exist and disagree, which might be because these two data are not aligned or there is some other inconsistency.
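A check of this kind might look like the sketch below for the inbr_subpops case. The function `check_names_agree` and its error message are hypothetical, written only to illustrate the rule that names are compared only when both sets exist; this is not the package's internal code.

```r
# hypothetical check, not the package's internal code
check_names_agree <- function(admix_proportions, inbr_subpops) {
    names_admix <- colnames(admix_proportions)
    names_inbr <- names(inbr_subpops)
    # only compare when both sets of names exist
    if (!is.null(names_admix) && !is.null(names_inbr) &&
        !identical(names_admix, names_inbr))
        stop("names of `admix_proportions` and `inbr_subpops` disagree")
    invisible(TRUE)
}

# toy inputs: 3 individuals, 2 subpopulations
admix_proportions <- matrix(
    c(1, 0.5, 0, 0, 0.5, 1),
    ncol = 2,
    dimnames = list(NULL, c("A", "B"))
)

# aligned names pass
ok <- check_names_agree(admix_proportions, c(A = 0.1, B = 0.3))
# unnamed values pass too, since there is nothing to compare
ok_unnamed <- check_names_agree(admix_proportions, c(0.1, 0.3))
# swapped names trigger the error
err <- tryCatch(
    check_names_agree(admix_proportions, c(B = 0.3, A = 0.1)),
    error = function(e) conditionMessage(e)
)
```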
Value
A named list with the following items (which may be missing depending on options):
- X: An m-by-n matrix of genotypes. Included if want_genotypes = TRUE.
- p_anc: A length-m vector of ancestral allele frequencies. Included if want_p_anc = TRUE.
- p_subpops: An m-by-k matrix of intermediate subpopulation allele frequencies. Included if want_p_subpops = TRUE.
- p_ind: An m-by-n matrix of individual-specific allele frequencies. Included if want_p_ind = TRUE.
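As a quick illustration of the return value, the sketch below requests all four items and inspects their dimensions. It assumes the bnpsd package is installed and uses only arguments shown in the Usage section; the specific sizes are arbitrary choices for this example.

```r
library(bnpsd)

m_loci <- 10     # number of loci
n_ind <- 5       # number of individuals
k_subpops <- 2   # number of intermediate subpopulations

admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# request every optional item in addition to the defaults
out <- draw_all_admix(
    admix_proportions,
    inbr_subpops = c(0.1, 0.3),
    m_loci = m_loci,
    want_p_subpops = TRUE,
    want_p_ind = TRUE
)

names(out)           # items present in the list
dim(out$X)           # m-by-n genotypes
length(out$p_anc)    # length-m ancestral allele frequencies
dim(out$p_subpops)   # m-by-k subpopulation allele frequencies
dim(out$p_ind)       # m-by-n individual-specific allele frequencies
```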
Examples
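# load the package (needed when running this example outside the package's own checks)
library(bnpsd)
# dimensions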
# number of loci
m_loci <- 10
# number of individuals
n_ind <- 5
# number of intermediate subpops
k_subpops <- 2
# define population structure
# FST values for k = 2 subpopulations
inbr_subpops <- c(0.1, 0.3)
# admixture proportions from 1D geography
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)
# draw all random allele freqs and genotypes
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci)
# return value is a list with these items:
# genotypes
X <- out$X
# ancestral AFs
p_anc <- out$p_anc
# # these are excluded by default, but would be included if ...
# # ... `want_p_subpops == TRUE`
# # intermediate subpopulation AFs
# p_subpops <- out$p_subpops
#
# # ... `want_p_ind == TRUE`
# # individual-specific AFs
# p_ind <- out$p_ind