segregate {gscramble} | R Documentation |
Segregate segments down genomic simulation pedigrees
Description
Given a collection of genomic simulation pedigrees and requests
for how many simulations should be done (in the request
input),
as well as recombination rates, this simulates the segregation
of segments down through the pedigrees
Usage
segregate(request, RR, MM = NULL)
Arguments
request |
a tibble with list columns "gpp" and "reppop".
Each element of the gpp column is a tibble giving a genomic simulation
pedigree as documented as the input for |
RR |
the recombination rates in the format of the package data |
MM |
the marker meta data tibble (like M_meta). If this is NULL (the default) that
is fine. If not, then it uses the order of the markers in MM to define
the levels of a chrom_f column so that we can sort the rows of the output
correctly, with respect to markers in the Genotype data frame. This will
let us more efficiently subscript the markers out of the matrix. If MM is
not present, then the function will create |
Value
The output from this function is a tibble. Each row represents one segment of genetic
material amongst the sampled individuals from the genomic permutation pedigrees. The columns give
information about the provenance and destination of that segment as follows.
Each segment exists in one of the samples (samp_index
) from a sampled individual
with a ped_sample_id
on a given gpp
(the index giving the row of the request input tibble)
in a given index
within the individual. Further, it is on one of two gametes
(gamete_index
) that segregated into the individual, and it came from a certain founding
population (pop_origin
) that corresponds to the named groups in the genotype file (group_origin
).
And, of course, the segment occupies the space from start
to end
on a chromosome chrom
.
Finally, the index of the founder haplotype on the given gpp that this segement descended from is
given in rs_founder_haplotype
which is short for "rep-specific founder haplotype". This final
piece of information is crucial for segregating variation from the individuals in the Geno
file
onto these segments. Finally, the column sim_level_founder_haplo
assigns a unique index for each founder
haplotype. This is necessary because any simulation can involve multiple gpps and/or indexes of gpps,
and the founders in each of those must all be unique within a simulation. so that those haplotypes
can all, eventually, be accessed easily out of the genotype matrix.
Examples
# We construct an example here where we will request segregation
# down a GSP with two F1s and F1B backcrosses between two hypothetical
# populations, A and B.
gsp_f1f1b <- create_GSP("A", "B", F1 = TRUE, F1B = TRUE)
# We will imagine that in our marker data there are three groups
# labelled "grp1", "grp2", and "grp3", and we want to create the F1Bs with backcrossing
# only to grp3.
reppop <- tibble::tibble(
index = as.integer(c(1, 1, 2, 2)),
pop = c("A", "B", "A", "B"),
group = c("grp3", "grp1", "grp3", "grp2")
)
# combine those into a request
request <- tibble::tibble(
gpp = list(gsp_f1f1b),
reppop = list(reppop)
)
result1 <- segregate(request, RecRates)
# here we pass it some markers, too
result2 <- segregate(request, RecRates, M_meta)
result1
result2