recomb_admix_inds {simfam} | R Documentation |
Reduce haplotype ancestry data to population ancestry dosage matrices
Description
This function accepts haplotype data, such as the output from recomb_haplo_inds() with ret_anc = TRUE (required), and reduces it to a list of population ancestry dosage matrices.
In this context, "ancestors/ancestry" refer to haplotype blocks from specific ancestor individuals, whereas "population ancestry" groups these ancestors into populations (such as African, European, etc.).
Although the haplotype data separates individuals and chromosomes into lists (the way it is simulated), each output matrix concatenates data from all chromosomes into a single matrix, as in simpler simulations and real data, matching the format of recomb_geno_inds().
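The reduction described above can be illustrated with a minimal base R sketch. It does not use simfam or its internal data structures: the vectors `anc_pat` and `anc_mat` are a hypothetical toy layout holding one ancestor label per locus for each of an individual's two haplotypes, and only the `anc` and `pop` column names are taken from this page.

```r
# Toy per-locus ancestor labels for one individual's two haplotypes
# (hypothetical layout for illustration, not simfam's internal structure)
anc_pat <- c( 'father_pat', 'father_pat', 'father_mat' )
anc_mat <- c( 'mother_pat', 'mother_mat', 'mother_mat' )

# ancestor-to-population map, with the same two columns as `anc_map`
anc_map <- data.frame(
    anc = c( 'father_pat', 'father_mat', 'mother_pat', 'mother_mat' ),
    pop = c( 'African', 'European', 'African', 'African' ),
    stringsAsFactors = FALSE
)
# default population order, as in the Usage line
pops <- sort( unique( anc_map$pop ) )

# translate ancestor labels into population labels, locus by locus
pop_pat <- anc_map$pop[ match( anc_pat, anc_map$anc ) ]
pop_mat <- anc_map$pop[ match( anc_mat, anc_map$anc ) ]

# dosage of a population = number of haplotypes (0, 1, or 2) carrying it
dosage <- lapply( pops, function( pop ) ( pop_pat == pop ) + ( pop_mat == pop ) )
names( dosage ) <- pops
```

In this toy case the third locus is the only one where the paternal haplotype carries a European ancestor, so it is the only locus with a nonzero European dosage.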
Usage
recomb_admix_inds(haplos, anc_map, pops = sort(unique(anc_map$pop)))
Arguments
haplos |
A list of diploid individuals, each of which is a list with two haploid individuals named |
anc_map |
A data.frame or tibble with two columns: |
pops |
Optional order of populations in output, by default sorted alphabetically from |
Value
A named list of population ancestry dosage matrices, ordered as in pops. Each matrix counts, at every locus, how many of an individual's two haplotypes carry ancestry from that population (values in 0, 1, 2), with individuals along columns in the same order as the haplos list, and loci along rows in order of appearance, concatenating chromosomes in numerical order.
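The row layout described here (loci from all chromosomes stacked in chromosome order) can be sketched in base R. The per-chromosome matrices below are made-up toy values, and `X_by_chr` is a hypothetical name; how simfam performs the concatenation internally is not stated on this page.

```r
# hypothetical per-chromosome dosage matrices for one population
# (loci along rows, individuals along columns)
ids <- c( 'child', 'father' )
X_chr1 <- matrix( c( 2L, 1L, 0L, 2L ), nrow = 2, dimnames = list( NULL, ids ) )
X_chr2 <- matrix( c( 1L, 1L, 2L, 0L, 0L, 2L ), nrow = 3, dimnames = list( NULL, ids ) )

# list ordered by chromosome number
X_by_chr <- list( X_chr1, X_chr2 )

# stack loci from all chromosomes into a single matrix, keeping chromosome order
X <- do.call( rbind, X_by_chr )

# rows = total loci across chromosomes, columns = individuals
dim( X )
```

The individuals keep their column positions across chromosomes, so row-binding is enough to produce the combined matrix.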
See Also
recomb_fam() for drawing recombination (ancestor) blocks, defined in terms of genetic distance.
recomb_map_inds() for transforming genetic to basepair coordinates given a genetic map.
recomb_haplo_inds() for determining haplotypes of descendants given ancestral haplotypes (creates input to this function).
Examples
# Lengthy code creates individuals with recombination data to map
# The smallest pedigree, two parents and a child (minimal fam table).
library(tibble)
fam <- tibble(
    id = c('father', 'mother', 'child'),
    pat = c(NA, NA, 'father'),
    mat = c(NA, NA, 'mother')
)
# use latest human recombination map, but just first two chrs to keep this example fast
map <- recomb_map_hg38[ 1L:2L ]
# initialize parents with this other function
founders <- recomb_init_founders( c('father', 'mother'), map )
# draw recombination breaks for child
inds <- recomb_fam( founders, fam )
# now add base pair coordinates to recombination breaks
inds <- recomb_map_inds( inds, map )
# also need ancestral haplotypes
# these should be simulated carefully as needed, but for this example we make random data
haplo <- vector( 'list', length( map ) )
# names of ancestor haplotypes for this scenario
# (founders of fam$id but each with "_pat" and "_mat" suffixes)
anc_names <- c( 'father_pat', 'father_mat', 'mother_pat', 'mother_mat' )
n_ind <- length( anc_names )
# number of loci per chr, for toy test
m_loci <- 10L
for ( chr in seq_along( map ) ) {
    # draw random positions
    pos_chr <- sample.int( max( map[[ chr ]]$pos ), m_loci )
    # draw haplotypes
    X_chr <- matrix(
        rbinom( m_loci * n_ind, 1L, 0.5 ),
        nrow = m_loci,
        ncol = n_ind
    )
    # required column names!
    colnames( X_chr ) <- anc_names
    # add to structure, in a list
    haplo[[ chr ]] <- list( X = X_chr, pos = pos_chr )
}
# determine haplotypes and per-position ancestries of descendants given ancestral haplotypes
haplos <- recomb_haplo_inds( inds, haplo, ret_anc = TRUE )
# define individual to population ancestry map
# take four ancestral haplotypes from above, assign them population labels
anc_map <- tibble(
    anc = anc_names,
    pop = c('African', 'European', 'African', 'African')
)
# finally, run desired function!
# convert haplotypes structure to list of population ancestry dosage matrices
Xs <- recomb_admix_inds( haplos, anc_map )
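A result such as Xs can be sanity-checked against the properties listed under Value. The sketch below is a hypothetical helper, not part of simfam, and is demonstrated on a made-up toy list so that it runs without the package. The last check assumes every ancestor is assigned to one of the populations in pops, in which case the dosages across populations should total 2 at every locus and individual.

```r
# hypothetical helper (not part of simfam): basic checks on a list of dosage matrices
check_dosages <- function( Xs ) {
    # all matrices must share the same dimensions (loci x individuals)
    dims <- vapply( Xs, dim, integer( 2 ) )
    same_dims <- all( dims == dims[ , 1 ] )
    # every entry must be a dosage of 0, 1, or 2
    in_range <- all( vapply( Xs, function( X ) all( X %in% 0:2 ), logical( 1 ) ) )
    # assumption: each haplotype belongs to exactly one listed population,
    # so dosages across populations total 2 everywhere
    sums_to_two <- same_dims && all( Reduce( `+`, Xs ) == 2 )
    c( same_dims = same_dims, in_range = in_range, sums_to_two = sums_to_two )
}

# toy list with two populations, 3 loci and 2 individuals (made-up values)
African <- matrix( c( 2L, 2L, 1L, 1L, 2L, 2L ), nrow = 3 )
Xs_toy <- list( African = African, European = 2L - African )
check_dosages( Xs_toy )
```

The same helper could be applied to the Xs object created above, provided the assumption about the ancestor map holds.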