geno_last_gen {simfam} | R Documentation |
Draw random genotypes for last generation of a pedigree with known founder genotypes
Description
A wrapper around the more general geno_fam(), specialized to save memory when only the last generation is desired (geno_fam() returns genotypes for the entire pedigree in a single matrix).
This function assumes that generations are non-overlapping (met by the output of sim_pedigree()), in which case each generation g can be drawn from generation g-1 data only.
That way, only two consecutive generations need be in memory at any given time.
The partitioning of individuals into generations is given by the ids parameter (which also matches the output of sim_pedigree()).
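The generation-by-generation strategy described above can be illustrated with a minimal base R sketch. It is not simfam's implementation: draw_child() and last_gen_sketch() are hypothetical helpers, and the sketch assumes independent loci, Mendelian transmission (each parent passes on the counted allele with probability genotype/2), no missing genotypes, and that both parents of every non-founder are in the previous generation.

```r
# Hypothetical helper (not part of simfam): draw one child's genotypes at all loci.
# Each parent transmits one allele per locus; the counted allele is transmitted
# with probability x / 2, where x is the parent's genotype (0, 1, or 2).
draw_child <- function(x_pat, x_mat) {
  rbinom(length(x_pat), 1, x_pat / 2) + rbinom(length(x_mat), 1, x_mat / 2)
}

# Hypothetical sketch of the loop: only `X_prev` (generation g-1) and
# `X_g` (generation g) are held at any given time.
last_gen_sketch <- function(X, fam, ids) {
  X_prev <- X
  for (g in seq_along(ids)[-1]) {
    # rows of `fam` for generation g, in the order they appear in `fam`
    fam_g <- fam[fam$id %in% ids[[g]], ]
    X_g <- matrix(
      0L, nrow = nrow(X), ncol = nrow(fam_g),
      dimnames = list(rownames(X), as.character(fam_g$id))
    )
    for (i in seq_len(nrow(fam_g))) {
      X_g[, i] <- draw_child(
        X_prev[, as.character(fam_g$pat[i])],
        X_prev[, as.character(fam_g$mat[i])]
      )
    }
    # generation g-1 is no longer needed
    X_prev <- X_g
  }
  X_prev
}

# Three generations: four founders, two children, one grandchild
fam3 <- data.frame(
  id  = c('a', 'b', 'c', 'd', 'e', 'f', 'g'),
  pat = c(NA, NA, NA, NA, 'a', 'c', 'e'),
  mat = c(NA, NA, NA, NA, 'b', 'd', 'f')
)
ids3 <- list(c('a', 'b', 'c', 'd'), c('e', 'f'), 'g')
X_founders <- matrix(
  c(0, 1, 2, 1,  2, 2, 0, 1,  1, 0, 2, 2,  0, 0, 2, 1),
  nrow = 4, dimnames = list(paste0('rs', 1:4), c('a', 'b', 'c', 'd'))
)
X_last <- last_gen_sketch(X_founders, fam3, ids3)
X_last
```

The result has one column (individual 'g') and the same four loci as the founder matrix. The intermediate generation ('e' and 'f') is discarded as soon as the grandchild has been drawn.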
Usage
geno_last_gen(X, fam, ids, missing_vals = c("", 0))
Arguments
X |
The genotype matrix of the founders (loci along rows, individuals along columns).
This matrix must have column names that identify each founder (matching codes in |
fam |
The pedigree data.frame, in plink FAM format.
Only columns |
ids |
A list containing vectors of IDs for each generation.
All these IDs must be present in |
missing_vals |
The list of ID values treated as missing.
|
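To illustrate how missing_vals relates to founders, here is a minimal base R sketch that flags pedigree rows whose parent IDs are both missing. It assumes that NA counts as missing in addition to the values in missing_vals (the Examples section uses NA for the parents of founders); is_missing_id() is a hypothetical helper, not a simfam function, and the toy table is made up.

```r
# Hypothetical helper (not part of simfam): TRUE where an ID is treated as missing.
# Assumes NA is missing in addition to the values in `missing_vals`.
is_missing_id <- function(x, missing_vals = c("", 0)) {
  is.na(x) | as.character(x) %in% as.character(missing_vals)
}

# Toy pedigree mixing the three ways of marking a missing parent: NA, "" and 0
fam_toy <- data.frame(
  id  = c('father', 'mother', 'child', 'sib'),
  pat = c(NA, '', 'father', 'father'),
  mat = c(NA, 0, 'mother', 'mother')
)

# Founders are the individuals with both parents missing
founders <- fam_toy$id[is_missing_id(fam_toy$pat) & is_missing_id(fam_toy$mat)]
founders
# "father" "mother"
```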
Value
The random genotype matrix of the last generation (the intersection of ids[[ length(ids) ]] and fam$id).
The columns of this matrix are last-generation individuals in the order that they appear in fam$id.
The rows (loci) are the same as in the input X.
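The column order described above can be sketched in base R. This only illustrates the stated ordering rule on toy inputs (the fam_ids and ids_toy values are made up); it does not call simfam.

```r
# Toy inputs (made up for illustration)
fam_ids <- c('father', 'mother', 'sib', 'child')   # order of fam$id
ids_toy <- list(c('father', 'mother'), c('child', 'sib'))

# `[[` extracts the vector of last-generation IDs (`[` would return a list)
last_ids <- ids_toy[[length(ids_toy)]]

# Keep last-generation individuals, in the order they appear in fam$id
out_cols <- fam_ids[fam_ids %in% last_ids]
out_cols
# "sib" "child"
```

Note that the order follows fam_ids, not the order within the last element of ids_toy.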
See Also
Plink FAM format reference: https://www.cog-genomics.org/plink/1.9/formats#fam
Examples
# A small pedigree, two parents and two children.
# A minimal fam table with the three required columns.
# Note "mother" and "father" have missing parent IDs, while children do not
library(simfam)
library(tibble)
fam <- tibble(
  id = c('father', 'mother', 'child', 'sib'),
  pat = c(NA, NA, 'father', 'father'),
  mat = c(NA, NA, 'mother', 'mother')
)
# need an `ids` list separating the generations
ids <- list( c('father', 'mother'), c('child', 'sib') )
# genotypes of the parents at 4 loci
X <- cbind( c(1, 2, 0, 2), c(0, 2, 2, 1) )
# Name the parents with same codes as in `fam`
# (order can be different)
colnames( X ) <- c('mother', 'father')
# name loci too
rownames( X ) <- paste0( 'rs', 1:4 )
# Draw the genotype matrix of the children
X2 <- geno_last_gen( X, fam, ids )
X2