sim_pedigree {simfam} | R Documentation |
Construct a random pedigree
Description
Simulates a single pedigree from the number of individuals per generation and some other optional parameters. Sex is drawn randomly per individual, pairings are strictly between opposite-sex individuals, and close relatives are never paired. Subject to those constraints, the closest individuals (on an underlying 1D geography given by their index) are paired, in a random order. Pairs are then reordered by the average of their indexes, which is where their children are placed (this determines the children's indexes in the 1D geography). The procedure may leave some individuals unpaired in the next generation, and family sizes vary randomly (with a fixed minimum family size) to achieve the desired population size in each generation.
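The reordering step described above can be pictured with a small base R sketch. It is only a toy illustration of sorting pairs by their average index, not the internal code of sim_pedigree; the data frame pairs and its columns parent1, parent2, and position are hypothetical names chosen for this example.

```r
# Toy illustration only: hypothetical pairs of parent indexes in a 1D geography.
# This is not the internal code of sim_pedigree.
pairs <- data.frame(
  parent1 = c(7, 1, 4),
  parent2 = c(8, 2, 6)
)

# average index of each pair, i.e. where the pair sits in the 1D geography
pairs$position <- (pairs$parent1 + pairs$parent2) / 2

# reorder pairs by that average, so children of nearby pairs get nearby indexes
pairs <- pairs[order(pairs$position), ]
pairs
```

Children would then be assigned indexes following this order, which keeps relatives close together in the 1D geography of the next generation.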
Usage
sim_pedigree(
n,
G = length(n),
sex = draw_sex(n[1]),
kinship_local = diag(n[1])/2,
cutoff = 1/4^3,
children_min = 1L,
full = FALSE
)
Arguments
n |
The number of individuals per generation.
If scalar, the number of generations |
G |
The number of generations (optional).
Note |
sex |
The numeric sex values for the founders (1L for male, 2L for female).
By default they are drawn randomly using |
kinship_local |
The local kinship matrix of the founder population. The default value is half the identity matrix, which corresponds to locally unrelated and locally outbred founders. This "local" kinship is the basis for all kinship calculations used to decide on close relative avoidance. The goal is to make a decision to not pair close relatives based on the pedigree only (and not based on population structure, which otherwise increases all kinship values), so the default value is appropriate. |
cutoff |
Local kinship values strictly less than |
children_min |
The minimum number of children per family.
Must be 0 or larger, but not exceed the average number of children per family in each generation (varies depending on how many individuals were left unpaired, but this upper limit is approximately |
full |
If |
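As a hedged sketch of how the optional arguments fit together, the call below spells out several defaults from the Usage section explicitly and sets full = TRUE. The generation sizes are illustrative only. Note that the default cutoff, 1/4^3 = 1/64, equals the kinship coefficient of second cousins in an otherwise outbred pedigree.

```r
library(simfam)

# three generations with illustrative sizes
n <- c(15, 20, 25)

# draw founder sexes explicitly (this mirrors the default in Usage)
sex <- draw_sex(n[1])

# the default cutoff, written out: 1/4^3 = 1/64
cutoff <- 1 / 4^3

# request local kinship for every generation instead of only the last one
data <- sim_pedigree(
  n,
  sex = sex,
  cutoff = cutoff,
  children_min = 1L,
  full = TRUE
)

# with full = TRUE, kinship_local is a list of matrices, one per generation
length(data$kinship_local)
```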
Value
A list with these named elements:
- fam: the pedigree, a tibble in plink FAM format. Following the column naming convention of the related genio package, it contains columns:
  - fam: Family ID, trivial "fam1" for all individuals.
  - id: Individual ID, in this case a code of format (in regular expression) "(\d+)-(\d+)" where the first integer is the generation number and the second integer is the index number (1 to n[g] for generation g).
  - pat: Paternal ID. Matches an id except for founders, which have fathers set to NA.
  - mat: Maternal ID. Matches an id except for founders, which have mothers set to NA.
  - sex: integers 1L (male) or 2L (female) which were drawn randomly; no other values occur in these outputs.
  - pheno: Phenotype, here all 0 (missing value).
- ids: a list of IDs for each generation (indexed in the list by generation).
- kinship_local: if full = FALSE, the local kinship matrix of the last generation, otherwise a list of local kinship matrices for every generation.
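The structure described above can be checked on a simulated pedigree. The following is a minimal sketch that relies only on the documented ID format and column meanings; the variable names generation and founders are introduced here for illustration.

```r
library(simfam)

# simulate a small pedigree with illustrative generation sizes
data <- sim_pedigree(c(15, 20, 25))
fam <- data$fam

# the generation number is the first integer in each "generation-index" ID
generation <- as.integer(sub("-.*$", "", fam$id))
table(generation)

# founders (generation 1) have missing parents; everyone else has both
founders <- generation == 1L
stopifnot(all(is.na(fam$pat[founders])), all(is.na(fam$mat[founders])))
stopifnot(!anyNA(fam$pat[!founders]), !anyNA(fam$mat[!founders]))

# fathers are male (1L) and mothers are female (2L)
stopifnot(all(fam$sex[match(fam$pat[!founders], fam$id)] == 1L))
stopifnot(all(fam$sex[match(fam$mat[!founders], fam$id)] == 2L))
```

Parsing the generation out of the ID is convenient for subsetting fam by generation, although data$ids already provides the same grouping as a list.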
See Also
Plink FAM format reference: https://www.cog-genomics.org/plink/1.9/formats#fam
Examples
# load the package that provides sim_pedigree
library(simfam)
# number of individuals for each generation
n <- c(15, 20, 25)
# create random pedigree with 3 generations, etc
data <- sim_pedigree( n )
# this is the FAM table defining the entire pedigree,
# which is the most important piece of information desired!
data$fam
# the IDs separated by generation
data$ids
# bonus: the local kinship matrix of the final generation
data$kinship_local