pr_total_number_of_distinct_alleles {numberofalleles}

R Documentation

Compute the probability distribution of the total number of distinct alleles in a DNA mixture

Description

Computes the probability distribution of the total number of distinct alleles in a DNA mixture, summed over loci. Contributors may be unrelated or related through a pedigree.

Usage

pr_total_number_of_distinct_alleles(
  contributors,
  freqs,
  pedigree,
  dropout_prs = rep(0, length(contributors)),
  fst = 0,
  loci = names(freqs)
)

Arguments

`contributors`	Character vector with unique names of contributors. Valid names are "U1", "U2", ... for unrelated contributors or the names of pedigree members for related contributors.
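`freqs`	Allele frequencies, as a named list with one numeric vector per locus (see read_allele_freqs).
`pedigree`	Optional ped object describing how related contributors are connected (for example, one created with pedtools::nuclearPed).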
`dropout_prs`	Numeric vector. Dropout probabilities per contributor. Defaults to zeroes.
`fst`	Numeric. Sub-population correction; defaults to 0 (no correction).
`loci`	Character vector of locus names (defaults to the names attribute of `freqs`).

Details

A DNA mixture of n contributors contains 2n independent alleles per locus if the contributors are unrelated, and fewer if they are related. This function computes the probability distribution of the total number of distinct alleles, summed over the loci given in loci. Mixture contributors may be related according to an optionally specified pedigree. Allele dropout can be set per contributor through dropout_prs, and a sub-population correction may be applied by setting fst > 0.

The case where all contributors are unrelated was discussed by Tvedebrink (2014) and is implemented in the DNAtools package. Kruijver & Curran (2022) extended this to include related contributors by exploiting the multiPersonIBD function in the ribd package.
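
For intuition only, the simplest case (unrelated contributors, no dropout, fst = 0) can be worked out by brute force in base R. The sketch below is not the algorithm used by this package or by DNAtools; the helper names distinct_alleles_one_locus and combine_loci are hypothetical. It enumerates every combination of the 2n allele draws at a locus, sums the probabilities by number of distinct alleles, and then convolves the per-locus distributions to obtain the distribution of the total.

```r
# Illustration only: brute-force enumeration, NOT the package's algorithm.
# Assumes unrelated contributors, no dropout and fst = 0.
distinct_alleles_one_locus <- function(f, n_contributors) {
  n_draws <- 2 * n_contributors
  # every ordered combination of allele indices for the 2n independent draws
  draws <- as.matrix(expand.grid(rep(list(seq_along(f)), n_draws)))
  # probability of each combination under independence
  pr <- apply(draws, 1, function(idx) prod(f[idx]))
  # number of distinct alleles in each combination
  k <- apply(draws, 1, function(idx) length(unique(idx)))
  # total probability for each possible number of distinct alleles
  vapply(split(pr, k), sum, numeric(1))
}

# Distribution of the sum of two independent per-locus counts (convolution)
combine_loci <- function(p1, p2) {
  totals <- outer(as.integer(names(p1)), as.integer(names(p2)), "+")
  joint <- outer(p1, p2)
  vapply(split(as.vector(joint), as.vector(totals)), sum, numeric(1))
}

freqs <- list(locus1 = c(0.1, 0.9),
              locus2 = c(0.25, 0.5, 0.25))

# one unrelated contributor, i.e. two allele draws per locus
by_locus <- lapply(freqs, distinct_alleles_one_locus, n_contributors = 1)
total <- Reduce(combine_loci, by_locus)
# total: P(2) = 0.3075, P(3) = 0.58, P(4) = 0.1125
```

Enumeration grows exponentially in the number of contributors and alleles, so this is only practical for toy inputs; it is meant to make the quantity being computed concrete.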

Value

An object of class pf. This is a list containing:

pf. A named numeric vector giving the probability distribution of the total number of distinct alleles. Each name is an integer number of alleles and each value is the corresponding probability.
by_locus. A list of probability distributions by locus.
noa. For convenience, an integer vector with the numbers of alleles corresponding to the probability distribution pf (the names attribute of pf as an integer vector).
min. For convenience, the minimum of noa.
max. For convenience, the maximum of noa.
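
As a rough sketch of how these components relate, the base R snippet below starts from a made-up pf vector (the probabilities are hypothetical, not output of this function) and derives the convenience fields from it, together with two typical summaries. The variable names min_noa, max_noa, expected_noa and pr_at_most_3 are illustrative only.

```r
# Hypothetical distribution: names are allele counts, values are probabilities
pf <- c("2" = 0.3075, "3" = 0.58, "4" = 0.1125)

# noa is the names attribute of pf as an integer vector
noa <- as.integer(names(pf))
min_noa <- min(noa)
max_noa <- max(noa)

# expected total number of distinct alleles
expected_noa <- sum(noa * pf)

# probability of observing at most 3 distinct alleles in total
pr_at_most_3 <- sum(pf[noa <= 3])
```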

References

M. Kruijver & J. Curran (2022). 'The number of alleles in DNA mixtures with related contributors', manuscript submitted.

T. Tvedebrink (2014). 'On the exact distribution of the number of alleles in DNA mixtures', International Journal of Legal Medicine, 128(3):427–437. doi: 10.1007/s00414-013-0951-3

Examples

# define a pedigree of siblings S1 and S2 (and their parents)
ped_sibs <- pedtools::nuclearPed(children = c("S1", "S2"))

# define allele frequencies
freqs <- list(locus1 = c(0.1, 0.9),
              locus2 = c(0.25, 0.5, 0.25))

# compute the distribution of the number of alleles for two siblings and one unrelated person
pr_total_number_of_distinct_alleles(contributors = c("S1","S2","U1"), freqs,
                                    pedigree = ped_sibs)

## GlobalFiler example (2 unrelated contributors)
freqs <- read_allele_freqs(system.file("extdata", "FBI_extended_Cauc.csv",
                                       package = "numberofalleles"))

gf_loci <- c("D3S1358", "vWA", "D16S539", "CSF1PO", "TPOX", "D8S1179",
             "D21S11",  "D18S51", "D2S441", "D19S433", "TH01", "FGA",
             "D22S1045", "D5S818", "D13S317", "D7S820", "SE33",
             "D10S1248", "D1S1656", "D12S391", "D2S1338")

p_gf <- pr_total_number_of_distinct_alleles(contributors = c("U1", "U2"),
                                            freqs = freqs, loci = gf_loci)

barplot(p_gf$pf)

[Package numberofalleles version 1.0.1 Index]