mat_gen_dist {graph4lg}    R Documentation

Compute a pairwise matrix of genetic distances between populations

Description

The function computes a pairwise matrix of genetic distances between populations, using one of several available distance formulas.

Usage

mat_gen_dist(x, dist = "basic", null_val = FALSE)

Arguments

x

An object of class genind that contains the multilocus genotypes (format 'locus') of the individuals as well as their populations.

dist

A character string indicating the method used to compute the multilocus genetic distance between populations:

  • If dist = 'basic' (default), the multilocus genetic distance is computed using a Euclidean genetic distance formula (Excoffier et al., 1992).

  • If dist = 'weight', the multilocus genetic distance is computed as in Fortuna et al. (2009). It is a Euclidean genetic distance giving more weight to rare alleles.

  • If dist = 'PG', the multilocus genetic distance is computed as in the popgraph::popgraph function, following several steps of PCA and SVD (Dyer and Nason, 2004).

  • If dist = 'DPS', the genetic distance used is equal to 1 minus the proportion of shared alleles (Bowcock et al., 1994).

  • If dist = 'FST', the genetic distance used is the pairwise FST (Weir and Cockerham, 1984).

  • If dist = 'FST_lin', the genetic distance used is the linearised pairwise FST (Weir and Cockerham, 1984), with FST_lin = FST/(1-FST).

  • If dist = 'PCA', the genetic distance is computed following a PCA of the matrix of allelic frequencies by population. It is a Euclidean genetic distance between populations in the multidimensional space defined by all the independent principal components.

  • If dist = 'GST', the genetic distance used is G'ST (Hedrick, 2005). Available in graph4lg <= 1.6.0 only, because it relied on diveRsity.

  • If dist = 'D', the genetic distance used is Jost's D (Jost, 2008). Available in graph4lg <= 1.6.0 only, because it relied on diveRsity.
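The linearisation FST_lin = FST/(1-FST) can be illustrated with a minimal base R sketch. The matrix `fst` and its values are invented for illustration and are not output of mat_gen_dist.

```r
# Toy pairwise FST matrix for three populations (illustrative values only)
pops <- c("A", "B", "C")
fst <- matrix(c(0.00, 0.05, 0.20,
                0.05, 0.00, 0.10,
                0.20, 0.10, 0.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(pops, pops))

# Linearised FST, applied element-wise: FST_lin = FST / (1 - FST)
fst_lin <- fst / (1 - fst)
```

The transformation leaves zeros unchanged and stretches larger FST values more than smaller ones.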

null_val

(optional) Logical. Should negative and null FST, FST_lin, GST or D values be replaced by half the minimum positive value? This option makes it possible to compute Gabriel graphs from these "distances". Default is null_val = FALSE. This option only has an effect if dist is 'FST', 'FST_lin', 'GST' or 'D'.
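The replacement described for null_val can be sketched in base R as follows. The helper `replace_null_values` is hypothetical and is not part of graph4lg; leaving the diagonal at 0 is an assumption of this sketch, and the package's internal handling may differ.

```r
# Hypothetical helper (not a graph4lg function): replace negative and null
# off-diagonal values by half the smallest positive value of the matrix.
replace_null_values <- function(mat) {
  off_diag <- row(mat) != col(mat)            # assumption: diagonal stays at 0
  min_pos <- min(mat[off_diag & mat > 0])     # smallest positive distance
  mat[off_diag & mat <= 0] <- min_pos / 2
  mat
}

# Toy pairwise FST matrix with one negative estimate (illustrative values only)
pops <- c("A", "B", "C")
fst <- matrix(c( 0.00, -0.01, 0.04,
                -0.01,  0.00, 0.10,
                 0.04,  0.10, 0.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(pops, pops))

fst_fixed <- replace_null_values(fst)
```

Here the negative value between A and B becomes 0.02, half of the smallest positive value (0.04), so every pair of distinct populations ends up with a strictly positive "distance".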

Details

Negative values are converted into 0. Euclidean genetic distance d_{ij} between population i and j is computed as follows:

d_{ij}^{2} = \sum_{k=1}^{n} (x_{ki} - x_{kj})^{2}

where x_{ki} is the allelic frequency of allele k in population i and n is the total number of alleles. Note that when dist = 'weight', the formula becomes:

d_{ij}^{2} = \sum_{k=1}^{n} (1/(K*p_{k}))(x_{ki} - x_{kj})^{2}

where K is the number of alleles at the locus of allele k and p_{k} is the frequency of allele k in all populations. Note that when dist = 'PCA', n is the number of retained independent principal components and x_{ki} is the value taken by principal component k in population i.
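The two Euclidean formulas above can be sketched in base R from a table of allelic frequencies. This is an illustration only, not the package's implementation: the objects `freq` and `locus` are invented, and p_{k} is taken here as the unweighted mean frequency across populations, which is an assumption.

```r
# Toy allelic frequencies: 3 populations (rows) x 5 alleles (columns).
# Alleles a1-a2 belong to locus L1, alleles b1-b3 to locus L2.
freq <- rbind(
  pop1 = c(a1 = 0.5, a2 = 0.5, b1 = 0.2, b2 = 0.3, b3 = 0.5),
  pop2 = c(a1 = 0.8, a2 = 0.2, b1 = 0.1, b2 = 0.1, b3 = 0.8),
  pop3 = c(a1 = 0.1, a2 = 0.9, b1 = 0.6, b2 = 0.3, b3 = 0.1)
)
locus <- c("L1", "L1", "L2", "L2", "L2")

# Unweighted distance: d_ij = sqrt(sum_k (x_ki - x_kj)^2)
d_basic <- as.matrix(dist(freq, method = "euclidean"))

# Weighted distance: each squared difference is multiplied by 1 / (K * p_k)
K <- as.vector(table(locus)[locus])  # number of alleles at the locus of allele k
p <- colMeans(freq)                  # assumption: unweighted mean over populations
w <- 1 / (K * p)

# Scaling column k by sqrt(w_k) before dist() applies the weight w_k
# to the squared differences
d_weight <- as.matrix(dist(sweep(freq, 2, sqrt(w), `*`)))
```

Because the weight 1/(K*p_{k}) grows as p_{k} shrinks, differences at rare alleles contribute more to d_weight than to d_basic.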

Value

An object of class matrix containing the pairwise genetic distances between populations.

Author(s)

P. Savary

References

Bowcock AM, Ruiz-Linares A, Tomfohrde J, Minch E, Kidd JR, Cavalli-Sforza LL (1994). “High resolution of human evolutionary trees with polymorphic microsatellites.” Nature, 368(6470), 455–457.

Excoffier L, Smouse PE, Quattro JM (1992). “Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.” Genetics, 131(2), 479–491.

Dyer RJ, Nason JD (2004). “Population graphs: the graph theoretic shape of genetic structure.” Molecular Ecology, 13(7), 1713–1727.

Fortuna MA, Albaladejo RG, Fernández L, Aparicio A, Bascompte J (2009). “Networks of spatial genetic variation across species.” Proceedings of the National Academy of Sciences, 106(45), 19044–19049.

Weir BS, Cockerham CC (1984). “Estimating F-statistics for the analysis of population structure.” Evolution, 38(6), 1358–1370.

Hedrick PW (2005). “A standardized genetic differentiation measure.” Evolution, 59(8), 1633–1638.

Jost L (2008). “GST and its relatives do not measure differentiation.” Molecular Ecology, 17(18), 4015–4026.

Examples

data(data_ex_genind)
x <- data_ex_genind
D <- mat_gen_dist(x = x, dist = "basic")

[Package graph4lg version 1.8.0 Index]