gl.report.hwe {dartR}R Documentation

Reports departure from Hardy-Weinberg proportions

Description

Calculates the probabilities of agreement with H-W proportions based on observed frequencies of reference homozygotes, heterozygotes and alternate homozygotes.

Usage

gl.report.hwe(
  x,
  subset = "each",
  method_sig = "Exact",
  multi_comp = FALSE,
  multi_comp_method = "BY",
  alpha_val = 0.05,
  pvalue_type = "midp",
  cc_val = 0.5,
  sig_only = TRUE,
  min_sample_size = 5,
  plot.out = TRUE,
  plot_colors = two_colors_contrast,
  max_plots = 4,
  save2tmp = FALSE,
  verbose = NULL
)

Arguments

x

Name of the genlight object containing the SNP data [required].

subset

Way to group individuals to perform H-W tests. Either a vector with population names, 'each', 'all' (see details) [default 'each'].

method_sig

Method for determining statistical significance: 'ChiSquare' or 'Exact' [default 'Exact'].

multi_comp

Whether to adjust p-values for multiple comparisons [default FALSE].

multi_comp_method

Method to adjust p-values for multiple comparisons: 'holm', 'hochberg', 'hommel', 'bonferroni', 'BH', 'BY', 'fdr' (see details) [default 'fdr'].

alpha_val

Level of significance for testing [default 0.05].

pvalue_type

Type of p-value to be used in the Exact method. Either 'dost','selome','midp' (see details) [default 'midp'].

cc_val

The continuity correction applied to the ChiSquare test [default 0.5].

sig_only

Whether the returned table should include loci with a significant departure from Hardy-Weinberg proportions [default TRUE].

min_sample_size

Minimum number of individuals per population in which perform H-W tests [default 5].

plot.out

If TRUE, will produce Ternary Plot(s) [default TRUE].

plot_colors

Vector with two color names for the significant and not-significant loci [default two_colors_contrast].

max_plots

Maximum number of plots to print per page [default 4].

save2tmp

If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log ; 3, progress and results summary; 5, full report [default NULL, unless specified using gl.set.verbosity].

Details

There are several factors that can cause deviations from Hardy-Weinberg proportions including: mutation, finite population size, selection, population structure, age structure, assortative mating, sex linkage, nonrandom sampling and genotyping errors. Therefore, testing for Hardy-Weinberg proportions should be a process that involves a careful evaluation of the results, a good place to start is Waples (2015).

Note that tests for H-W proportions are only valid if there is no population substructure (assuming random mating) and have sufficient power only when there is sufficient sample size (n individuals > 15).

Populations can be defined in three ways:

Two different statistical methods to test for deviations from Hardy Weinberg proportions:

Correction for multiple tests can be applied using the following methods based on the function p.adjust:

The first four methods are designed to give strong control of the family-wise error rate. The last two methods control the false discovery rate (FDR), the expected proportion of false discoveries among the rejected hypotheses. The false discovery rate is a less stringent condition than the family-wise error rate, so these methods are more powerful than the others, especially when number of tests is large. The number of tests on which the adjustment for multiple comparisons is the number of populations times the number of loci.

Ternary plots

Ternary plots can be used to visualise patterns of H-W proportions (plot.out = TRUE). P-values and the statistical (non)significance of a large number of bi-allelic markers can be inferred from their position in a ternary plot. See Graffelman & Morales-Camarena (2008) for further details. Ternary plots are based on the function HWTernaryPlot from the package HardyWeinberg. Each vertex of the Ternary plot represents one of the three possible genotypes for SNP data: homozygous for the reference allele (AA), heterozygous (AB) and homozygous for the alternative allele (BB). Loci deviating significantly from Hardy-Weinberg proportions after correction for multiple tests are shown in pink. The blue parabola represents Hardy-Weinberg equilibrium, and the area between green lines represents the acceptance region.

For these plots to work it is necessary to install the package ggtern.

Value

A dataframe containing loci, counts of reference SNP homozygotes, heterozygotes and alternate SNP homozygotes; probability of departure from H-W proportions, per locus significance with and without correction for multiple comparisons and the number of population where the same locus is significantly out of HWE.

Author(s)

Custodian: Luis Mijangos – Post to https://groups.google.com/d/forum/dartr

References

See Also

gl.filter.hwe

Other report functions: gl.report.bases(), gl.report.callrate(), gl.report.diversity(), gl.report.hamming(), gl.report.heterozygosity(), gl.report.ld.map(), gl.report.locmetric(), gl.report.maf(), gl.report.monomorphs(), gl.report.overshoot(), gl.report.parent.offspring(), gl.report.pa(), gl.report.rdepth(), gl.report.reproducibility(), gl.report.secondaries(), gl.report.sexlinked(), gl.report.taglength()


[Package dartR version 2.9.7 Index]