scan_hh {rehh} | R Documentation |
Compute iHH, iES and inES over a whole chromosome
Description
Compute integrated EHH (iHH), integrated EHHS (iES) and integrated normalized EHHS (inES) for all markers of a chromosome (or linkage group).
Usage
scan_hh(
haplohh,
limhaplo = 2,
limhomohaplo = 2,
limehh = 0.05,
limehhs = 0.05,
phased = TRUE,
polarized = TRUE,
scalegap = NA,
maxgap = NA,
discard_integration_at_border = TRUE,
lower_ehh_y_bound = limehh,
lower_ehhs_y_bound = limehhs,
interpolate = TRUE,
threads = 1
)
Arguments
haplohh |
an object of class |
limhaplo |
if there are less than |
limhomohaplo |
if there are less than |
limehh |
limit at which the evaluation of EHH stops. |
limehhs |
limit at which the evaluation of EHHS stops. |
phased |
logical. If |
polarized |
logical. |
scalegap |
scale or cap gaps larger than the specified size to the specified size (default= |
maxgap |
maximum allowed gap in bp between two markers. If exceeded, further calculation of EHH(S) is stopped at the gap
(default= |
discard_integration_at_border |
logical. If |
lower_ehh_y_bound |
lower y boundary of the area to be integrated over (default: |
lower_ehhs_y_bound |
lower y boundary of the area to be integrated (default: |
interpolate |
logical. If |
threads |
number of threads to parallelize computation |
Details
Integrated EHH (iHH), integrated EHHS (iES) and integrated normalized EHHS (inES) are computed for all markers of the chromosome (or linkage group). This function is several times faster than a procedure that calls calc_ehh and calc_ehhs in turn for each marker. To perform a whole-genome scan, this function needs to be called for each chromosome and the results concatenated.
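The per-chromosome workflow described above can be sketched as follows. This is a minimal illustration, not the package's prescribed procedure: `haplohh_list` is a hypothetical name for a list holding one haplohh object per chromosome, and here it contains only the example object shipped with the package as a stand-in.

```r
library(rehh)

# Stand-in for real input: in practice this list would hold one haplohh
# object per chromosome (for instance, each created with data2haplohh()).
# The name 'haplohh_list' is hypothetical.
data(haplohh_cgu_bta12)
haplohh_list <- list(bta12 = haplohh_cgu_bta12)

# run scan_hh() once per chromosome
scan_list <- lapply(haplohh_list, scan_hh)

# concatenate the per-chromosome data frames into one genome-wide table;
# unname() keeps rbind() from prefixing row names with the list names
wgscan <- do.call(rbind, unname(scan_list))

dim(wgscan)
```

With several chromosomes in the list, `wgscan` holds one row per marker across all of them.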
Note that setting limehh or limehhs to zero is likely to reduce power, since even under neutrality a tiny fraction (<<0.05) of extremely long shared haplotypes is expected which, if fully accounted for, would obscure the signal at selected sites.
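Since both thresholds are ordinary arguments, their effect can be explored directly on the example data. The sketch below only shows how to pass them; the value 0.1 is an arbitrary choice for illustration, not a recommendation.

```r
library(rehh)
data(haplohh_cgu_bta12)

# default thresholds (limehh = limehhs = 0.05)
scan_default <- scan_hh(haplohh_cgu_bta12)

# higher thresholds: evaluation of EHH and EHHS stops at 0.1 instead of 0.05
scan_strict <- scan_hh(haplohh_cgu_bta12, limehh = 0.1, limehhs = 0.1)

# both scans return one row per marker, so they can be compared row by row
dim(scan_default)
dim(scan_strict)
```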
Value
The returned value is a data frame with markers in rows and the following columns:
chromosome name
position in the chromosome
sample frequency of the ancestral / major allele
sample frequency of the second-most frequent remaining allele
number of evaluated haplotypes at the focal marker for the ancestral / major allele
number of evaluated haplotypes at the focal marker for the second-most frequent remaining allele
iHH of the ancestral / major allele
iHH of the second-most frequent remaining allele
iES (used by Sabeti et al. 2007)
inES (used by Tang et al. 2007)
Note that in the case of unphased data the evaluation is restricted to haplotypes of homozygous individuals, which reduces the power to detect selection, particularly for iHS (for appropriate parameter settings see the main vignette and Klassmann and Gautier (2020)).
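As a hedged illustration of the unphased case, the example data can be scanned with phased = FALSE. The example data set is used here purely for demonstration, and the sketch assumes that the two haplotypes of each individual are stored consecutively in the haplohh object.

```r
library(rehh)
data(haplohh_cgu_bta12)

# default: haplotypes are treated as phased
scan_phased <- scan_hh(haplohh_cgu_bta12)

# same data treated as unphased (illustration only; assumes the two
# haplotypes of each individual are stored consecutively)
scan_unphased <- scan_hh(haplohh_cgu_bta12, phased = FALSE)

# both results have one row per marker
dim(scan_phased)
dim(scan_unphased)
```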
References
Gautier, M. and Naves, M. (2011). Footprints of selection in the ancestral admixture of a New World Creole cattle breed. Molecular Ecology, 20, 3128-3143.
Klassmann, A. and Gautier, M. (2020). Detecting selection using Extended Haplotype Homozygosity-based statistics on unphased or unpolarized data (preprint). https://doi.org/10.22541/au.160405572.29972398/v1
Sabeti, P.C. et al. (2002). Detecting recent positive selection in the human genome from haplotype structure. Nature, 419, 832-837.
Sabeti, P.C. et al. (2007). Genome-wide detection and characterization of positive selection in human populations. Nature, 449, 913-918.
Tang, K., Thornton, K.R. and Stoneking, M. (2007). A New Approach for Using Genome Scans to Detect Recent Positive Selection in the Human Genome. PLoS Biology, 5, e171.
Voight, B.F., Kudaravalli, S., Wen, X. and Pritchard, J.K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4, e72.
See Also
data2haplohh, calc_ehh, calc_ehhs, ihh2ihs, ines2rsb, ies2xpehh
Examples
#example haplohh object (280 haplotypes, 1424 SNPs)
#see ?haplohh_cgu_bta12 for details
data(haplohh_cgu_bta12)
scan <- scan_hh(haplohh_cgu_bta12)