run_hdbscan_haplotyping {crosshap}    R Documentation

Cluster SNPs with HDBSCAN and identify haplotypes

Description

run_hdbscan_haplotyping() performs HDBSCAN clustering of SNPs in a region of interest to identify marker groups. Individuals are then classified into haplotype combinations based on shared combinations of marker group alleles. Returns a comprehensive haplotyping object (HapObject), which can be visualized with reference to phenotype and metadata using crosshap_viz(). Because this function takes no epsilon argument, set epsilon to 1 as a dummy value when calling crosshap_viz().

Usage

run_hdbscan_haplotyping(
  vcf,
  LD,
  pheno,
  MGmin,
  minHap = 5,
  hetmiss_as = "allele",
  metadata = NULL,
  keep_outliers = FALSE
)

Arguments

vcf

Input VCF for region of interest.

LD

Pairwise correlation matrix of SNPs in the region (e.g. from PLINK).

pheno

Input numeric phenotype data for each individual.

MGmin

Minimum number of SNPs in a marker group; used as the MinPts parameter for HDBSCAN.

minHap

Minimum number of individuals in a haplotype combination.

hetmiss_as

If hetmiss_as = "allele", heterozygous-missing SNPs './N' are recoded as 'N/N'; if hetmiss_as = "miss", the site is recoded as missing.

metadata

Metadata input (optional).

keep_outliers

When FALSE, marker group smoothing is performed to remove outliers.
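
The hetmiss_as rule described above can be sketched in base R. This is only an illustration of the documented behaviour, not crosshap's internal code: recode_hetmiss() is a hypothetical helper, and the genotype strings are made-up examples in VCF notation.

```r
# Hypothetical helper (not part of crosshap): illustrates the hetmiss_as rule
# on a character vector of VCF-style genotype strings.
recode_hetmiss <- function(gt, hetmiss_as = c("allele", "miss")) {
  hetmiss_as <- match.arg(hetmiss_as)
  vapply(gt, function(g) {
    # Split the genotype into its two allele calls ("/" or "|" separator)
    a <- strsplit(g, "[/|]")[[1]]
    # Only genotypes with exactly one missing allele are recoded
    if (length(a) != 2 || sum(a == ".") != 1) {
      return(g)
    }
    if (hetmiss_as == "allele") {
      # './N' becomes 'N/N': the called allele is duplicated
      called <- a[a != "."]
      paste(called, called, sep = "/")
    } else {
      # "miss": the whole site is treated as missing
      "./."
    }
  }, character(1), USE.NAMES = FALSE)
}

gt <- c("0/0", "./1", "1/.", "./.", "0/1")
recode_hetmiss(gt, "allele")  # "0/0" "1/1" "1/1" "./." "0/1"
recode_hetmiss(gt, "miss")    # "0/0" "./." "./." "./." "0/1"
```

Fully called and fully missing genotypes pass through unchanged in both modes; only the half-missing calls differ.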

Value

A comprehensive haplotyping S3 object (HapObject), needed for clustree_viz() and crosshap_viz().
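
Examples

A minimal sketch of a typical call, using only the arguments listed under Usage. It assumes crosshap is installed and that vcf, LD, pheno and metadata objects for the region have already been loaded. haplotype_region() is a hypothetical wrapper introduced here for illustration, and MGmin = 30 is an arbitrary illustrative value, not a recommendation.

```r
# Hypothetical wrapper (not part of crosshap): bundles the documented
# arguments into one call. MGmin = 30 is an arbitrary illustrative default.
haplotype_region <- function(vcf, LD, pheno, metadata = NULL, MGmin = 30) {
  crosshap::run_hdbscan_haplotyping(
    vcf = vcf,
    LD = LD,
    pheno = pheno,
    MGmin = MGmin,             # minimum SNPs per marker group
    minHap = 5,                # minimum individuals per haplotype combination
    hetmiss_as = "allele",     # recode './N' as 'N/N'
    metadata = metadata,
    keep_outliers = FALSE      # smooth marker groups to remove outliers
  )
}

# Usage, assuming the input objects exist in the session:
# hap <- haplotype_region(vcf, LD, pheno, metadata)
# crosshap_viz(hap, epsilon = 1)  # epsilon = 1 is a dummy value here
```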


[Package crosshap version 1.4.0 Index]