R: Cluster SNPs with HDBSCAN and identify haplotypes

run_hdbscan_haplotyping {crosshap}

R Documentation

Cluster SNPs with HDBSCAN and identify haplotypes

Description

run_hdbscan_haplotyping() performs HDBSCAN clustering of SNPs in a region of interest to identify marker groups. Individuals are then classified into haplotype combinations based on the combination of marker group alleles they share. Returns a comprehensive haplotyping object (HapObject), which can be visualized with reference to phenotype and metadata using crosshap_viz(). When calling crosshap_viz() on this object, set epsilon to 1 as a dummy value.
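The classification step described above can be illustrated with a small base R sketch. This is not the package's implementation: the toy table `mg_alleles`, the variable names, and the convention of labelling retained combinations A, B, ... by decreasing frequency are all assumptions made for illustration.

```r
# Toy data (hypothetical): one row per individual, one column per marker
# group (MG), each cell holding that individual's allele state (0 or 1).
mg_alleles <- data.frame(
  MG1 = c(0, 0, 0, 1, 1, 1, 1, 0),
  MG2 = c(0, 0, 0, 1, 1, 1, 0, 1),
  row.names = paste0("ind", 1:8)
)

# A haplotype combination is the string of marker group alleles carried
# by an individual, e.g. "0_0" or "1_1".
combo <- do.call(paste, c(mg_alleles, sep = "_"))

# Keep only combinations shared by at least minHap individuals.
minHap <- 3
combo_counts <- table(combo)
kept <- names(combo_counts)[combo_counts >= minHap]

# Label retained combinations A, B, ... by decreasing frequency
# (assumed convention); individuals with rarer combinations stay NA.
kept <- kept[order(as.integer(combo_counts[kept]), decreasing = TRUE)]
hap_label <- setNames(LETTERS[seq_along(kept)], kept)[combo]
names(hap_label) <- rownames(mg_alleles)
hap_label
```

In this toy case, two combinations reach the minHap threshold and the two individuals carrying rarer combinations are left unassigned.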

Usage

run_hdbscan_haplotyping(
  vcf,
  LD,
  pheno,
  MGmin,
  minHap = 5,
  hetmiss_as = "allele",
  metadata = NULL,
  keep_outliers = FALSE
)

Arguments

`vcf`	Input VCF for region of interest.
`LD`	Pairwise correlation matrix of SNPs in region (e.g. from PLINK).
`pheno`	Input numeric phenotype data for each individual.
`MGmin`	Minimum number of SNPs in a marker group; passed as the MinPts parameter for HDBSCAN.
`minHap`	Minimum number of individuals in a haplotype combination.
`hetmiss_as`	If hetmiss_as = "allele", heterozygous-missing SNPs './N' are recoded as 'N/N'. If hetmiss_as = "miss", the site is recoded as missing.
`metadata`	Metadata input (optional).
`keep_outliers`	When FALSE, marker group smoothing is performed to remove outliers.
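The two hetmiss_as settings can be sketched in base R as follows. `recode_hetmiss()` is a hypothetical helper written for this illustration; it is not part of crosshap and does not reflect how the package stores genotypes internally.

```r
# Hypothetical helper, not part of crosshap: illustrates the hetmiss_as rule
# on VCF-style genotype strings.
recode_hetmiss <- function(gt, hetmiss_as = c("allele", "miss")) {
  hetmiss_as <- match.arg(hetmiss_as)
  # Split each genotype, e.g. "./1" or "0|.", into its two alleles.
  alleles <- strsplit(gt, "[/|]")
  vapply(alleles, function(a) {
    if (sum(a == ".") != 1) {
      # Fully called or fully missing genotypes are passed through
      # (written with "/" as the separator).
      return(paste(a, collapse = "/"))
    }
    if (hetmiss_as == "allele") {
      # './N' becomes 'N/N': the called allele is duplicated.
      called <- a[a != "."]
      paste(called, called, sep = "/")
    } else {
      # The whole site is treated as missing.
      "./."
    }
  }, character(1))
}

gt <- c("0/0", "./1", "0/.", "./.", "0/1")
recode_hetmiss(gt, "allele")
recode_hetmiss(gt, "miss")
```

With "allele", a partially called genotype still contributes its observed allele to clustering; with "miss", it is discarded along with fully missing calls.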

Value

A comprehensive haplotyping S3 object (HapObject), needed for clustree_viz() and crosshap_viz().
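A call might look like the sketch below. It assumes that crosshap is installed and that it ships example objects named `vcf`, `LD`, `pheno` and `metadata`; if that does not hold, substitute your own inputs. The MGmin and minHap values are illustrative only, not recommendations.

```r
# Sketch only: runs when crosshap is available and skips silently otherwise.
if (requireNamespace("crosshap", quietly = TRUE)) {
  hap <- crosshap::run_hdbscan_haplotyping(
    vcf = crosshap::vcf,            # assumed bundled example VCF
    LD = crosshap::LD,              # assumed bundled pairwise correlation matrix
    pheno = crosshap::pheno,        # assumed bundled phenotype table
    metadata = crosshap::metadata,  # assumed bundled metadata (optional argument)
    MGmin = 30,                     # illustrative minimum SNPs per marker group
    minHap = 9,                     # illustrative minimum individuals per haplotype
    hetmiss_as = "allele",
    keep_outliers = FALSE
  )

  # epsilon is a dummy value for HDBSCAN results, as noted in the Description.
  crosshap::crosshap_viz(HapObject = hap, epsilon = 1)
}
```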

[Package crosshap version 1.4.0 Index]