distance_thin {SNPfiltR} | R Documentation |
Filter a vcf file based on distance between SNPs on a given scaffold
Description
This function takes a vcfR object as input and returns a vcfR object filtered to retain only SNPs that are more than a specified distance apart on each scaffold. On each scaffold, the function automatically retains the first SNP and then keeps the next SNP that is more than the specified distance away from the most recently retained SNP, repeating until it reaches the end of the scaffold/chromosome. The function scales well with an increasing number of SNPs but poorly with an increasing number of scaffolds/chromosomes. For this reason, it includes a built-in progress bar for monitoring potentially long-running executions with many scaffolds. This type of filtering is often used to reduce linkage among input SNPs, especially when preparing input for programs such as STRUCTURE, which require unlinked SNPs.
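The thinning rule described above can be illustrated on a plain vector of positions. The following is a minimal sketch, not the package's implementation: `thin_positions` is a hypothetical helper, the positions are made-up values for a single scaffold, and the strict "greater than" comparison is an assumption taken from the wording of the description.

```r
# Hypothetical helper (not part of SNPfiltR): greedy distance thinning of the
# SNP positions (in base pairs) found on a single scaffold.
thin_positions <- function(pos, min.distance) {
  pos <- sort(pos)                      # work through positions in order
  if (length(pos) == 0) return(pos)     # nothing to thin
  keep <- logical(length(pos))          # all FALSE to begin with
  keep[1] <- TRUE                       # the first SNP is always retained
  prev <- pos[1]                        # most recently retained position
  for (i in seq_along(pos)[-1]) {
    # Assumption: a SNP is kept only if it lies strictly more than
    # min.distance from the most recently retained SNP.
    if (pos[i] - prev > min.distance) {
      keep[i] <- TRUE
      prev <- pos[i]
    }
  }
  pos[keep]
}

# Made-up positions on one scaffold, thinned to more than 1000 bp apart
pos <- c(100, 600, 1200, 1250, 2500, 3400)
thinned <- thin_positions(pos, min.distance = 1000)
print(thinned)
```

With these values the sketch keeps positions 100, 1200 and 2500. Each candidate is measured against the last retained SNP rather than its immediate neighbour, which is why 3400 (900 bp from 2500) is dropped. Applying the same rule to every scaffold in turn requires one pass per scaffold, which would be consistent with the function slowing down as the number of scaffolds grows.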
Usage
distance_thin(vcfR, min.distance = NULL)
Arguments
vcfR | a vcfR object
min.distance | a numeric value representing the smallest distance (in base pairs) allowed between SNPs after distance thinning
Value
A vcfR object identical to the input, except that SNPs separated by less than the specified distance have been removed.
Examples
# Thin the example data set bundled with SNPfiltR using a 1000 bp minimum distance
distance_thin(vcfR = SNPfiltR::vcfR.example, min.distance = 1000)