LD.thin {gaston}R Documentation

LD thinning

Description

Select SNPs in LD below a given threshold.

Usage

LD.thin(x, threshold, max.dist = 500e3, beg = 1, end = ncol(x),
        which.snps, dist.unit = c("bases", "indices", "cM"), 
        extract = TRUE, keep = c("left", "right", "random"))

Arguments

x

A bed.matrix

threshold

The maximum LD (measured by r^2) between SNPs

max.dist

The maximum distance for which the LD is computed

beg

The index of the first SNP to consider

end

The index of the last SNP to consider

which.snps

Logical vector, giving which SNPs are considerd. The default is to use all SNPs

dist.unit

Distance unit in max.dist

extract

A logical indicating whether the function return a bed.matrix (TRUE) or a logical vector indicating which SNPs are selected (FALSE)

keep

Which SNP is selected in a pair with LD above threshold

Details

The SNPs to keep are selected by a greedy algorithm. The LD is computed only for SNP pairs for which distance is inferior to max.dist, expressed in number of bases if dist.unit = "bases", in number of SNPs if dist.unit = "indices", or in centiMorgan if dist.unit = "cM".

The argument which.snps allows to consider only a subset of SNPs.

The algorithm tries to keep the largest possible number of SNPs: it is not appropriate to select tag-SNPs.

Value

If extract = TRUE, a bed.matrix extracted from x with SNPs in pairwise LD below the given threshold. If extract = FALSE, a logical vector of length end - beg + 1, where TRUE indicates that the corresponding SNPs is selected.

Author(s)

Hervé Perdry and Claire Dandine-Roulland

See Also

LD, set.dist

Examples

# Load data
data(TTN)
x <- as.bed.matrix(TTN.gen, TTN.fam, TTN.bim)

# Select SNPs in LD r^2 < 0.4, max.dist = 500 kb
y <- LD.thin(x, threshold = 0.4, max.dist = 500e3)
y

# Verifies that there is no SNP pair with LD r^2 > 0.4
# (note that the matrix ld.y has ones on the diagonal)
ld.y <- LD( y, lim = c(1, ncol(y)) )
sum( ld.y > 0.4 )  

[Package gaston version 1.6 Index]