R: Identity-by-descent

snp_plinkIBDQC {bigsnpr}

R Documentation

Identity-by-descent

Description

Quality Control based on Identity-by-descent (IBD) computed by PLINK 1.9 using its method-of-moments.

Usage

snp_plinkIBDQC(
  plink.path,
  bedfile.in,
  bedfile.out = NULL,
  pi.hat = 0.08,
  ncores = 1,
  pruning.args = c(100, 0.2),
  do.blind.QC = TRUE,
  extra.options = "",
  verbose = TRUE
)

Arguments

`plink.path`	Path to the executable of PLINK 1.9.
`bedfile.in`	Path to the input bedfile.
`bedfile.out`	Path to the output bedfile. Default is created by appending `"_norel"` to `prefix.in` (`bedfile.in` without extension).
`pi.hat`	PI_HAT value threshold for individuals (first by pairs) to be excluded. Default is `0.08`.
`ncores`	Number of cores used. Default doesn't use parallelism. You may use nb_cores.
`pruning.args`	A vector of 2 pruning parameters, respectively the window size (in variant count) and the pairwise $r^2$ threshold (the step size is fixed to 1). Default is `c(100, 0.2)`.
`do.blind.QC`	Whether to do QC with `pi.hat` without visual inspection. Default is `TRUE`. If `FALSE`, return the `data.frame` of the corresponding ".genome" file without doing QC. One could use `ggplot2::qplot(Z0, Z1, data = mydf, col = RT)` for visual inspection.
`extra.options`	Other options to be passed to PLINK as a string (for the IBD part). More options can be found at https://www.cog-genomics.org/plink/1.9/ibd.
`verbose`	Whether to show PLINK log? Default is `TRUE`.

Value

The path of the new bedfile. If no sample is filter, no new bed/bim/fam files are created and then the path of the input bedfile is returned.

References

Chang, Christopher C, Carson C Chow, Laurent CAM Tellier, Shashaank Vattikuti, Shaun M Purcell, and James J Lee. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4 (1): 7. doi:10.1186/s13742-015-0047-8.

Examples

## Not run: 

bedfile <- system.file("extdata", "example.bed", package = "bigsnpr")
plink <- download_plink()

bedfile <- snp_plinkIBDQC(plink, bedfile,
                          bedfile.out = tempfile(fileext = ".bed"),
                          ncores = 2)

df_rel <- snp_plinkIBDQC(plink, bedfile, do.blind.QC = FALSE, ncores = 2)
str(df_rel)

library(ggplot2)
qplot(Z0, Z1, data = df_rel, col = RT)
qplot(y = PI_HAT, data = df_rel) +
  geom_hline(yintercept = 0.2, color = "blue", linetype = 2)
snp_plinkRmSamples(plink, bedfile,
                   bedfile.out = tempfile(fileext = ".bed"),
                   df.or.files = subset(df_rel, PI_HAT > 0.2))

## End(Not run)

[Package bigsnpr version 1.12.2 Index]