processEigenstrat {BREADR} | R Documentation |
process Eigenstrat data
Description
A function that takes paths to an eigenstrat trio (ind, snp and geno files) and returns the pairwise mismatch rate for all pairs of individuals on a thinned set of SNPs. Options include choosing the thinning parameter, subsetting by population names, and filtering out SNPs for which deamination is possible.
Usage
processEigenstrat(
  indfile,
  genofile,
  snpfile,
  filter_length = NULL,
  pop_pattern = NULL,
  filter_deam = FALSE,
  outfile = NULL,
  chromosomes = NULL,
  verbose = TRUE
)
Arguments
indfile |
path to eigenstrat ind file. |
genofile |
path to eigenstrat geno file. |
snpfile |
path to eigenstrat snp file. |
filter_length |
the minimum distance between sites to be compared (to reduce the effect of LD). |
pop_pattern |
a character vector of population names used to filter the ind file if only some populations are to be compared. |
filter_deam |
TRUE/FALSE: whether C->T and G->A sites should be ignored. |
outfile |
(OPTIONAL) a path and filename to which the output of the function is saved as a TSV. If NULL, no backup is saved. If no outfile is given, a tibble is returned. |
chromosomes |
the chromosome to filter the data on. |
verbose |
controls printing of messages to the console. |
Value
out_tibble: A tibble containing four columns:
Examples
# Use the example files bundled with the package
indfile <- system.file("extdata", "example.ind.txt", package = "BREADR")
genofile <- system.file("extdata", "example.geno.txt", package = "BREADR")
snpfile <- system.file("extdata", "example.snp.txt", package = "BREADR")
processEigenstrat(
  indfile, genofile, snpfile,
  filter_length = 1e5,
  pop_pattern = NULL,
  filter_deam = FALSE
)
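
To make the idea of a pairwise mismatch rate on a thinned set of SNPs concrete, the following base-R sketch works through a toy data set. It is an illustration only and is not taken from the BREADR source: the helper `thin_snps()`, the toy genotypes, the positions, and the column names of `toy_pmr` are all assumptions made for this example. Genotypes follow the usual EIGENSTRAT coding (0 or 2 copies of the reference allele for pseudo-haploid calls, 9 for missing), and thinning is done greedily along a single chromosome.

```r
# Toy pseudo-haploid genotypes: rows are SNPs, columns are individuals.
# EIGENSTRAT coding: 0 or 2 copies of the reference allele, 9 = missing.
geno <- matrix(
  c(0, 0, 2,
    2, 2, 2,
    0, 9, 0,
    2, 0, 0,
    0, 0, 2,
    2, 2, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Ind1", "Ind2", "Ind3"))
)
# Hypothetical positions of the six SNPs on one chromosome (sorted).
pos <- c(100, 50000, 150000, 160000, 300000, 450000)

# Hypothetical helper: greedy thinning. A SNP is kept only if it lies at
# least `min_dist` bases after the previously kept SNP.
thin_snps <- function(pos, min_dist) {
  keep <- logical(length(pos))
  last <- -Inf
  for (i in seq_along(pos)) {
    if (pos[i] - last >= min_dist) {
      keep[i] <- TRUE
      last <- pos[i]
    }
  }
  keep
}

keep <- thin_snps(pos, min_dist = 1e5)
thinned <- geno[keep, , drop = FALSE]
thinned[thinned == 9] <- NA  # treat missing calls as NA

# Compare every pair of individuals on sites where both have a call.
pairs <- combn(colnames(thinned), 2)
toy_pmr <- data.frame(
  pair = apply(pairs, 2, paste, collapse = " - "),
  nsnps = NA_integer_,
  mismatch = NA_integer_
)
for (k in seq_len(ncol(pairs))) {
  a <- thinned[, pairs[1, k]]
  b <- thinned[, pairs[2, k]]
  ok <- !is.na(a) & !is.na(b)                # sites called in both individuals
  toy_pmr$nsnps[k] <- sum(ok)                # overlapping sites
  toy_pmr$mismatch[k] <- sum(a[ok] != b[ok]) # sites where the calls differ
}
toy_pmr$pmr <- toy_pmr$mismatch / toy_pmr$nsnps
toy_pmr
```

A larger `min_dist` keeps fewer SNPs, which reduces the effect of LD between neighbouring sites at the cost of fewer overlapping sites per pair.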