snp_match {bigsnpr} | R Documentation |
Match alleles
Description
Match alleles between summary statistics and SNP information. Match by ("chr", "a0", "a1") and ("pos" or "rsid"), accounting for possible strand flips and reverse reference alleles (opposite effects).
Usage
snp_match(
sumstats,
info_snp,
strand_flip = TRUE,
join_by_pos = TRUE,
remove_dups = TRUE,
match.min.prop = 0.2,
return_flip_and_rev = FALSE
)
Arguments
sumstats |
A data frame with columns "chr", "pos", "a0", "a1" and "beta". |
info_snp |
A data frame with columns "chr", "pos", "a0" and "a1". |
strand_flip |
Whether to try to flip strand? (default is |
join_by_pos |
Whether to join by chromosome and position (default), or instead by rsid. |
remove_dups |
Whether to remove duplicates (same physical position)?
Default is |
match.min.prop |
Minimum proportion of variants in the smallest data
to be matched, otherwise stops with an error. Default is |
return_flip_and_rev |
Whether to return internal boolean variables
|
Value
A single data frame with matched variants. Values in column $beta
are multiplied by -1 for variants with alleles reversed (i.e. swapped).
New variable "_NUM_ID_.ss"
returns the corresponding row indices of the
input sumstats
(first argument of this function), and "_NUM_ID_"
corresponding to the input info_snp
(second argument).
See Also
Examples
sumstats <- data.frame(
chr = 1,
pos = c(86303, 86331, 162463, 752566, 755890, 758144),
a0 = c("T", "G", "C", "A", "T", "G"),
a1 = c("G", "A", "T", "G", "A", "A"),
beta = c(-1.868, 0.250, -0.671, 2.112, 0.239, 1.272),
p = c(0.860, 0.346, 0.900, 0.456, 0.776, 0.383)
)
info_snp <- data.frame(
id = c("rs2949417", "rs115209712", "rs143399298", "rs3094315", "rs3115858"),
chr = 1,
pos = c(86303, 86331, 162463, 752566, 755890),
a0 = c("T", "A", "G", "A", "T"),
a1 = c("G", "G", "A", "G", "A")
)
snp_match(sumstats, info_snp)
snp_match(sumstats, info_snp, strand_flip = FALSE)