encode_genotypes {eHDPrep} | R Documentation |
Encode genotype/SNP variables in data frame
Description
Standardises homozygous SNPs (e.g. recorded as "A") to two-character form (e.g. "A/A") and orders heterozygous SNPs alphabetically (e.g. "GA" becomes "A/G"). The SNP values are then converted from a character vector to an ordered factor, ordered by observed allele frequency in the supplied cohort. The most frequent allele is assigned level 1, the second most frequent value is assigned level 2, and the least frequent value is assigned level 3. This method embeds the numeric relationship between the allele frequencies while preserving value labels.
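The two steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: standardise_genotype() is a hypothetical helper, the input vector is made up, and ranking the standardised values by their observed counts is an assumption about how "frequency" is measured.

```r
# Hypothetical helper (not part of eHDPrep): standardise one genotype string.
standardise_genotype <- function(x) {
  if (is.na(x)) return(NA_character_)
  # Drop any existing separator, then split into single alleles
  alleles <- strsplit(gsub("/", "", x, fixed = TRUE), "")[[1]]
  # A homozygous call recorded as one character is doubled: "A" -> "A", "A"
  if (length(alleles) == 1) alleles <- rep(alleles, 2)
  # Alphabetical order makes "GA" and "AG" both become "A/G"
  paste(sort(alleles), collapse = "/")
}

# Made-up input for illustration
snp <- c("A", "GA", "A", "G", "AG", "A", NA)
std <- vapply(snp, standardise_genotype, character(1), USE.NAMES = FALSE)
# std is "A/A" "A/G" "A/A" "G/G" "A/G" "A/A" NA

# Assumption: rank the standardised values by observed count (NA is ignored)
freq <- sort(table(std), decreasing = TRUE)
encoded <- factor(std, levels = names(freq), ordered = TRUE)

levels(encoded)      # "A/A" "A/G" "G/G"
as.integer(encoded)  # 1 2 1 3 2 1 NA
```

The ordered factor keeps the readable labels, while as.integer() exposes the level numbers for methods that need a numeric input.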
Usage
encode_genotypes(data, ...)
Arguments
data |
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). |
... |
< |
Value
'data' with variables (...) encoded as standardised genotypes
Examples
library(eHDPrep)  # provides encode_genotypes() and example_data
data(example_data)
require(dplyr)
require(magrittr)
# one variable
encode_genotypes(example_data, SNP_a) %>%
  select(SNP_a)
# multiple variables
encode_genotypes(example_data, SNP_a, SNP_b) %>%
  select(SNP_a, SNP_b)
# using tidyselect helpers
encode_genotypes(example_data, dplyr::starts_with("SNP")) %>%
  select(starts_with("SNP"))