geno_to_char {genio}    R Documentation

Convert a genotype matrix from numeric to character codes

Description

Given the genotype matrix X and the bim table (as parsed by read_plink()), this function outputs a matrix of the same dimensions as X, with the numeric codes (all values in 0, 1, 2) translated to human-readable character codes (such as 'A/A', 'A/G', 'G/G', depending on the two alleles at each locus as given in the bim table; see return value).

Usage

geno_to_char(X, bim)

Arguments

X

The genotype matrix. It must have values only in 0, 1, 2, and NA.

bim

The variant table. It must have the same number of rows as X and at least two named columns, alt and ref (alleles 1 and 2, respectively, in a plink BIM table). These alleles can be arbitrary strings (not just SNPs but also indels, any single- or multi-character code, or even blank strings), except that the forward slash character ("/") is not allowed anywhere in them, since it is the delimiter in the output; the function stops if a slash is present. The ref and alt alleles must differ at each locus.
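
The requirements on bim listed above can be checked ahead of time. The following is a minimal base-R sketch of such checks; check_bim_sketch is a hypothetical helper written for illustration and is not part of genio, whose own validation and error messages may differ.

```r
# Hypothetical helper (NOT part of genio): checks the documented
# requirements on `bim` before calling geno_to_char().
check_bim_sketch <- function(X, bim) {
    # both allele columns must be present by name
    stopifnot(all(c("alt", "ref") %in% names(bim)))
    # one variant (row of bim) per locus (row of X)
    stopifnot(nrow(bim) == nrow(X))
    # "/" is reserved as the delimiter in the output strings
    if (any(grepl("/", c(bim$alt, bim$ref), fixed = TRUE)))
        stop("alleles must not contain '/'")
    # the two alleles must differ at every locus
    if (any(bim$alt == bim$ref, na.rm = TRUE))
        stop("alt and ref must differ at every locus")
    invisible(TRUE)
}

X <- rbind(0:2, c(0, NA, 2))

# a valid table: passes silently
bim_ok <- data.frame(alt = c("C", "GT"), ref = c("A", "G"),
                     stringsAsFactors = FALSE)
check_bim_sketch(X, bim_ok)

# an invalid table: the second alt allele contains a slash
bim_bad <- data.frame(alt = c("C", "G/T"), ref = c("A", "G"),
                      stringsAsFactors = FALSE)
res <- try(check_bim_sketch(X, bim_bad), silent = TRUE)
inherits(res, "try-error")  # TRUE
```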

Value

The genotype matrix re-encoded as strings. At a given locus, if the two alleles (alt and ref) are 'A' and 'B', respectively, then the input genotypes are encoded as characters as follows: 0 -> 'A/A', 1 -> 'B/A', and 2 -> 'B/B'. Thus, the numeric encoding counts the dosage of the reference allele. NA values in the input X remain NA in the output. If the input genotype matrix has row and column names, these are inherited by the output matrix.
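
To make the encoding rule above concrete, here is a small base-R sketch that reproduces the documented mapping. The function geno_to_char_sketch and its argument layout are assumptions made for illustration; this is not genio's implementation and it performs no input validation.

```r
# Hypothetical re-implementation for illustration only (NOT genio's code).
# X: numeric matrix with values in 0, 1, 2, NA (loci along rows)
# alt, ref: character vectors with one allele per locus
geno_to_char_sketch <- function(X, alt, ref) {
    # the three possible strings per locus, one column per dosage 0, 1, 2
    codes <- cbind(
        paste(alt, alt, sep = "/"),  # 0 copies of ref
        paste(ref, alt, sep = "/"),  # 1 copy of ref
        paste(ref, ref, sep = "/")   # 2 copies of ref
    )
    # pair each entry's locus (row) index with dosage + 1 as a column index;
    # an NA dosage gives an NA index, which yields NA in the result
    idx <- cbind(as.vector(row(X)), as.vector(X) + 1)
    # rebuild the matrix, keeping the row and column names of X
    matrix(codes[idx], nrow = nrow(X), ncol = ncol(X), dimnames = dimnames(X))
}

X <- rbind(0:2, c(0, NA, 2))
dimnames(X) <- list(c("snp1", "indel1"), c("ind1", "ind2", "ind3"))
X_char <- geno_to_char_sketch(X, alt = c("C", "GT"), ref = c("A", "G"))
X_char
# snp1:   "C/C"   "A/C" "A/A"
# indel1: "GT/GT" NA    "G/G"
```

The lookup relies on matrix indexing with a two-column index matrix, so the translation is vectorized over all loci and individuals without an explicit loop.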

See Also

read_plink(), read_bed(), read_bim().

Examples

# a numeric/dosage genotype matrix with two loci (rows)
# and three individuals (columns)
X <- rbind( 0:2, c(0, NA, 2) )
# corresponding variant table (minimal case with just two required columns)
library(tibble)
bim <- tibble( alt = c('C', 'GT'), ref = c('A', 'G') )

# genotype matrix translated as characters
X_char <- geno_to_char( X, bim )
X_char


[Package genio version 1.1.2 Index]