read_grm {genio} | R Documentation |
Read GCTA GRM and related plink2 binary files
Description
This function reads a GCTA Genetic Relatedness Matrix (GRM, i.e. kinship) set of files in their binary format, returning the kinship matrix and, if available, the corresponding matrix of pair sample sizes (non-trivial under missingness) and individuals table. Setting some options allows reading plink2 binary kinship formats such as "king" (see examples).
Usage
read_grm(
name,
n_ind = NA,
verbose = TRUE,
ext = "grm",
shape = c("triangle", "strict_triangle", "square"),
size_bytes = 4,
comment = "#"
)
Arguments
name |
The base name of the input files.
Files with that base, plus shared extension (default "grm", see |
n_ind |
The number of individuals, required if the file with the extension |
verbose |
If |
ext |
Shared extension for all three inputs (see |
shape |
The shape of the information to read (may be abbreviated).
Default "triangle" assumes there are |
size_bytes |
The number of bytes per number encoded. Default 4 corresponds to GCTA GRM and plink2 "bin4", whereas plink2 "bin" requires a value of 8. |
comment |
Character to start comments in |
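The interaction of shape and size_bytes can be illustrated with a minimal base R sketch. It does not call read_grm and is not the package's implementation. It assumes the row-by-row lower-triangle ordering used by GCTA's sample code, and the toy values and the names vals, file_bin and vals_read are hypothetical.

```r
# Hypothetical toy data: lower triangle with diagonal, stored row by row,
# i.e. entries (1,1), (2,1), (2,2), (3,1), (3,2), (3,3).
# All values are exactly representable as 4-byte floats.
n_ind <- 3
vals <- c(1, 0.5, 1, 0.25, 0.125, 1)

# write the values as raw 4-byte floats (size_bytes = 4) to a temporary file
file_bin <- tempfile(fileext = ".grm.bin")
writeBin(vals, file_bin, size = 4)

# shape "triangle" holds n * (n + 1) / 2 numbers, each of the given size
vals_read <- readBin(file_bin, what = numeric(), n = n_ind * (n_ind + 1) / 2, size = 4)
unlink(file_bin)

# the row-wise lower triangle has the same order as the column-wise upper triangle,
# so fill the upper triangle first, then mirror it to the lower triangle
kinship <- matrix(NA_real_, n_ind, n_ind)
kinship[upper.tri(kinship, diag = TRUE)] <- vals_read
kinship[lower.tri(kinship)] <- t(kinship)[lower.tri(kinship)]
kinship
```

Under the same assumptions, a file written with 8-byte values would be read by passing size = 8 to readBin, which mirrors the size_bytes = 8 setting described above for plink2 "bin".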
Value
A list with named elements:
- kinship: The symmetric n-times-n kinship matrix (GRM). Has IDs as row and column names if the file with extension .<ext>.id exists. If shape='strict_triangle', diagonal will have missing values.
- M: The symmetric n-times-n matrix of pair sample sizes (number of non-missing loci pairs), if the file with extension .<ext>.N.bin exists. Has IDs as row and column names if the file with extension .<ext>.id was available. If shape='strict_triangle', diagonal will have missing values.
- fam: A tibble with two columns: fam and id, same as in Plink FAM files. Returned if the file with extension .<ext>.id exists.
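As a rough illustration of the returned structure, the base R sketch below builds a matrix from a strict triangle (no diagonal) and attaches IDs from a two-column ID table. It is not the package's code: the toy values, the IDs, and the "#"-prefixed header line (an assumption about plink2-style ID files) are all hypothetical, and read.table returns a plain data frame rather than a tibble.

```r
# Hypothetical toy data: strict lower triangle, row by row,
# i.e. entries (2,1), (3,1), (3,2), with no diagonal values stored
n_ind <- 3
vals <- c(0.5, 0.25, 0.125)

# fill off-diagonal entries only; the diagonal stays missing (NA)
kinship <- matrix(NA_real_, n_ind, n_ind)
kinship[upper.tri(kinship)] <- vals
kinship[lower.tri(kinship)] <- t(kinship)[lower.tri(kinship)]

# hypothetical ID file: family and individual IDs, with a header that starts
# with the comment character so that it is skipped on reading
file_id <- tempfile(fileext = ".king.id")
writeLines(c("#FID\tIID", "f1\ta", "f1\tb", "f2\tc"), file_id)
fam <- read.table(
  file_id,
  comment.char = "#",
  col.names = c("fam", "id"),
  stringsAsFactors = FALSE
)
unlink(file_id)

# use individual IDs as row and column names
dimnames(kinship) <- list(fam$id, fam$id)
kinship
```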
See Also
Adapted, with substantial changes, from GCTA sample code: https://cnsgenomics.com/software/gcta/#MakingaGRM
Examples
# to read "data.grm.bin" and related files, run like this:
# obj <- read_grm("data")
# obj$kinship # the kinship matrix
# obj$M # the pair sample sizes matrix
# obj$fam # the fam and ID tibble
# The following example is more awkward
# because the path to the package sample data has to be constructed indirectly:
# read an existing set of GRM files
file <- system.file("extdata", 'sample.grm.bin', package = "genio", mustWork = TRUE)
file <- sub('\\.grm\\.bin$', '', file) # remove extension from this path on purpose
obj <- read_grm(file)
obj$kinship # the kinship matrix
obj$M # the pair sample sizes matrix
obj$fam # the fam and ID tibble
# Read sample plink2 KING-robust files (several variants).
# Read both base.king.bin and base.king.id files.
# All generated with "plink2 <input> --make-king <options> --out base"
# (replace "base" with actual base name) with these options:
# #1) "triangle bin"
# data <- read_grm( 'base', ext = 'king', shape = 'strict', size_bytes = 8 )
# #2) "triangle bin4"
# data <- read_grm( 'base', ext = 'king', shape = 'strict' )
# #3) "square bin"
# data <- read_grm( 'base', ext = 'king', shape = 'square', size_bytes = 8 )
# #4) "square bin4"
# data <- read_grm( 'base', ext = 'king', shape = 'square' )