R: Relationship matrix based on genomic data

ghap.kinship {GHap}

R Documentation

Relationship matrix based on genomic data

Description

This function computes marker-based and HapAllele-based relationship matrices.

Usage

ghap.kinship(object, weights = NULL,
             sparsity = NULL,
             type = 1, batchsize = NULL,
             only.active.samples = TRUE,
             only.active.variants = TRUE,
             ncores = 1, verbose = TRUE)

Arguments

`object`	A valid GHap object (phase, haplo or plink).
`weights`	A numeric vector providing variant-specific weights.
`sparsity`	A numeric value specifying a relationship cut-off (default = NULL). All relationships below the specified cut-off will be set to zero, inducing sparsity into the relationship matrix.
`type`	A numeric value indicating the type of relationship matrix (see details).
`batchsize`	A numeric value controlling the number of variants to be processed at a time (default = nalleles/10).
`only.active.samples`	A logical value specifying whether only active samples should be included in the output (default = TRUE).
`only.active.variants`	A logical value specifying whether only active variants should be included in the output (default = TRUE).
`ncores`	A numeric value specifying the number of cores to be used in parallel computations (default = 1).
`verbose`	A logical value specfying whether log messages should be printed (default = TRUE).

Details

Let \mathbf{M} be the n x m matrix of genotypes, where n is the number of individuals and m is the number of variants (i.e, markers or HapAlleles). Prior to computation, genotypes in matrix \mathbf{M} are transformed according to the desired relationship type. After that transformation, the relationship matrix is computed as:

\mathbf{K} = q^{-1}\mathbf{MDM}'

where \mathbf{D} = diag(d_i), d_i is the weight of variant i (default d_i = 1), and q is a scaling factor. The argument type controls the genotype transformation and the scaling factor, and includes the following option:
Type = 1 (General additive 1)
Genotype transformation: m - mean(m)
Scaling factor: tr(\mathbf{MDM}')^{-1}n

Type = 2 (General additive 2)
Genotype transformation: (m - mean(m))/sd(m)
Scaling factor: m

Type = 3 (VanRaden, 2008)
Genotype transformation: m - 2*p[j]
Scaling factor: 2*sum(p*(1-p))

Type = 4 (GCTA)
Genotype transformation: (m - 2*p[j])/sqrt(2*p[j]*(1-p[j]))
Scaling factor: m

Type = 5 (Dominance 1)
Genotype transformation:
0 = -2*p[j]^2
1 = 2*p[j]*(1-p[j])
2 = -2*(1-p[j])^2
Scaling factor: 4*sum(p^2*(1-p)^2).

Type = 6 (Dominance 2)
Genotype transformation:
0 = -2*p[j]^2
1 = 2*p[j]*(1-p[j])
2 = -2*(1-p[j])^2
Scaling factor:
tr(\mathbf{MDM}')^{-1}n.

Value

The function returns a n x n relationship matrix, where n is the number of individuals.

Author(s)

Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>
Marco Milanesi <marco.milanesi.mm@gmail.com>

References

P. M. VanRaden. Efficient methods to compute genomic predictions. J. Dairy. Sci. 2008. 91:4414-4423.

Examples


# #### DO NOT RUN IF NOT NECESSARY ###
# 
# # Copy plink data in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
#                          format = "plink",
#                          verbose = TRUE)
# file.copy(from = exfiles, to = "./")
# 
# # Copy metadata in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
#                          format = "meta",
#                          verbose = TRUE)
# file.copy(from = exfiles, to = "./")
# 
# # Load plink data
# plink <- ghap.loadplink("example")
# 
# # Load phenotype data
# df <- read.table(file = "example.phenotypes", header=T)
# 
# ### RUN ###
# 
# # Subset pure1 population
# pure1 <- plink$id[which(plink$pop == "Pure1")]
# plink <- ghap.subset(object = plink, ids = pure1, variants = plink$marker)
# 
# # Compute different types of relationship matrices
# K1 <- ghap.kinship(plink, type = 1) # General additive 1
# K2 <- ghap.kinship(plink, type = 2) # General additive 2
# K3 <- ghap.kinship(plink, type = 3) # VanRaden 2008
# K4 <- ghap.kinship(plink, type = 4) # GCTA GRM
# K5 <- ghap.kinship(plink, type = 5) # Dominance 1
# K6 <- ghap.kinship(plink, type = 6) # Dominance 2

[Package GHap version 3.0.0 Index]