ghap.kinship {GHap} | R Documentation |
Relationship matrix based on genomic data
Description
This function computes marker-based and HapAllele-based relationship matrices.
Usage
ghap.kinship(object, weights = NULL,
sparsity = NULL,
type = 1, batchsize = NULL,
only.active.samples = TRUE,
only.active.variants = TRUE,
ncores = 1, verbose = TRUE)
Arguments
object |
A valid GHap object (phase, haplo or plink). |
weights |
A numeric vector providing variant-specific weights. |
sparsity |
A numeric value specifying a relationship cut-off (default = NULL). All relationships below the specified cut-off will be set to zero, inducing sparsity into the relationship matrix. |
type |
A numeric value indicating the type of relationship matrix (see details). |
batchsize |
A numeric value controlling the number of variants to be processed at a time (default = nalleles/10). |
only.active.samples |
A logical value specifying whether only active samples should be included in the output (default = TRUE). |
only.active.variants |
A logical value specifying whether only active variants should be included in the output (default = TRUE). |
ncores |
A numeric value specifying the number of cores to be used in parallel computations (default = 1). |
verbose |
A logical value specfying whether log messages should be printed (default = TRUE). |
Details
Let \mathbf{M}
be the n x m matrix of genotypes, where n is the number of individuals and m is the number of variants (i.e, markers or HapAlleles). Prior to computation, genotypes in matrix \mathbf{M}
are transformed according to the desired relationship type. After that transformation, the relationship matrix is computed as:
\mathbf{K} = q^{-1}\mathbf{MDM}'
where \mathbf{D} = diag(d_i)
, d_i
is the weight of variant i (default d_i = 1
), and q
is a scaling factor. The argument type controls the genotype transformation and the scaling factor, and includes the following option:
Type = 1 (General additive 1)
Genotype transformation: m - mean(m)
Scaling factor: tr(\mathbf{MDM}')^{-1}n
Type = 2 (General additive 2)
Genotype transformation: (m - mean(m))/sd(m)
Scaling factor: m
Type = 3 (VanRaden, 2008)
Genotype transformation: m - 2*p[j]
Scaling factor: 2*sum(p*(1-p))
Type = 4 (GCTA)
Genotype transformation: (m - 2*p[j])/sqrt(2*p[j]*(1-p[j]))
Scaling factor: m
Type = 5 (Dominance 1)
Genotype transformation:
0 = -2*p[j]^2
1 = 2*p[j]*(1-p[j])
2 = -2*(1-p[j])^2
Scaling factor: 4*sum(p^2*(1-p)^2).
Type = 6 (Dominance 2)
Genotype transformation:
0 = -2*p[j]^2
1 = 2*p[j]*(1-p[j])
2 = -2*(1-p[j])^2
Scaling factor:
tr(\mathbf{MDM}')^{-1}n
.
Value
The function returns a n x n relationship matrix, where n is the number of individuals.
Author(s)
Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>
Marco Milanesi <marco.milanesi.mm@gmail.com>
References
P. M. VanRaden. Efficient methods to compute genomic predictions. J. Dairy. Sci. 2008. 91:4414-4423.
Examples
# #### DO NOT RUN IF NOT NECESSARY ###
#
# # Copy plink data in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
# format = "plink",
# verbose = TRUE)
# file.copy(from = exfiles, to = "./")
#
# # Copy metadata in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
# format = "meta",
# verbose = TRUE)
# file.copy(from = exfiles, to = "./")
#
# # Load plink data
# plink <- ghap.loadplink("example")
#
# # Load phenotype data
# df <- read.table(file = "example.phenotypes", header=T)
#
# ### RUN ###
#
# # Subset pure1 population
# pure1 <- plink$id[which(plink$pop == "Pure1")]
# plink <- ghap.subset(object = plink, ids = pure1, variants = plink$marker)
#
# # Compute different types of relationship matrices
# K1 <- ghap.kinship(plink, type = 1) # General additive 1
# K2 <- ghap.kinship(plink, type = 2) # General additive 2
# K3 <- ghap.kinship(plink, type = 3) # VanRaden 2008
# K4 <- ghap.kinship(plink, type = 4) # GCTA GRM
# K5 <- ghap.kinship(plink, type = 5) # Dominance 1
# K6 <- ghap.kinship(plink, type = 6) # Dominance 2