getG {BGData}  R Documentation 
Computes a positive semidefinite symmetric genomic relationship matrix
G = XX', offering options for centering and scaling the columns of X
beforehand.
getG(X, center = TRUE, scale = TRUE, impute = TRUE, scaleG = TRUE,
minVar = 1e-05, i = seq_len(nrow(X)), j = seq_len(ncol(X)), i2 = NULL,
chunkSize = 5000L, nCores = getOption("mc.cores", 2L), verbose = FALSE)
X 
A matrix-like object, typically the genotypes of a BGData object.
center 
Either a logical value or a numeric vector of length equal to the
number of columns of X. Defaults to TRUE.
scale 
Either a logical value or a numeric vector of length equal to the
number of columns of X. Defaults to TRUE.
impute 
Indicates whether missing values should be imputed. Defaults to TRUE.
scaleG 
Whether XX' should be scaled. Defaults to TRUE.
minVar 
Columns with variance lower than this value will not be used in the
computation (only if scale is not FALSE). Defaults to 1e-05.
i 
Indicates which rows of X should be used. By default, all rows are used.
j 
Indicates which columns of X should be used. By default, all columns are used.
i2 
Indicates which rows should be used to compute a block of the genomic
relationship matrix. Will compute XY' where X is determined by i and j
and Y by i2 and j.
chunkSize 
The number of columns of X that are processed at a time. Defaults to 5000.
nCores 
The number of cores (passed to mclapply). Defaults to
getOption("mc.cores", 2L).
verbose 
Whether progress updates will be posted. Defaults to FALSE.
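The logical and numeric forms of center and scale can be illustrated with base R alone. The sketch below uses base::scale, which accepts the same two forms, on a made-up genotype matrix. It is not getG's implementation: the toy data, the use of sd for the scales, and the variance filter standing in for minVar are all assumptions made for illustration.

```r
# Hypothetical toy data: 6 individuals (rows) by 4 markers (columns).
X <- matrix(c(0, 1, 2, 1, 0, 2,
              2, 2, 1, 0, 1, 0,
              1, 0, 0, 2, 2, 1,
              0, 0, 1, 1, 2, 2), nrow = 6)

# Logical form: let scale() derive the column means and standard deviations.
Z_logical <- scale(X, center = TRUE, scale = TRUE)

# Numeric form: supply one precomputed center and one scale per column.
centers <- colMeans(X)
scales <- apply(X, 2, sd)
Z_numeric <- scale(X, center = centers, scale = scales)

# Both forms give the same standardized matrix.
stopifnot(isTRUE(all.equal(Z_logical, Z_numeric, check.attributes = FALSE)))

# A variance filter in the spirit of minVar (assumed behaviour, not taken
# from getG's source): drop near-constant columns before forming XX'.
keep <- apply(X, 2, var) >= 1e-05
G <- tcrossprod(Z_numeric[, keep, drop = FALSE])
```

Passing numeric vectors is what makes it possible to reuse the same centers and scales across several calls, as the block-wise examples do.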
If center = FALSE, scale = FALSE, and scaleG = FALSE, getG produces the
same result as tcrossprod.
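As a point of reference, the following base-R sketch shows what tcrossprod computes on a small made-up genotype matrix (the data are hypothetical). The getG call is left commented out because it requires BGData; it is shown only to indicate which arguments correspond to this case.

```r
# Hypothetical toy data: 5 individuals (rows) by 8 markers (columns),
# coded as allele counts 0, 1, or 2.
set.seed(1)
X <- matrix(sample(0:2, 5 * 8, replace = TRUE), nrow = 5, ncol = 8)

# Without centering, scaling, or scaling of G, the target is simply XX'.
G_raw <- tcrossprod(X)

# tcrossprod(X) equals X %*% t(X), and the result is symmetric.
stopifnot(isTRUE(all.equal(G_raw, X %*% t(X))))
stopifnot(isSymmetric(G_raw))

# With BGData installed, the corresponding call would be (not run here):
# G_raw2 <- getG(X, center = FALSE, scale = FALSE, scaleG = FALSE)
```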
A positive semidefinite symmetric numeric matrix.
file-backed-matrices for more information on file-backed matrices.
multi-level-parallelism for more information on multi-level parallelism.
BGData-class for more information on the BGData class.
# Restrict number of cores to 1 on Windows
if (.Platform$OS.type == "windows") {
    options(mc.cores = 1)
}
# Load example data
bg <- BGData:::loadExample()
# Compute a scaled genomic relationship matrix from centered and scaled
# genotypes
g1 <- getG(X = geno(bg))
# Disable scaling of G
g2 <- getG(X = geno(bg), scaleG = FALSE)
# Disable centering of genotypes
g3 <- getG(X = geno(bg), center = FALSE)
# Disable scaling of genotypes
g4 <- getG(X = geno(bg), scale = FALSE)
# Provide own scales
scales <- chunkedApply(X = geno(bg), MARGIN = 2, FUN = sd)
g4 <- getG(X = geno(bg), scale = scales)
# Provide own centers
centers <- chunkedApply(X = geno(bg), MARGIN = 2, FUN = mean)
g5 <- getG(X = geno(bg), center = centers)
# Only use the first 50 individuals (useful to account for population structure)
g6 <- getG(X = geno(bg), i = 1:50)
# Only use the first 100 markers (useful to ignore some markers)
g7 <- getG(X = geno(bg), j = 1:100)
# Compute unscaled G matrix by combining blocks of $XX_{i2}'$ where $X_{i2}$ is
# a horizontal partition of X. This is useful for distributed computing as each
# block can be computed in parallel. Centers and scales need to be precomputed.
block1 <- getG(X = geno(bg), i2 = 1:100, center = centers, scale = scales)
block2 <- getG(X = geno(bg), i2 = 101:199, center = centers, scale = scales)
g8 <- cbind(block1, block2)
# Compute unscaled G matrix by combining blocks of $X_{i}X_{i2}'$ where both
# $X_{i}$ and $X_{i2}$ are horizontal partitions of X. Similarly to the example
# above, this is useful for distributed computing, in particular to compute
# very large G matrices. Centers and scales need to be precomputed. This
# approach is similar to the one taken by the symDMatrix package, but the
# symDMatrix package adds memory-mapped blocks, only stores the upper
# triangle of the matrix, and provides a type that allows for indexing as if
# the full G matrix were in memory.
block11 <- getG(X = geno(bg), i = 1:100, i2 = 1:100, center = centers, scale = scales)
block12 <- getG(X = geno(bg), i = 1:100, i2 = 101:199, center = centers, scale = scales)
block21 <- getG(X = geno(bg), i = 101:199, i2 = 1:100, center = centers, scale = scales)
block22 <- getG(X = geno(bg), i = 101:199, i2 = 101:199, center = centers, scale = scales)
g9 <- rbind(
    cbind(block11, block12),
    cbind(block21, block22)
)
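
The block-wise examples above rely on the fact that a cross product can be assembled from row partitions, provided every block uses the same centers and scales. The base-R sketch below checks this on a made-up matrix. The toy data and the two index vectors are hypothetical, and the code illustrates the algebra only, not getG's internals.

```r
# Hypothetical toy data: 6 individuals (rows) by 4 markers (columns).
X <- matrix(c(0, 1, 2, 1, 0, 2,
              2, 2, 1, 0, 1, 0,
              1, 0, 0, 2, 2, 1,
              0, 0, 1, 1, 2, 2), nrow = 6)

# Centers and scales are computed once, from all rows.
centers <- colMeans(X)
scales <- apply(X, 2, sd)
Z <- scale(X, center = centers, scale = scales)

# Two horizontal partitions of the rows.
idx1 <- 1:3
idx2 <- 4:6

# Each block is Z_i Z_i2' and could be computed by a separate worker.
block11 <- tcrossprod(Z[idx1, ], Z[idx1, ])
block12 <- tcrossprod(Z[idx1, ], Z[idx2, ])
block21 <- tcrossprod(Z[idx2, ], Z[idx1, ])
block22 <- tcrossprod(Z[idx2, ], Z[idx2, ])
G_blocks <- rbind(cbind(block11, block12),
                  cbind(block21, block22))

# The assembled blocks match the cross product computed in one piece.
G_full <- tcrossprod(Z)
stopifnot(isTRUE(all.equal(G_blocks, G_full, check.attributes = FALSE)))

# Centering a block with its own row means gives a different result,
# which is why the centers have to be precomputed.
right <- tcrossprod(sweep(X, 2, centers)[idx1, ])
wrong <- tcrossprod(scale(X[idx1, ], center = TRUE, scale = FALSE))
stopifnot(!isTRUE(all.equal(wrong, right, check.attributes = FALSE)))
```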