glmm.score {GMMAT} | R Documentation |
Performing GLMM based score tests
Description
Use a glmmkin class object from the null GLMM to perform score tests for association with genotypes in a plink .bed file (binary genotypes), a BGEN file (.bgen), a GDS file (.gds), or a plain text file (optionally compressed as .gz or .bz2).
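As a rough illustration of what a single-variant score test computes, the sketch below uses a plain linear null model with no random effects. It is a simplified stand-in, not GMMAT's implementation: the simulated data and all variable names are assumptions made for the example, and the GLMM version replaces the projection used here with one built from the fitted covariance matrix.

```r
# Simplified score test for one variant under a linear null model
# (no kinship / random effects; NOT GMMAT's internal code).
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n))                  # null-model design: intercept + one covariate
y <- drop(X %*% c(0.5, 1)) + rnorm(n)    # phenotype simulated with no genotype effect
g <- as.numeric(rbinom(n, 2, 0.3))       # genotype coded 0/1/2

fit0   <- lm.fit(X, y)                   # fit the null model once
res    <- fit0$residuals
sigma2 <- sum(res^2) / (n - ncol(X))     # residual variance estimate

# Project the covariates out of the genotype: g_adj = (I - H) g
g_adj <- lm.fit(X, g)$residuals

score <- sum(g * res) / sigma2           # score statistic for the genotype effect
v     <- sum(g_adj^2) / sigma2           # its variance under the null
pval  <- pchisq(score^2 / v, df = 1, lower.tail = FALSE)
```

The null model is fitted once and only the last few lines are repeated per variant, which is why score tests scale well to genome-wide data.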
Usage
glmm.score(obj, infile, outfile, BGEN.samplefile = NULL, center = T, select = NULL,
MAF.range = c(1e-7, 0.5), miss.cutoff = 1,
missing.method = "impute2mean", nperbatch = 100, tol = 1e-5,
infile.nrow = NULL, infile.nrow.skip = 0, infile.sep = "\t",
infile.na = "NA", infile.ncol.skip = 1, infile.ncol.print = 1,
infile.header.print = "SNP", is.dosage = FALSE, ncores = 1, verbose = FALSE)
Arguments
obj |
a class glmmkin or class glmmkin.multi object, returned by fitting the null GLMM using |
infile |
the input file name or an object of class SeqVarGDSClass. Note that for plink binary genotype files only the prefix without .bed, .bim or .fam should be used. Only SNP-major mode is recognized in the binary file. Alternatively, it can be the full name of a BGEN file (including the suffix .bgen), a GDS file (including the suffix .gds), or a plain text file with a delimiter (comma, space, tab or something else), with one row for each SNP and one column for each individual. In that case, SNPs should be coded as numeric values (0/1/2 or dosages are allowed; A/C/G/T coding is not recognized). There can be additional rows and columns to skip at the beginning. The order of individuals can be different from |
outfile |
the output file name. |
BGEN.samplefile |
path to the BGEN sample file. Required when the BGEN file does not contain sample identifiers or the |
center |
a logical switch for centering genotypes before tests. If TRUE, genotypes will be centered to have mean 0 before tests, otherwise raw values will be directly used in tests (default = TRUE). |
select |
an optional vector indicating the order of individuals in |
MAF.range |
a numeric vector of length 2 defining the minimum and maximum minor allele frequencies of variants that should be included in the analysis (default = c(1e-7, 0.5)). |
miss.cutoff |
the maximum missing rate allowed for a variant to be included (default = 1, including all variants). |
missing.method |
method of handling missing genotypes. Either "impute2mean" or "omit" (default = "impute2mean"). |
nperbatch |
an integer for how many SNPs should be tested in a batch (default = 100). The computational time can increase dramatically if this value is too small or too large. The optimal value depends on the user's system. |
tol |
the threshold for determining monomorphism. If a SNP has value range less than the tolerance, it will be considered monomorphic and its association test p-value will be NA (default = 1e-5). Only used when |
infile.nrow |
number of rows to read in |
infile.nrow.skip |
number of rows to skip at the beginning of |
infile.sep |
delimiter in |
infile.na |
symbol in |
infile.ncol.skip |
number of columns to skip before genotype data in |
infile.ncol.print |
a vector indicating which column(s) in |
infile.header.print |
a character vector indicating column name(s) of column(s) selected to print by |
is.dosage |
a logical switch for whether imputed dosage should be used from a GDS |
ncores |
a positive integer indicating the number of cores to be used in parallel computing (default = 1). |
verbose |
a logical switch for whether a progress bar should be shown for a GDS |
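The filtering and preprocessing arguments above (MAF.range, miss.cutoff, tol, missing.method and center) can be pictured with a small base R sketch. This is illustrative only and is not GMMAT's internal code: the toy genotype matrix, the cutoff of 0.25 and the assumption that range bounds are inclusive are all made up for the example.

```r
# Toy 0/1/2 genotype matrix: rows = SNPs, columns = individuals (hypothetical data).
geno <- rbind(
  snp1 = c(0, 1, 2, 1, 0, NA),
  snp2 = c(0, 0, 0, 0, 0, 0),      # monomorphic
  snp3 = c(2, 2, 1, 2, NA, NA)     # one third of the calls missing
)
MAF.range   <- c(1e-7, 0.5)
miss.cutoff <- 0.25                # stricter than the documented default of 1
tol         <- 1e-5

miss.rate   <- rowMeans(is.na(geno))               # per-SNP missing rate
freq        <- rowMeans(geno, na.rm = TRUE) / 2    # frequency of the counted allele
maf         <- pmin(freq, 1 - freq)                # minor allele frequency
value.range <- apply(geno, 1, function(g) diff(range(g, na.rm = TRUE)))
monomorphic <- value.range < tol                   # such SNPs would get a p-value of NA

# Inclusive bounds are an assumption of this sketch.
keep <- miss.rate <= miss.cutoff &
  maf >= MAF.range[1] & maf <= MAF.range[2]

# missing.method = "impute2mean": replace missing calls by the SNP mean
g         <- geno["snp1", ]
g.imputed <- ifelse(is.na(g), mean(g, na.rm = TRUE), g)
# center = TRUE: shift the genotype to have mean 0 before testing
g.centered <- g.imputed - mean(g.imputed)
```

Here only snp1 passes: snp2 has a minor allele frequency of 0, and snp3 exceeds the missing-rate cutoff.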
Value
NULL if infile is a BGEN file (.bgen) or a GDS file (.gds), otherwise computational time in seconds, excluding I/O time.
Author(s)
Han Chen, Duy T. Pham
References
Chen, H., Wang, C., Conomos, M.P., Stilp, A.M., Li, Z., Sofer, T., Szpiro, A.A., Chen, W., Brehm, J.M., Celedón, J.C., Redline, S., Papanicolaou, G.J., Thornton, T.A., Laurie, C.C., Rice, K. and Lin, X. (2016) Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics 98, 653-666.
See Also
Examples
# Load the example data; the objects pheno and GRM used below come from it
data(example)
attach(example)

# Fit the null GLMM once; the same fit is reused for every score test below
model0 <- glmmkin(disease ~ age + sex, data = pheno, kins = GRM, id = "id",
                  family = binomial(link = "logit"))

# plink binary files: pass the file prefix without the .bed suffix
plinkfiles <- strsplit(system.file("extdata", "geno.bed", package = "GMMAT"),
                       ".bed", fixed = TRUE)[[1]]
outfile.bed <- tempfile()
glmm.score(model0, infile = plinkfiles, outfile = outfile.bed)

# GDS file: run only if SeqArray and SeqVarTools are installed
if(requireNamespace("SeqArray", quietly = TRUE) &&
   requireNamespace("SeqVarTools", quietly = TRUE)) {
    infile <- system.file("extdata", "geno.gds", package = "GMMAT")
    outfile.gds <- tempfile()
    glmm.score(model0, infile = infile, outfile = outfile.gds)
    unlink(outfile.gds)
}

# Plain text file: skip 5 rows and 3 columns, and copy columns 1 to 3 to the output
infile <- system.file("extdata", "geno.txt", package = "GMMAT")
outfile.text <- tempfile()
glmm.score(model0, infile = infile, outfile = outfile.text, infile.nrow.skip = 5,
           infile.ncol.skip = 3, infile.ncol.print = 1:3,
           infile.header.print = c("SNP", "Allele1", "Allele2"))

# BGEN file: sample identifiers are supplied through a separate sample file
infile <- system.file("extdata", "geno.bgen", package = "GMMAT")
samplefile <- system.file("extdata", "geno.sample", package = "GMMAT")
outfile.bgen <- tempfile()
glmm.score(model0, infile = infile, BGEN.samplefile = samplefile,
           outfile = outfile.bgen)

# Remove the temporary output files
unlink(c(outfile.bed, outfile.text, outfile.bgen))
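For plain text input, it may help to see how the skip arguments map onto a file layout. The sketch below writes a small hypothetical tab-delimited file (its contents are made up and are not the package's geno.txt) and reads it back with base R, using the same notions of rows to skip, columns to skip and a missing-value symbol. It does not call glmm.score; the commented call at the end only indicates how the documented arguments would correspond to this layout.

```r
# Hypothetical toy genotype file: 2 leading rows and 3 leading columns to skip.
txt <- tempfile(fileext = ".txt")
writeLines(c(
  "toy genotype file",                                                # row 1: skipped
  paste(c("SNP", "A1", "A2", "id1", "id2", "id3"), collapse = "\t"),  # row 2: skipped
  paste(c("rs1", "A", "G", "0", "1", "2"), collapse = "\t"),          # one row per SNP
  paste(c("rs2", "C", "T", "1", "NA", "0.5"), collapse = "\t")        # dosage and NA allowed
), txt)

nrow.skip <- 2    # plays the role of infile.nrow.skip
ncol.skip <- 3    # plays the role of infile.ncol.skip

dat <- read.table(txt, sep = "\t", skip = nrow.skip, na.strings = "NA",
                  colClasses = c(rep("character", ncol.skip), rep("numeric", 3)))

info <- dat[, seq_len(ncol.skip)]              # columns that could be printed to the output
geno <- as.matrix(dat[, -seq_len(ncol.skip)])  # numeric genotypes, SNPs in rows

# With a fitted null model, the corresponding call would look like (not run):
# glmm.score(model0, infile = txt, outfile = tempfile(),
#            infile.nrow.skip = 2, infile.ncol.skip = 3, infile.ncol.print = 1:3,
#            infile.header.print = c("SNP", "Allele1", "Allele2"))
unlink(txt)
```

Checking the layout this way before a long run is a cheap guard against an off-by-one in the skip counts, which would otherwise shift SNP identifiers or genotype columns.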