bigQF-package {bigQF} | R Documentation |
Quadratic Forms in Large Matrices
Description
A computationally-efficient leading-eigenvalue approximation to tail probabilities and quantiles of large quadratic forms, in particular for the Sequence Kernel Association Test (SKAT) used in genomics <doi:10.1002/gepi.22136>. Also provides stochastic singular value decomposition for dense or sparse matrices.
Details
The DESCRIPTION file:
Package: | bigQF |
Type: | Package |
Title: | Quadratic Forms in Large Matrices |
Version: | 1.6 |
Author: | Thomas Lumley |
Maintainer: | Thomas Lumley <t.lumley@auckland.ac.nz> |
Description: | A computationally-efficient leading-eigenvalue approximation to tail probabilities and quantiles of large quadratic forms, in particular for the Sequence Kernel Association Test (SKAT) used in genomics <doi:10.1002/gepi.22136>. Also provides stochastic singular value decomposition for dense or sparse matrices. |
URL: | https://github.com/tslumley/bigQF |
Imports: | svd, CompQuadForm, Matrix, stats, coxme |
Suggests: | knitr, rmarkdown, SKAT |
VignetteBuilder: | knitr |
Depends: | methods |
License: | GPL-2 |
Index of help topics:
SKAT.example        Data example from SKAT package
SKAT.matrixfree     Make 'matrix-free' object for SKAT test
bigQF-package       Quadratic Forms in Large Matrices
famSKAT             Implicit matrix for family-based SKAT test
pQF                 Tail probabilities for quadratic forms
seigen              Stochastic singular value decomposition
seqMetaExample      Example data, from seqMeta package
sequence            Simulated human DNA variant sequence
sparse.matrixfree   Make 'matrix-free' object from (sparse) Matrix
This package computes tail probabilities for large quadratic forms, with the motivation being the SKAT test used in DNA sequence association studies.
The true distribution is a linear combination of 1-df chi-squared distributions, where the coefficients are the non-zero eigenvalues of the matrix A defining the quadratic form z^TAz. The package approximates this distribution by the largest neig terms in the linear combination plus the Satterthwaite approximation to the rest of the linear combination.
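Written out, the approximation can be sketched as follows. The notation is assumed here rather than taken from the package: the lambda_i are the non-zero eigenvalues of A in decreasing order, k stands for neig, z is standard normal, and the chi-squared variables are independent.

```latex
z^{T} A z \;\sim\; \sum_{i} \lambda_i \, \chi^2_{1,i}
\;\approx\; \sum_{i=1}^{k} \lambda_i \, \chi^2_{1,i} \;+\; a \, \chi^2_{\nu},
\qquad
a = \frac{\sum_{i>k} \lambda_i^2}{\sum_{i>k} \lambda_i},
\qquad
\nu = \frac{\left(\sum_{i>k} \lambda_i\right)^2}{\sum_{i>k} \lambda_i^2}.
```

The constants a and nu are the usual Satterthwaite choice, matching the mean and variance of the remainder. For symmetric A the two sums over i > k equal tr(A) and tr(A^2) minus the corresponding sums over the leading k eigenvalues, so the trailing eigenvalues never have to be computed individually.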
The main function is pQF, which has options for how to compute the leading eigenvalues (Lanczos-type algorithm or stochastic SVD) and how to compute the linear combination (inverting the characteristic function or a saddlepoint approximation). The Lanczos algorithm is from the svd package; the stochastic SVD can be called directly via ssvd or seigen.
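To show what is being approximated, here is a minimal base R sketch that does not call pQF. It makes two substitutions, both assumptions for illustration: a full eigen() decomposition stands in for the Lanczos or stochastic SVD step, and Monte Carlo simulation stands in for characteristic-function inversion or the saddlepoint approximation. The toy matrix and all variable names are hypothetical.

```r
# Base R only: Monte Carlo sketch of the leading-eigenvalue approximation.
# This is NOT the package's algorithm; eigen() and simulation are stand-ins.
set.seed(1)
M <- matrix(rnorm(200 * 40), 200, 40)    # toy non-square matrix
A <- crossprod(M)                         # 40 x 40 symmetric matrix
lambda <- eigen(A, symmetric = TRUE, only.values = TRUE)$values

neig <- 5                                 # number of leading terms kept
lead <- lambda[seq_len(neig)]
rest <- lambda[-seq_len(neig)]

# Satterthwaite constants for the remainder (match its mean and variance)
a  <- sum(rest^2) / sum(rest)
nu <- sum(rest)^2 / sum(rest^2)

nsim <- 5e4
# Draws from the full linear combination of 1-df chi-squared variables
full_draws <- drop(matrix(rchisq(nsim * length(lambda), df = 1), nsim) %*% lambda)
# Draws from the approximation: leading terms plus a scaled chi-squared
approx_draws <- drop(matrix(rchisq(nsim * neig, df = 1), nsim) %*% lead) +
  a * rchisq(nsim, df = nu)

threshold <- unname(quantile(full_draws, 0.99))   # a point in the upper tail
p_full    <- mean(full_draws > threshold)
p_approx  <- mean(approx_draws > threshold)
```

Comparing p_full with p_approx gives a rough sense of how little is lost by replacing the trailing eigenvalues with a single scaled chi-squared term; simulation is only practical here because the example is small and the tail probability is not extreme.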
Given a square matrix, pQF uses it as A. If the input is a non-square matrix M, then A is crossprod(M). The function can also be used matrix-free, given an object containing functions to compute the product and transpose-product by M. This last option is described in the "matrix-free" vignette. The matrix-free algorithm also uses a randomised estimator for the trace of crossprod(A). The function sparse.matrixfree constructs an object for matrix-free use of pQF from a sparse Matrix object. The algorithms are described in the Lumley et al. (2018) reference.
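The matrix-free idea and the randomised trace estimate can be sketched in base R. The list layout (mult, tmult, ncol) and the function hutchinson_tr2 are hypothetical names chosen for this illustration, not the structure that sparse.matrixfree returns. The source does not say which estimator the package uses, so the Hutchinson-type estimator with Rademacher probes is also an assumption here.

```r
# Base R sketch of a matrix-free trace estimate (illustrative names only).
set.seed(2)
M <- matrix(rnorm(200 * 30), 200, 30)

# Hypothetical matrix-free representation: two closures, A is never stored
mf <- list(
  mult  = function(x) M %*% x,            # product M x
  tmult = function(y) crossprod(M, y),    # transpose-product t(M) y
  ncol  = ncol(M)
)

# Hutchinson-type estimate of trace(crossprod(A)) with A = crossprod(M)
hutchinson_tr2 <- function(obj, nsample = 500) {
  est <- replicate(nsample, {
    z  <- sample(c(-1, 1), obj$ncol, replace = TRUE)  # Rademacher probe
    Az <- obj$tmult(obj$mult(z))                      # A z without forming A
    sum(Az^2)                                         # z' A'A z
  })
  mean(est)
}

tr2_est <- hutchinson_tr2(mf)

# Exact value for comparison; only feasible because this example is small
A         <- crossprod(M)
tr2_exact <- sum(A^2)   # trace(crossprod(A)) is the squared Frobenius norm
```

Each probe costs one product and one transpose-product by M, so the estimate stays cheap when M is large and sparse and A would be too big to form.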
Finally, there are functions specifically for the SKAT family of genomic tests: SKAT.matrixfree and famSKAT. These take a genotype matrix and an adjustment model as arguments and produce an object that contains the test statistic in its Q component and that can be passed to pQF to extract p-values. The vignette "Checking pQF vs SKAT" compares SKAT.matrixfree to the SKAT package and illustrates how it can be used.
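For orientation, a SKAT-type statistic is a quadratic form in the residuals of the adjustment model. The base R sketch below uses simulated data and is not the code behind SKAT.matrixfree or famSKAT. The Beta(1, 25) weights and the absence of any variance scaling are assumptions made for illustration; scaling conventions differ between implementations.

```r
# Base R sketch of a SKAT-type statistic on simulated data (illustrative only).
set.seed(3)
n <- 300; p <- 20
maf <- runif(p, 0.005, 0.05)                     # toy minor allele frequencies
G <- sapply(maf, function(f) rbinom(n, 2, f))    # n x p genotype matrix (0/1/2)
x <- rnorm(n)                                    # one adjustment covariate
y <- 1 + 0.5 * x + rnorm(n)                      # phenotype with no genetic effect

null_model <- lm(y ~ x)                          # adjustment model
r <- resid(null_model)

# Assumed weights: Beta(1, 25) density at the observed allele frequency
w <- dbeta(colMeans(G) / 2, 1, 25)

# Weighted sum of squared per-variant scores
score <- drop(crossprod(G, r))                   # t(G) %*% r, one entry per variant
Q <- sum((w * score)^2)

# The same statistic written explicitly as a quadratic form r' K r
K    <- G %*% diag(w^2) %*% t(G)
Q_qf <- drop(t(r) %*% K %*% r)
```

The first form needs only a product of t(G) with a vector, which is why the n x n matrix K never has to be built when the genotype matrix is large.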
Author(s)
Thomas Lumley
Maintainer: Thomas Lumley <t.lumley@auckland.ac.nz>
References
Tong Chen, Thomas Lumley (2019) Numerical evaluation of methods approximating the distribution of a large quadratic form in normal variables. Computational Statistics & Data Analysis. 139: 75-81.
Lumley et al. (2018) Sequence kernel association tests for large sets of markers: tail probabilities for large quadratic forms. Genetic Epidemiology. 42(6): 516-527. doi: 10.1002/gepi.22136
Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp (2010) Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. https://arxiv.org/abs/0909.4061.
Lee, S., with contributions from Larisa Miropolsky, and Wu, M. (2015). SKAT: SNP-Set (Sequence) Kernel Association Test. R package version 1.1.2.
Wu, M. C., Lee, S., Cai, T., Li, Y., Boehnke, M., and Lin, X. (2011). Rare-variant association testing for sequencing data with the sequence kernel association test. American Journal of Human Genetics, 89:82-93.