nb.glm.test {NBPSeq} | R Documentation |
Fit Negative Binomial Regression Model and Test for a Regression Coefficient
Description
For each row of the input data matrix, nb.glm.test
fits an NB log-linear regression model and performs
large-sample tests for a one-dimensional regression
coefficient.
Usage
nb.glm.test(counts, x, beta0, lib.sizes = colSums(counts),
normalization.method = "AH2010", dispersion.model = "NBQ",
tests = c("HOA", "LR", "Wald"), alternative = "two.sided",
subset = 1:dim(counts)[1])
Arguments
counts |
an m by n matrix of RNA-Seq read counts with rows corresponding to gene features and columns corresponding to independent biological samples. |
x |
an n by p design matrix specifying the treatment structure. |
beta0 |
a p-vector specifying the null hypothesis. Non-NA components specify the parameters to test and their null values. |
lib.sizes |
a p-vector of observed library sizes, usually (and by default) estimated by column totals. |
normalization.method |
a character string specifying
the method for estimating the normalization factors, can
be |
dispersion.model |
a character string specifying the dispersion model, and can be one of "NB2", "NBP", "NBQ" (default), "NBS" or "step". |
tests |
a character string vector specifying the
tests to be performed, can be any subset of |
alternative |
a character string specifying the
alternative hypothesis, must be one of |
subset |
specify a subset of rows to perform the test on |
Details
nbp.glm.test
provides a simple, one-stop interface
to performing a series of core tasks in regression analysis
of RNA-Seq data: it calls
estimate.norm.factors
to estimate
normalization factors; it calls
prepare.nb.data
to create an NB data
structure; it calls estimate.dispersion
to
estimate the NB dispersion; and it calls
test.coefficient
to test the regression
coefficient.
To keep the interface simple, nbp.glm.test
provides
limited options for fine tuning models/parameters in each
individual step. For more control over individual steps,
advanced users can call
estimate.norm.factors
,
prepare.nb.data
,
estimate.dispersion
, and
test.coefficient
directly, or even substitute
one or more of them with their own versions.
Value
A list containing the following components:
data |
a
list containing the input data matrix with additional
summary quantities, output from
|
dispersion |
dispersion estimates and models, output
from |
test |
test
results, output from |
Examples
## Load Arabidopsis data
data(arab);
## Specify treatment structure
grp.ids = as.factor(c(1, 1, 1, 2, 2, 2));
x = model.matrix(~grp.ids);
## Specify the null hypothesis
## The null hypothesis is beta[1]=0 (beta[1] is the log fold change).
beta0 = c(NA, 0);
## Fit NB regression model and perform large sample tests.
## The step can take long if the number of genes is large
fit = nb.glm.test(arab, x, beta0, subset=1:50);
## The result contains the data, the dispersion estimates and the test results
print(str(fit));
## Show HOA test results for top ten genes
subset = order(fit$test.results$HOA$p.values)[1:10];
cbind(fit$data$counts[subset,], fit$test.results$HOA[subset,]);
## Show LR test results
subset = order(fit$test.results$LR$p.values)[1:10];
cbind(fit$data$counts[subset,], fit$test.results$LR[subset,]);