hypotheses {marginaleffects} | R Documentation |
(Non-)Linear Tests for Null Hypotheses, Joint Hypotheses, Equivalence, Non Superiority, and Non Inferiority
Description
Uncertainty estimates are calculated as first-order approximate standard errors for linear or non-linear functions of a vector of random variables with known or estimated covariance matrix. In that sense, hypotheses
emulates the behavior of the excellent and well-established car::deltaMethod and car::linearHypothesis functions, but it supports more models; requires fewer dependencies; expands the range of tests to equivalence and superiority/inferiority; and offers convenience features like robust standard errors.
To learn more, read the hypothesis tests vignette, visit the package website, or scroll down this page for a full list of vignettes:
Warning #1: Tests are conducted directly on the scale defined by the type
argument. For some models, it can make sense to conduct hypothesis or equivalence tests on the "link"
scale instead of the "response"
scale which is often the default.
Warning #2: For hypothesis tests on objects produced by the marginaleffects
package, it is safer to use the hypothesis
argument of the original function. Using hypotheses()
may not work in certain environments, in lists, or when working programmatically with *apply style functions.
Warning #3: The tests assume that the hypothesis
expression is (approximately) normally distributed, which for non-linear functions of the parameters may not be realistic. More reliable confidence intervals can be obtained using the inferences()
function with method = "boot"
.
Usage
hypotheses(
model,
hypothesis = NULL,
vcov = NULL,
conf_level = 0.95,
df = NULL,
equivalence = NULL,
joint = FALSE,
joint_test = "f",
numderiv = "fdforward",
...
)
Arguments
model |
Model object or object generated by the |
hypothesis |
specify a hypothesis test or custom contrast using a numeric value, vector, or matrix; a string equation; string; a formula, or a function.
|
vcov |
Type of uncertainty estimates to report (e.g., for robust standard errors). Acceptable values:
|
conf_level |
numeric value between 0 and 1. Confidence level to use to build a confidence interval. |
df |
Degrees of freedom used to compute p values and confidence intervals. A single numeric value between 1 and |
equivalence |
Numeric vector of length 2: bounds used for the two-one-sided test (TOST) of equivalence, and for the non-inferiority and non-superiority tests. See Details section below. |
joint |
Joint test of statistical significance. The null hypothesis value can be set using the
|
joint_test |
A character string specifying the type of test, either "f" or "chisq". The null hypothesis is set by the |
numderiv |
string or list of strings indicating the method to use to for the numeric differentiation used in to compute delta method standard errors.
|
... |
Additional arguments are passed to the |
Joint hypothesis tests
The test statistic for the joint Wald test is calculated as (R * theta_hat - r)' * inv(R * V_hat * R') * (R * theta_hat - r) / Q, where theta_hat is the vector of estimated parameters, V_hat is the estimated covariance matrix, R is a Q x P matrix for testing Q hypotheses on P parameters, r is a Q x 1 vector for the null hypothesis, and Q is the number of rows in R. If the test is a Chi-squared test, the test statistic is not normalized.
The p-value is then calculated based on either the F-distribution (for F-test) or the Chi-squared distribution (for Chi-squared test). For the F-test, the degrees of freedom are Q and (n - P), where n is the sample size and P is the number of parameters. For the Chi-squared test, the degrees of freedom are Q.
Equivalence, Inferiority, Superiority
\theta
is an estimate, \sigma_\theta
its estimated standard error, and [a, b]
are the bounds of the interval supplied to the equivalence
argument.
Non-inferiority:
-
H_0
:\theta \leq a
-
H_1
:\theta > a
-
t=(\theta - a)/\sigma_\theta
p: Upper-tail probability
Non-superiority:
-
H_0
:\theta \geq b
-
H_1
:\theta < b
-
t=(\theta - b)/\sigma_\theta
p: Lower-tail probability
Equivalence: Two One-Sided Tests (TOST)
p: Maximum of the non-inferiority and non-superiority p values.
Thanks to Russell V. Lenth for the excellent emmeans
package and documentation which inspired this feature.
Examples
library(marginaleffects)
mod <- lm(mpg ~ hp + wt + factor(cyl), data = mtcars)
hypotheses(mod)
# Test of equality between coefficients
hypotheses(mod, hypothesis = "hp = wt")
# Non-linear function
hypotheses(mod, hypothesis = "exp(hp + wt) = 0.1")
# Robust standard errors
hypotheses(mod, hypothesis = "hp = wt", vcov = "HC3")
# b1, b2, ... shortcuts can be used to identify the position of the
# parameters of interest in the output of
hypotheses(mod, hypothesis = "b2 = b3")
# wildcard
hypotheses(mod, hypothesis = "b* / b2 = 1")
# term names with special characters have to be enclosed in backticks
hypotheses(mod, hypothesis = "`factor(cyl)6` = `factor(cyl)8`")
mod2 <- lm(mpg ~ hp * drat, data = mtcars)
hypotheses(mod2, hypothesis = "`hp:drat` = drat")
# predictions(), comparisons(), and slopes()
mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
cmp <- comparisons(mod, newdata = "mean")
hypotheses(cmp, hypothesis = "b1 = b2")
mfx <- slopes(mod, newdata = "mean")
hypotheses(cmp, hypothesis = "b2 = 0.2")
pre <- predictions(mod, newdata = datagrid(hp = 110, mpg = c(30, 35)))
hypotheses(pre, hypothesis = "b1 = b2")
# The `hypothesis` argument can be used to compute standard errors for fitted values
mod <- glm(am ~ hp + mpg, data = mtcars, family = binomial)
f <- function(x) predict(x, type = "link", newdata = mtcars)
p <- hypotheses(mod, hypothesis = f)
head(p)
f <- function(x) predict(x, type = "response", newdata = mtcars)
p <- hypotheses(mod, hypothesis = f)
head(p)
# Complex aggregation
# Step 1: Collapse predicted probabilities by outcome level, for each individual
# Step 2: Take the mean of the collapsed probabilities by group and `cyl`
library(dplyr)
library(MASS)
library(dplyr)
dat <- transform(mtcars, gear = factor(gear))
mod <- polr(gear ~ factor(cyl) + hp, dat)
aggregation_fun <- function(x) {
predictions(x, vcov = FALSE) |>
mutate(group = ifelse(group %in% c("3", "4"), "3 & 4", "5")) |>
summarize(estimate = sum(estimate), .by = c("rowid", "cyl", "group")) |>
summarize(estimate = mean(estimate), .by = c("cyl", "group")) |>
rename(term = cyl)
}
hypotheses(mod, hypothesis = aggregation_fun)
# Equivalence, non-inferiority, and non-superiority tests
mod <- lm(mpg ~ hp + factor(gear), data = mtcars)
p <- predictions(mod, newdata = "median")
hypotheses(p, equivalence = c(17, 18))
mfx <- avg_slopes(mod, variables = "hp")
hypotheses(mfx, equivalence = c(-.1, .1))
cmp <- avg_comparisons(mod, variables = "gear", hypothesis = "pairwise")
hypotheses(cmp, equivalence = c(0, 10))
# joint hypotheses: character vector
model <- lm(mpg ~ as.factor(cyl) * hp, data = mtcars)
hypotheses(model, joint = c("as.factor(cyl)6:hp", "as.factor(cyl)8:hp"))
# joint hypotheses: regular expression
hypotheses(model, joint = "cyl")
# joint hypotheses: integer indices
hypotheses(model, joint = 2:3)
# joint hypotheses: different null hypotheses
hypotheses(model, joint = 2:3, hypothesis = 1)
hypotheses(model, joint = 2:3, hypothesis = 1:2)
# joint hypotheses: marginaleffects object
cmp <- avg_comparisons(model)
hypotheses(cmp, joint = "cyl")