rrtest_clust {RRI}    R Documentation
Residual randomization test under cluster invariances
Description
This function tests the specified linear hypothesis in model, assuming that the errors have some form of cluster invariance determined by type within the clusters determined by clustering.
Usage
rrtest_clust(
  model,
  type,
  clustering = NULL,
  num_R = 999,
  alpha = 0.05,
  val_type = "decision"
)
Arguments
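model | Regression model and hypothesis. See example_model for details.
type | A string selecting the invariance: "perm", "sign" or "double" (see Details).
clustering | A list of datapoint index vectors, one per cluster, or NULL (see Note for the default).
num_R | Number of test statistic values to calculate in the test.
alpha | Nominal test level (between 0 and 1).
val_type | The type of return value: "decision", "pval" or "full" (see Value).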
Details
For the regression y = X * beta + e, this function is testing the following linear null hypothesis:
H0: lam' beta = lam[1] * beta[1] + ... + lam[p] * beta[p] = lam0,
where y, X, lam, lam0 are specified in model.
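As a minimal illustration of how lam and lam0 encode the null hypothesis, the base-R sketch below builds a model list in the format used in the Examples section and evaluates lam' beta at the OLS estimate. The simulated data and the names beta_hat and gap are assumptions made for illustration; nothing here calls the package.

```r
set.seed(1)
n <- 100
X <- cbind(1, (1:n) / n)                    # intercept and one covariate
y <- as.numeric(X %*% c(-1, 0.2) + rnorm(n, sd = 0.5))

# H0: beta2 = 0 is written as lam' beta = lam0 with lam = (0, 1), lam0 = 0.
model <- list(y = y, X = X, lam = c(0, 1), lam0 = 0)

# OLS estimate and the estimated departure from H0: lam' beta_hat - lam0.
# (beta_hat and gap are illustrative names, not package objects.)
beta_hat <- solve(crossprod(model$X), crossprod(model$X, model$y))
gap <- sum(model$lam * beta_hat) - model$lam0
```

With this choice of lam, gap is simply the second OLS coefficient; a different lam (for example c(1, -1)) would test a contrast between coefficients instead.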
The assumption is that the errors, e, have some form of cluster invariance.
Specifically:
- If type = "perm" then the errors are assumed exchangeable within the specified clusters:
  (e_1, e_2, ..., e_n) ~ cluster_perm(e_1, e_2, ..., e_n),
  where ~ denotes equality in distribution, and cluster_perm is any random permutation within the clusters defined by clustering. Internally, the test repeatedly calculates a test statistic by randomly permuting the residuals within clusters.
- If type = "sign" then the errors are assumed sign-symmetric within the specified clusters:
  (e_1, e_2, ..., e_n) ~ cluster_signs(e_1, e_2, ..., e_n),
  where cluster_signs is a random sign flip of residuals on the cluster level. Internally, the test repeatedly calculates a test statistic by randomly flipping the signs of cluster residuals.
- If type = "double" then the errors are assumed both exchangeable and sign-symmetric within the specified clusters:
  (e_1, e_2, ..., e_n) ~ cluster_signs(cluster_perm(e_1, e_2, ..., e_n)).
  Internally, the test repeatedly calculates a test statistic by permuting and randomly flipping the signs of residuals on the cluster level.
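The invariances above can be sketched in base R. This is a simplified illustration, not the package's internal code: it assumes the residuals being randomized come from a least-squares fit restricted to satisfy H0, and that the test statistic is lam' (X'X)^{-1} X' e. The helper functions cluster_perm and cluster_signs borrow their names from the notation above but are written here for illustration, as are tstat, e_r and the two-cluster split cl.

```r
set.seed(2)
n <- 60
X <- cbind(1, (1:n) / n)
y <- as.numeric(X %*% c(0, 0) + rt(n, df = 5))   # H0: beta2 = 0 holds here
lam <- c(0, 1); lam0 <- 0
cl <- list(1:30, 31:60)                          # two hypothetical clusters

# OLS fit, then the least-squares fit restricted to lam' beta = lam0.
XtX_inv <- solve(crossprod(X))
beta_hat <- as.numeric(XtX_inv %*% crossprod(X, y))
v <- as.numeric(XtX_inv %*% lam)
beta_r <- beta_hat - v * (sum(lam * beta_hat) - lam0) / sum(lam * v)
e_r <- as.numeric(y - X %*% beta_r)              # restricted residuals

# Assumed test statistic as a function of an error vector e.
tstat <- function(e) sum(lam * (XtX_inv %*% crossprod(X, e)))

# Within-cluster permutation: shuffle residuals inside each cluster only.
cluster_perm <- function(e, cl) {
  for (idx in cl) e[idx] <- e[idx][sample.int(length(idx))]
  e
}

# Cluster-level sign flip: draw one random sign per cluster.
cluster_signs <- function(e, cl) {
  for (idx in cl) e[idx] <- sample(c(-1, 1), 1) * e[idx]
  e
}

tobs  <- sum(lam * beta_hat) - lam0                       # observed value
tvals <- replicate(999, tstat(cluster_perm(e_r, cl)))     # type = "perm"
# For "sign" use cluster_signs(e_r, cl); for "double" compose the two:
# cluster_signs(cluster_perm(e_r, cl), cl).
```

Comparing tobs against the spread of tvals is what drives the test decision; how the package turns that comparison into a p-value is not specified here, so the sketch stops short of computing one.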
Value
If val_type = "decision" (default), the function returns the binary test decision (1 = REJECT H0).
If val_type = "pval", it returns the test p-value.
If val_type = "full", it returns the full test output, i.e., a List with elements tobs and tvals, the observed and randomization values of the test statistic, respectively.
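The three return types can be requested as in the sketch below, which reuses the setup of the first example in the Examples section. The guard with requireNamespace is an addition for illustration so that the snippet is skipped when the RRI package is not installed; the variable names dec, pval and full are likewise illustrative.

```r
set.seed(123)
n <- 50
X <- cbind(rep(1, n), 1:n / n)
y <- X %*% c(0, 0) + rt(n, df = 5)
model <- list(y = y, X = X, lam = c(0, 1), lam0 = 0)   # H0: beta2 = 0

# Run only if the package is available (assumption: installed from CRAN).
if (requireNamespace("RRI", quietly = TRUE)) {
  dec  <- RRI::rrtest_clust(model, "perm", val_type = "decision")
  pval <- RRI::rrtest_clust(model, "perm", val_type = "pval")
  full <- RRI::rrtest_clust(model, "perm", val_type = "full")
  str(full[c("tobs", "tvals")])   # observed and randomization values
}
```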
Note
If clustering is NULL then it will be assigned a default value:
- list(1:n) if type = "perm", where n is the number of datapoints;
- as.list(1:n) if type = "sign" or "double".
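The two defaults differ in shape, which a short base-R check makes concrete. The names cl_perm and cl_sign are illustrative only.

```r
n <- 5
cl_perm <- list(1:n)      # one cluster containing every datapoint
cl_sign <- as.list(1:n)   # n singleton clusters, one per datapoint

length(cl_perm)           # 1
length(cl_sign)           # 5
```

Read together with Details, this suggests that the default for "perm" permutes residuals across all datapoints, while the default for "sign" or "double" treats each datapoint as its own cluster.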
As in bootstrap, num_R is usually between 1000 and 5000.
See Also
Life after bootstrap: residual randomization inference in regression models (Toulis, 2019)
https://www.ptoulis.com/residual-randomization
Examples
# 1. Validity example
set.seed(123)
n = 50
X = cbind(rep(1, n), 1:n/n)
beta = c(0, 0)
rej = replicate(200, {
y = X %*% beta + rt(n, df=5)
model = list(y=y, X=X, lam=c(0, 1), lam0=0) # H0: beta2 = 0
rrtest_clust(model, "perm")
})
mean(rej) # Should be ~ 5% since H0 is true.
# 2. Heteroskedastic example
set.seed(123)
n = 200
X = cbind(rep(1, n), 1:n/n)
beta = c(-1, 0.2)
ind = c(rep(0, 0.9*n), rep(1, .1*n)) # cluster indicator
y = X %*% beta + rnorm(n, sd= (1-ind) * 0.1 + ind * 5) # heteroskedastic
confint(lm(y ~ X + 0)) # normal OLS does not reject H0: beta2 = 0
cl = list(which(ind==0), which(ind==1))
model = list(y=y, X=X, lam=c(0, 1), lam0=0)
rrtest_clust(model, "sign") # errors are sign symmetric regardless of cluster.
# Cluster sign test does not reject because of noise.
rrtest_clust(model, "perm", cl) # errors are exchangeable within clusters
# Cluster permutation test rejects because inference is sharper.