rrinf_clust {RRI}    R Documentation

Residual randomization inference based on cluster invariances

Description

This function is a wrapper over rrtest_clust and gives confidence intervals for all parameters assuming a particular cluster invariance on the errors.

Usage

rrinf_clust(
  y,
  X,
  type,
  clustering = NULL,
  cover = 0.95,
  num_R = 999,
  control = list(num_se = 6, num_breaks = 60)
)

Arguments

y

Vector of outcomes (length n)

X

Covariate matrix (n x p). First column should be ones to include intercept.

type

A string, either "perm", "sign" or "double".

clustering

A List that specifies a clustering of datapoint indexes 1, ..., n. See example_clustering for details.

cover

Number in [0, 1] giving the confidence interval coverage (e.g., 0.95 denotes 95%).

num_R

Number of test statistic values to calculate in the randomization test (similar to no. of bootstrap samples).

control

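A List that controls the scope of the test inversion: num_se sets how many standard errors the search extends on each side of the OLS estimate, and num_breaks sets how many intervals that range is divided into.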
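
As an illustration of the clustering argument, the following is a minimal sketch in base R of one way to build such a list from a vector of group labels. The vector group and its labels are hypothetical, and the sketch assumes that each datapoint index should appear in exactly one cluster, as in the Examples section.

```r
# Hypothetical group labels for n = 10 datapoints (illustration only).
group <- c("a", "a", "b", "b", "b", "c", "a", "c", "b", "c")

# split() returns one vector of datapoint indexes per label.
# unname() drops the labels so the result has the same shape as
# list(which(...), which(...)) used in the Examples section.
cl <- unname(split(seq_along(group), group))

print(cl)
```

The resulting list cl has one element per cluster, and together the elements cover the indexes 1, ..., n.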

Details

This function has functionality similar to the standard confint. It generates confidence intervals by testing plausible values for each parameter. The plausible values are generated as follows. For some parameter beta_i we test successively

H0: beta_i = hat_beta_i - num_se * se_i

...up to...

H0: beta_i = hat_beta_i + num_se * se_i

with the search range divided into num_breaks intervals. Here, hat_beta_i is the OLS estimate of beta_i and se_i is its standard error. We then report the minimum and maximum values in this search space which we cannot reject at level alpha = 1 - cover. This forms the desired confidence interval.
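
The search described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data and all variable names are hypothetical, the grid is assumed to have num_breaks + 1 points, and a classical t-test stands in for the residual randomization test that rrinf_clust performs at each null value.

```r
# Toy data; all names here are illustrative, not part of the RRI package.
set.seed(1)
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)

# Values mirroring the defaults control = list(num_se = 6, num_breaks = 60).
num_se <- 6
num_breaks <- 60
alpha <- 1 - 0.95  # test level implied by cover = 0.95

est <- coef(summary(fit))["x", "Estimate"]
se <- coef(summary(fit))["x", "Std. Error"]

# Candidate null values from est - num_se * se up to est + num_se * se.
# Assumption: num_breaks intervals correspond to num_breaks + 1 grid points.
grid <- seq(est - num_se * se, est + num_se * se, length.out = num_breaks + 1)

# Stand-in test (an assumption): two-sided t-test p-value for H0: beta = b0.
# rrinf_clust would use a residual randomization test here instead.
pval <- 2 * pt(-abs((est - grid) / se), df = fit$df.residual)

# Interval endpoints: smallest and largest null values not rejected.
accepted <- grid[pval > alpha]
ci <- range(accepted)

print(c(estimate = est, lower = ci[1], upper = ci[2]))
```

Because the endpoints are restricted to grid points, the reported interval can only be as precise as the grid spacing, which is 2 * num_se * se_i / num_breaks under the assumptions above.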

Value

Matrix that includes the OLS estimate and the confidence interval endpoints for each parameter.

Note

If the confidence interval appears to be a point or is empty, then this means that the null values in the search space are implausible. We can try to improve the search through the control argument. For example, we can increase num_se to widen the search, and increase num_breaks to make the search space finer.

See rrtest_clust for a description of type and clustering.

See Also

Life after bootstrap: residual randomization inference in regression models (Toulis, 2019)

https://www.ptoulis.com/residual-randomization

Examples

# Heterogeneous example
set.seed(123)
n = 200
X = cbind(rep(1, n), 1:n/n)
beta = c(-1, 0.2)
ind = c(rep(0, 0.9*n), rep(1, .1*n))  # cluster indicator
y = X %*% beta + rnorm(n, sd= (1-ind) * 0.1 + ind * 5) # heteroskedastic
confint(lm(y ~ X + 0))  # normal OLS CI is imprecise

cl = list(which(ind==0), which(ind==1))  #  define the clustering
rrinf_clust(y, X, "perm", cl)  # improved CI through clustered errors


[Package RRI version 1.1 Index]