binspwc {binsreg} | R Documentation |
Data-Driven Pairwise Group Comparison using Binscatter Methods
Description
binspwc implements hypothesis testing procedures for pairwise group comparison of binscatter estimators
and plots confidence bands for the difference in binscatter parameters between each pair of groups, following the
results in Cattaneo, Crump, Farrell and Feng (2024a) and Cattaneo, Crump, Farrell and Feng (2024b).
If the binning scheme is not set by the user, the companion function binsregselect
is used to implement binscatter in a data-driven way. Binned scatter plots based on different methods
can be constructed using the companion functions binsreg, binsqreg or binsglm.
Hypothesis testing for parametric functional forms of and shape restrictions on the regression function of interest can
be conducted via the companion function binstest.
Usage
binspwc(y, x, w = NULL, data = NULL, estmethod = "reg",
family = gaussian(), quantile = NULL, deriv = 0, at = NULL,
nolink = F, by = NULL, pwc = NULL, testtype = "two-sided",
lp = Inf, bins = NULL, bynbins = NULL, binspos = "qs",
pselect = NULL, sselect = NULL, binsmethod = "dpi", nbinsrot = NULL,
samebinsby = FALSE, randcut = NULL, nsims = 500, simsgrid = 20,
simsseed = NULL, vce = NULL, cluster = NULL, asyvar = F,
dfcheck = c(20, 30), masspoints = "on", weights = NULL,
subset = NULL, numdist = NULL, numclust = NULL, estmethodopt = NULL,
plot = FALSE, dotsngrid = 0, plotxrange = NULL, plotyrange = NULL,
colors = NULL, symbols = NULL, level = 95, ...)
Arguments
y |
outcome variable. A vector. |
x |
independent variable of interest. A vector. |
w |
control variables. A matrix, a vector or a |
data |
an optional data frame containing variables used in the model. |
estmethod |
estimation method. The default is estmethod = "reg". |
family |
a description of the error distribution and link function to be used in the generalized linear model when |
quantile |
the quantile to be estimated. A number strictly between 0 and 1. |
deriv |
derivative order of the regression function for estimation, testing and plotting.
The default is deriv = 0. |
at |
value of |
nolink |
if true, the function within the inverse link function is reported instead of the conditional mean function for the outcome. |
by |
a vector containing the group indicator for subgroup analysis; both numeric and string variables
are supported. When |
pwc |
a vector or a logical value. If |
testtype |
type of pairwise comparison test. The default is testtype = "two-sided". |
lp |
an Lp metric used for pairwise comparison tests. The default is lp = Inf. |
bins |
A vector. If |
bynbins |
a vector of the number of bins for partitioning/binning of |
binspos |
position of binning knots. The default is binspos = "qs". |
pselect |
vector of numbers within which the degree of polynomial |
sselect |
vector of numbers within which the number of smoothness constraints |
binsmethod |
method for data-driven selection of the number of bins. The default is binsmethod = "dpi". |
nbinsrot |
initial number of bins value used to construct the DPI number of bins selector. If not specified, the data-driven ROT selector is used instead. |
samebinsby |
if true, a common partitioning/binning structure across all subgroups specified by the option |
randcut |
upper bound on a uniformly distributed variable used to draw a subsample for bins/degree/smoothness selection.
Observations for which |
nsims |
number of random draws for hypothesis testing. The default is nsims = 500. |
simsgrid |
number of evaluation points of an evenly-spaced grid within each bin used for evaluation of
the supremum (infimum or Lp metric) operation needed to construct hypothesis testing
procedures. The default is simsgrid = 20. |
simsseed |
seed for simulation. |
vce |
procedure to compute the variance-covariance matrix estimator. For least squares regression and generalized linear regression, the allowed options are the same as that for |
cluster |
cluster ID. Used to compute cluster-robust standard errors. |
asyvar |
if true, the standard error of the nonparametric component is computed and the uncertainty related to control
variables is omitted. Default is asyvar = FALSE. |
dfcheck |
adjustments for minimum effective sample size checks, which take into account number of unique
values of |
masspoints |
how mass points in
|
weights |
an optional vector of weights to be used in the fitting process. Should be |
subset |
optional rule specifying a subset of observations to be used. |
numdist |
number of distinct values for selection. Used to speed up computation. |
numclust |
number of clusters for selection. Used to speed up computation. |
estmethodopt |
a list of optional arguments used by |
plot |
if true, the confidence bands for all pairwise group comparisons (the difference between each pair of groups) are plotted.
The degree and smoothness of polynomials used to construct the bands are the same as those specified for testing. The default is plot = FALSE. |
dotsngrid |
number of dots to be added to the plot for confidence bands. Given the choice, these dots are point estimates of the difference between groups
evaluated over an evenly-spaced grid within the common support of all groups. The default is dotsngrid = 0. |
plotxrange |
a vector. |
plotyrange |
a vector. |
colors |
an ordered list of colors for plotting the difference between each pair of groups. |
symbols |
an ordered list of symbols for plotting the difference between each pair of groups. |
level |
nominal confidence level for confidence band estimation. Default is level = 95. |
... |
optional arguments to control bootstrapping if |
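The roles of binspos and simsgrid listed above can be illustrated with a small base-R sketch. It is not the package's internal code: it assumes that "qs" refers to quantile-spaced knots, and the number of bins (nbins), the helper grid_in_bin and the difference function diff_fun are hypothetical names introduced only for illustration. The normalization of the L2-type quantity is likewise an assumption.

```r
# Illustrative sketch only; this is NOT how binsreg is implemented internally.
set.seed(1)
x <- runif(500)
nbins    <- 5    # hypothetical number of bins
simsgrid <- 20   # evaluation points per bin, matching the default in Usage

# Quantile-spaced knots (assumed meaning of binspos = "qs"):
# each bin holds roughly the same number of observations.
knots <- quantile(x, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)

# Hypothetical helper: n evenly spaced interior points of the bin (lo, hi).
grid_in_bin <- function(lo, hi, n) {
  lo + (hi - lo) * seq_len(n) / (n + 1)
}
grid <- unlist(Map(grid_in_bin, head(knots, -1), tail(knots, -1), simsgrid))

# Hypothetical difference between two group curves, for illustration only.
diff_fun <- function(u) sin(u) - u

# The supremum is approximated by a maximum over the evaluation grid.
sup_stat <- max(abs(diff_fun(grid)))          # sup-type metric (lp = Inf)
l2_stat  <- sqrt(mean(diff_fun(grid)^2))      # an L2-type analogue (assumed scaling)
```

A finer grid (larger simsgrid) approximates the supremum more closely at a higher computational cost.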
Value
stat |
A matrix. Each row corresponds to the comparison between two groups. The first column is the test statistic. The second and third columns give the corresponding group numbers.
The null hypothesis is |
pval |
A vector of p-values for all pairwise group comparisons. |
bins_plot |
A |
data.plot |
A list containing data for plotting. Each item is a sublist of data frames for comparison between each pair of groups. Each sublist may contain the following data frames:
|
cval.cb |
A vector of critical values for all pairwise group comparisons. |
imse.var.rot |
Variance constant in IMSE expansion, ROT selection. |
imse.bsq.rot |
Bias constant in IMSE expansion, ROT selection. |
imse.var.dpi |
Variance constant in IMSE expansion, DPI selection. |
imse.bsq.dpi |
Bias constant in IMSE expansion, DPI selection. |
opt |
A list containing options passed to the function, as well as |
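As a hedged illustration of the returned components described above, the following sketch runs a two-group comparison and reads the test statistic and p-value from the result. The variable names est and grp and the seed values are arbitrary choices for this example; it assumes the binsreg package is installed.

```r
library(binsreg)

set.seed(123)
x   <- runif(500)
y   <- sin(x) + rnorm(500)
grp <- 1 * (runif(500) > 0.5)   # two groups, coded 0 and 1

# simsseed fixes the simulation draws so the p-value is reproducible.
est <- binspwc(y, x, by = grp, simsseed = 42)

est$stat   # one row per pair: test statistic, then the two group numbers
est$pval   # p-value for each pairwise comparison
```

With two groups there is a single pair, so stat has one row and pval holds one value.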
Author(s)
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Richard K. Crump, Federal Reserve Bank of New York, New York, NY. richard.crump@ny.frb.org.
Max H. Farrell, UC Santa Barbara, Santa Barbara, CA. mhfarrell@gmail.com.
Yingjie Feng (maintainer), Tsinghua University, Beijing, China. fengyingjiepku@gmail.com.
References
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024a: On Binscatter. American Economic Review 114(5): 1488-1514.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024b: Nonlinear Binscatter Methods. Working Paper.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2024c: Binscatter Regressions. Working Paper.
See Also
binsreg, binsqreg, binsglm, binsregselect, binstest.
Examples
library(binsreg)
x <- runif(500); y <- sin(x)+rnorm(500); t <- 1*(runif(500)>0.5)
## Pairwise comparison of the two groups defined by t
binspwc(y,x, by=t)
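A further sketch, using only arguments listed in Usage, requests the confidence band plot for the difference between the two groups. The names grp and est and the choice dotsngrid = 20 are illustrative assumptions, and the example presumes the plotting dependencies of binsreg are available.

```r
library(binsreg)

set.seed(2024)
x   <- runif(500)
y   <- sin(x) + rnorm(500)
grp <- 1 * (runif(500) > 0.5)

# plot = TRUE requests confidence bands for the pairwise difference;
# dotsngrid adds point estimates of the difference on an evenly spaced grid.
est <- binspwc(y, x, by = grp, plot = TRUE, dotsngrid = 20, simsseed = 1)

est$bins_plot   # the plot object returned alongside the test results
```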