saturated_init {robust2sls} | R Documentation |
Saturated 2SLS (split-sample initial estimator)
Description
saturated_init splits the sample into two sub-samples. The 2SLS model is estimated on each sub-sample, and the estimates from one sub-sample are used to calculate the residuals, and hence to classify outliers, in the other sub-sample.
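The cross-use of estimates can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper `tsls_coef()`, the simulated data, the even split in the original order, and the standardisation by a single overall standard deviation are all assumptions made for illustration.

```r
# Hypothetical helper (not part of robust2sls): 2SLS coefficients for a
# model with one endogenous regressor x and one instrument z,
# computed with two lm() stages.
tsls_coef <- function(d) {
  d$x_hat <- fitted(lm(x ~ z, data = d))    # first stage
  unname(coef(lm(y ~ x_hat, data = d)))     # second stage: intercept, slope
}

# Simulated data (assumption): x is endogenous because it depends on u
set.seed(1)
n <- 200
z <- rnorm(n)
u <- rnorm(n)
x <- 0.8 * z + 0.5 * u + rnorm(n)
y <- 1 + 2 * x + u
dat <- data.frame(y = y, x = x, z = z)

# Split into two halves in the original order
idx1 <- seq_len(n / 2)
idx2 <- setdiff(seq_len(n), idx1)

# Estimate the model separately on each sub-sample
b1 <- tsls_coef(dat[idx1, ])
b2 <- tsls_coef(dat[idx2, ])

# Residuals of each sub-sample use the estimates from the OTHER sub-sample
res <- numeric(n)
res[idx1] <- dat$y[idx1] - (b2[1] + b2[2] * dat$x[idx1])
res[idx2] <- dat$y[idx2] - (b1[1] + b1[2] * dat$x[idx2])

# Simplified standardisation (assumption) and classification
std_res <- res / sd(res)
outlier <- abs(std_res) > 1.96
```

Because each observation's residual is computed from estimates that did not use that observation, an outlier cannot pull the fit towards itself and mask its own residual.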
Usage
saturated_init(data, formula, cutoff, shuffle, shuffle_seed, split = 0.5)
Arguments
data |
A dataframe. |
formula |
A formula in the format |
cutoff |
A numeric cutoff value used to judge whether an observation is an outlier or not. If the absolute value of an observation's standardised residual is larger than the cutoff value, the observation is classified as an outlier. |
shuffle |
A logical value ( |
shuffle_seed |
A numeric value that sets the seed for shuffling the
data set before splitting it. Only used if |
split |
A numeric value strictly between 0 and 1 that determines in which proportions the sample will be split. |
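How shuffle, shuffle_seed and split might interact can be sketched in base R. The helper `split_indices()` is hypothetical, and both the use of `floor()` to size the first sub-sample and the use of `sample()` for shuffling are assumptions; the package may partition the data differently.

```r
# Hypothetical helper (not part of robust2sls): returns the row indices
# of the two sub-samples.
split_indices <- function(n, split = 0.5, shuffle = FALSE, shuffle_seed = NULL) {
  stopifnot(split > 0, split < 1)           # split must be strictly in (0, 1)
  idx <- seq_len(n)
  if (shuffle) {
    # Setting the seed makes the random split reproducible
    if (!is.null(shuffle_seed)) set.seed(shuffle_seed)
    idx <- sample(idx)
  }
  n1 <- floor(n * split)                    # size of first sub-sample (assumption)
  list(first = head(idx, n1), second = tail(idx, n - n1))
}

parts <- split_indices(20, split = 0.25, shuffle = TRUE, shuffle_seed = 42)
lengths(parts)   # sizes of the two sub-samples
```

With shuffle = FALSE the same helper would simply cut the data in its original order, which may matter if the rows are sorted in a meaningful way.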
Value
saturated_init returns a list with five elements. The first four are vectors whose length equals the number of observations in the data set. Unlike the residuals stored in a model object (usually accessible via model$residuals), these vectors do not drop observations where any of y, x or z are missing. The corresponding entries are instead set to NA.
The first element is a double vector containing the residuals for each observation based on the model estimates. The second element contains the standardised residuals. The third element is a logical vector that is TRUE if the observation is judged as not outlying, FALSE if it is an outlier, and NA if any of y, x, or z are missing. The fourth element is an integer vector with three values: 0 if the observation is judged to be an outlier, 1 if not, and -1 if missing. The fifth and last element is a list with the two initial ivreg model objects based on the two different sub-samples.
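The relationship between the first four elements can be sketched on a toy residual vector. The variable names `res`, `stdres`, `sel` and `type`, the scale `sigma`, and the cutoff choice are assumptions for illustration and are not taken from the package.

```r
# Toy residuals; the NA stands for an observation with missing y, x or z
res <- c(0.3, -2.9, NA, 1.1, 0.2)
sigma <- 1                       # assumed residual scale
cutoff <- qnorm(0.975)           # two-sided 5% normal cutoff, about 1.96

stdres <- res / sigma            # standardised residuals
sel <- abs(stdres) <= cutoff     # TRUE = not outlying, FALSE = outlier, NA = missing

# Integer coding: 1 = not outlying, 0 = outlier, -1 = missing
type <- ifelse(is.na(sel), -1L, as.integer(sel))
```

Here the second observation is flagged as an outlier and the third is coded as missing, while all vectors keep the full length of the data.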
Warning
The estimator may have poor properties if the split is too unequal and the sample size is not large enough.