BLlo {MGBT} | R Documentation |
Barnett and Lewis Test Adjusted for Low Outliers
Description
Compute the Barnett and Lewis (1995, p. 224; T_{\mathrm{N}3}
) so-labeled “N3 method” statistic with the TAC (T.A. Cohn) adjustment to look for low outliers. The essence of the method, given the order statistics x_{[1:n]} \le x_{[2:n]} \le \cdots \le x_{[(n-1):n]} \le x_{[n:n]}
, is the statistic
BL_r = T_{\mathrm{N}3} =
\frac{ \sum_{i=1}^r x_{[i:n]} - r \times \mathrm{mean}\{x_{[1:n]}\} }
{\sqrt{\mathrm{var}\{x_{[1:n]}\}}}\mbox{,}
for the mean and variance of the observations. Barnett and Lewis (1995, p. 218) brand this statistic as a test of “k \ge 2
upper outliers”, but for the MGBT package “lower” applies in the TAC reformulation. Barnett and Lewis (1995, p. 218) show an example of a modification for two low outliers as (2\overline{x} - x_{[2:n]} - x_{[1:n]})/s
for the mean \overline{x}
and standard deviation s
. The TAC reformulation thus differs by a sign. The BL_r
is a sum of internally studentized deviations from the mean, for which the following bound holds:
SP(t) \le {n \choose r} P\biggl(\bm{t}(n-2) > \biggl[\frac{n(n-2)t^2}{r(n-r)(n-1)-nt^2}\biggr]^{1/2}\biggr)\mbox{,}
where \bm{t}(df)
is the t-distribution for df
degrees of freedom, and this is an equality when
t \ge \sqrt{r^2(n-1)(n-r-1)/(nr+n)}\mbox{,}
where SP(t)
is the probability that T_{\mathrm{N}3} > t
when the inequality holds. For reference, the critical values tabulated by Barnett and Lewis (1995, p. 491) for n=10
and k \in \{2,3,4\}
at the 5-percent significance level are 3.18
, 3.82
, and 4.17
, respectively. One of these is evaluated in the Examples.
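As an illustration only, the statistic displayed above can be sketched in base R. The helper name bl_r_sketch and the sample vector are hypothetical and are not part of MGBT; the sketch assumes the mean and variance are taken over all n observations and that the variance uses the n-1 denominator of var(). It is not TAC's source code.

```r
# Hypothetical helper (not part of MGBT): BL_r computed directly from the
# displayed formula. Assumes mean and variance are over all observations.
bl_r_sketch <- function(x, r) {
  xs <- sort(x)                        # order statistics, smallest first
  (sum(xs[1:r]) - r * mean(xs)) / sqrt(var(xs))
}

# Made-up data, for illustration only
x <- c(2.1, 3.4, 3.9, 4.2, 4.8, 5.0, 5.3, 5.9, 6.1, 6.6)
bl_lo <- bl_r_sketch(x, r = 2)         # sum of the two lowest, TAC sign

# Barnett and Lewis form for two low outliers: (2*xbar - x[2:n] - x[1:n])/s
xs <- sort(x)
bl_two <- (2 * mean(x) - xs[2] - xs[1]) / sd(x)

print(c(TAC_form = bl_lo, BL_form = bl_two))
```

Under these assumptions the two printed values have the same magnitude and opposite sign, which is the sign difference noted in the Description.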
Usage
BLlo(x, r, n=length(x))
Arguments
x |
The data values; note that base-10 logarithms of these are not computed internally; |
r |
The number of truncated observations; and |
n |
The number of observations. |
Value
The value for BL_r
.
Note
Regarding n=length(x)
, it is not clear whether TAC intended n
to ever differ from the sample size. TAC chose not to determine the length of x
internally to the function but to have it available as an argument. Also MGBTcohn2011
and RSlo
were designed similarly.
Author(s)
W.H. Asquith consulting T.A. Cohn sources
Source
LowOutliers_jfe(R).txt
and LowOutliers_wha(R).txt
—Named BL_N3
References
Barnett, Vic, and Lewis, Toby, 1995, Outliers in statistical data: Chichester, John Wiley and Sons, ISBN 0-471-93094-6.
Cohn, T.A., 2013–2016, Personal communication of original R source code: U.S. Geological Survey, Reston, Va.
See Also
Examples
# See Examples under RSlo()
# WHA experiments with BL_r()
n <- 10; r <- 3; nsim <- 10000; alpha <- 0.05; Tcrit <- 3.82
BLs <- Ho <- RHS <- SPt <- rep(NA, nsim)
EQ <- sqrt(r^2*(n-1)*(n-r-1)/(n*r+n))
for(i in seq_len(nsim)) { # some simulation results shown below
  BLs[i] <- abs(BLlo(rnorm(n), r)) # abs() correcting TAC sign convention
  # argument of the t-distribution in the bound (named tt to avoid masking base::t)
  tt <- sqrt( (n*(n-2)*BLs[i]^2) / (r*(n-r)*(n-1)-n*BLs[i]^2) )
  RHS[i] <- choose(n,r)*pt(tt, n-2, lower.tail=FALSE) # right-hand side of the bound
  SPt[i] <- ifelse(tt >= EQ, RHS[i], 1) # set SP(t) to unity?
  Ho[i]  <- BLs[i] > Tcrit
}
results <- c(quantile(BLs, probs=1-alpha), sum(Ho)/nsim, sum(SPt < alpha)/nsim)
names(results) <- c("Critical_value", "Ho_rejected", "Coverage_SP(t)")
print(results) # minor differences are because of random number seeding
# Critical_value Ho_rejected Coverage_SP(t)
# 3.817236 0.048200 0.050100
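A further sketch, not from the TAC or WHA sources, evaluates the right-hand side of the SP(t) bound from the Description at the three tabulated 5-percent critical values for n=10. The helper name sp_bound_sketch is hypothetical, and returning NA when the denominator is not positive is an assumption made here for safety rather than documented behavior.

```r
# Hypothetical helper (not part of MGBT): right-hand side of the SP(t) bound,
# choose(n, r) * P( t(n-2) > [ n(n-2)t^2 / (r(n-r)(n-1) - n t^2) ]^(1/2) ).
sp_bound_sketch <- function(t, n, r) {
  den <- r * (n - r) * (n - 1) - n * t^2
  if (den <= 0) return(NA_real_)       # assumption: outside the usable range
  tt <- sqrt(n * (n - 2) * t^2 / den)
  choose(n, r) * pt(tt, df = n - 2, lower.tail = FALSE)
}

# Critical values quoted in the Description for n = 10 and r = 2, 3, 4
n <- 10
crit <- c("2" = 3.18, "3" = 3.82, "4" = 4.17)
bounds <- sapply(names(crit), function(k)
  sp_bound_sketch(crit[[k]], n = n, r = as.integer(k)))
print(round(bounds, 3))
```

If the tabulated values are consistent with the bound, each printed value should fall near the nominal 0.05 level; the sketch checks only the formula and says nothing about the simulation above.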