m_test {robnptests} | R Documentation |
Two sample location test based on M-estimators
Description
m_test performs a two-sample location test based on an M-estimator.
Usage
m_test(
x,
y,
alternative = c("two.sided", "greater", "less"),
delta = ifelse(scale.test, 1, 0),
method = c("asymptotic", "permutation", "randomization"),
psi = c("huber", "hampel", "bisquare"),
k = robustbase::.Mpsi.tuning.default(psi),
n.rep = 10000,
na.rm = FALSE,
scale.test = FALSE,
wobble.seed = NULL,
...
)
Arguments
x |
a (non-empty) numeric vector of data values. |
y |
a (non-empty) numeric vector of data values. |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater", or "less". |
delta |
a numeric value indicating the true difference in the location or
scale parameter, depending on whether the test should be performed
for a difference in location or in scale. The default is
delta = 0 for a location test and delta = 1 for a scale test (scale.test = TRUE). |
method |
a character string specifying how the p-value is computed with
possible values "asymptotic", "permutation", and "randomization". |
psi |
kernel (psi-function) used for optimization.
Must be one of "huber", "hampel", or "bisquare". |
k |
tuning parameter(s) for the respective kernel function,
defaults to parameters implemented in robustbase::.Mpsi.tuning.default(psi). |
n.rep |
an integer value specifying the number of random splits used to
calculate the randomization distribution if method = "randomization". The default is n.rep = 10000. |
na.rm |
a logical value indicating whether NA values in x and y should be removed before the computation proceeds. The default is na.rm = FALSE. |
scale.test |
a logical value to specify if the samples should be compared
for a difference in scale. The default is scale.test = FALSE. |
wobble.seed |
an integer value used as a seed for the random number
generation in case that scale.test = TRUE and one of the samples x or y contains zeros. |
... |
additional arguments |
Details
The test statistic for this test is based on the difference of the M-estimates
of location of x and y, see m_est.
Three different psi-functions can be used: huber, hampel, and bisquare. The
corresponding tuning parameter(s) can be set by the argument k of the function.
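To make the role of the psi-function and its tuning constant more concrete, the following is a minimal base-R sketch of a Huber M-estimate of location computed by iteratively reweighted means. It is not the implementation behind m_est: the helper name huber_location is hypothetical, the scale is assumed to be fixed at the MAD, and k = 1.345 is assumed here as the usual Huber tuning constant.

```r
# Hypothetical helper, for illustration only (not the package's m_est()).
# Huber M-estimate of location with the scale held fixed at the MAD.
huber_location <- function(x, k = 1.345, tol = 1e-8, max.it = 100) {
  s <- mad(x)                      # robust scale estimate, assumed > 0
  mu <- median(x)                  # robust starting value
  for (i in seq_len(max.it)) {
    r <- (x - mu) / s              # standardized residuals
    w <- pmin(1, k / abs(r))       # Huber weights: 1 inside [-k, k], k / |r| outside
    mu.new <- sum(w * x) / sum(w)  # weighted mean with the current weights
    if (abs(mu.new - mu) < tol * s) break
    mu <- mu.new
  }
  mu.new
}

# A single gross outlier barely moves the M-estimate, unlike the mean
x.out <- c(1:9, 1000)
est.out <- huber_location(x.out)
est.sym <- huber_location(c(-3, -1, 0, 1, 3))
```

Smaller values of k downweight more observations and move the estimate towards the median, while larger values move it towards the mean.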
The estimate for the location difference is scaled by a pooled estimate for
the standard deviation. This estimate is based on the
tau-estimate of scale (Maronna and Zamar 2002) and is computed with the default
parameter settings of the function scaleTau2. These can be changed
by setting c1 and c2.
More details on the construction of the test statistic are given in the
vignettes vignette("robnptests") and vignette("m_tests").
Three versions of the test are implemented: randomization, permutation, and asymptotic.
The randomization distribution is based on randomly drawn splits with
replacement. The function permp (Phipson and Smyth 2010)
is used to calculate the p-value. The psi-function for the M-estimate
is computed with the implementations in the package robustbase.
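The randomization principle can be sketched in base R as follows. This is an illustration under stated assumptions, not the package code: randomization_p is a hypothetical helper, the difference of sample medians stands in for the scaled M-estimate difference, and the p-value uses the simple (b + 1) / (n.rep + 1) estimator discussed by Phipson and Smyth (2010) rather than the exact calculation performed by permp.

```r
# Hypothetical helper, for illustration only.
# Two-sided randomization p-value from independently drawn random splits.
randomization_p <- function(x, y, n.rep = 1000,
                            stat = function(a, b) median(a) - median(b)) {
  m <- length(x)
  pooled <- c(x, y)
  obs <- stat(x, y)                        # statistic on the observed split
  reps <- replicate(n.rep, {
    idx <- sample(length(pooled), m)       # random split; the same split may recur
    stat(pooled[idx], pooled[-idx])
  })
  b <- sum(abs(reps) >= abs(obs))          # splits at least as extreme as observed
  (b + 1) / (n.rep + 1)                    # never exactly zero
}

set.seed(108)
p.shift <- randomization_p(1:10, 101:110, n.rep = 199)
```

Because the splits are drawn independently, the smallest attainable p-value is 1 / (n.rep + 1), so n.rep limits the resolution of the test.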
For the asymptotic test, the distribution of the test statistic is approximated
by a standard normal distribution.
However, this is only justified under the normality assumption. When the
observations do not come from a normal distribution, the tests might not keep
the desired significance level. Simulations indicate that the level is kept
under symmetric distributions if the variance exists. Under skewed
distributions, it tends to be anti-conservative, see
vignette("m_tests"). The test statistic can be corrected by a
factor which has to be determined individually for a specific distribution in
such cases.
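Under the standard normal approximation, the p-value for each alternative follows directly from the value of the test statistic. The sketch below assumes that the statistic is built from the estimate for x minus the estimate for y, so that "greater" corresponds to large positive values; asymptotic_p is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper, for illustration only.
# p-value of a test statistic under a standard normal approximation.
asymptotic_p <- function(stat, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = 2 * pnorm(abs(stat), lower.tail = FALSE),
    greater   = pnorm(stat, lower.tail = FALSE),
    less      = pnorm(stat)
  )
}

p.two <- asymptotic_p(qnorm(0.975))          # two-sided p-value at the 97.5% quantile
p.greater <- asymptotic_p(0, "greater")      # statistic at the center of the distribution
```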
For scale.test = TRUE, the test compares the two samples for a difference
in scale. This is achieved by log-transforming the original squared observations,
i.e. x is replaced by log(x^2) and y by log(y^2).
A potential scale difference then appears as a location difference between
the transformed samples, see Fried (2012).
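The effect of the transformation can be checked with a small base-R illustration. For simplicity, and only to make the result deterministic, y is constructed here as an exact rescaling of x by the factor 2, which is an assumption of this sketch and not how real samples arise. The squared scale ratio of 4 then shows up as a location shift of log(4) between the transformed samples.

```r
set.seed(42)
z <- rnorm(100)
x <- z            # sample with scale 1
y <- 2 * z        # same values rescaled by 2 (deterministic illustration only)

tx <- log(x^2)    # transformed samples as used for scale.test = TRUE
ty <- log(y^2)

# The scale ratio 2 becomes a location shift of log(2^2) on the log scale
shift <- median(ty) - median(tx)
check <- isTRUE(all.equal(shift, log(2^2)))
```

This also shows why zeros are problematic: log(0^2) is -Inf, so the transformed sample would no longer be finite.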
Note that the samples need to have equal locations. The sample should not
contain zeros to prevent problems with the necessary log-transformation. If
it contains zeros, uniform noise is added to all variables in order to remove
zeros and a message is printed.
If the sample has been modified because of zeros when scale.test = TRUE,
the modified samples can be retrieved using
set.seed(wobble.seed); wobble(x, y)
Both samples need to contain at least 5 non-missing values.
Value
A named list with class "htest" containing the following components:
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom for the test statistic. |
p.value |
the p-value for the test. |
estimate |
the M-estimates of x and y. |
null.value |
the specified hypothesized value of the mean difference/squared scale ratio. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating how the p-value was computed. |
data.name |
a character string giving the names of the data. |
References
Fried R (2012). “On the online estimation of piecewise constant volatilities.” Computational Statistics & Data Analysis, 56(11), 3080–3090. doi:10.1016/j.csda.2011.02.012.
Maronna RA, Zamar RH (2002). “Robust estimates of location and dispersion of high-dimensional datasets.” Technometrics, 44(4), 307–317. doi:10.1198/004017002188618509.
Phipson B, Smyth GK (2010). “Permutation p-values should never be zero: Calculating exact p-values when permutations are randomly drawn.” Statistical Applications in Genetics and Molecular Biology, 9(1), Article 39. doi:10.2202/1544-6115.1585.
Examples
# Generate random samples
set.seed(108)
x <- rnorm(20); y <- rnorm(20)
# Asymptotic test based on Huber M-estimator
m_test(x, y, method = "asymptotic", psi = "huber")
## Not run:
# Randomization test based on Hampel M-estimator with 1000 random permutations
# drawn with replacement
m_test(x, y, method = "randomization", n.rep = 1000, psi = "hampel")
## End(Not run)