Hotelling's multivariate version of the 2 sample t-test for Euclidean data {mvhtests} | R Documentation |
Hotelling's multivariate version of the 2 sample t-test for Euclidean data
Description
Hotelling's test for testing the equality of two Euclidean population mean vectors.
Usage
hotel2T2(x1, x2, a = 0.05, R = 999, graph = FALSE)
Arguments
x1 |
A matrix containing the Euclidean data of the first group. |
x2 |
A matrix containing the Euclidean data of the second group. |
a |
The significance level, set to 0.05 by default. |
R |
If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned. |
graph |
A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted. |
Details
The first case scenario is when we assume equality of the two covariance matrices. This is called the two-sample Hotelling's T^2 test (Mardia, Kent and Bibby, 1979, pg. 131-140; Everitt, 2005, pg. 139). The test statistic is defined as
T^2=\frac{n_1n_2}{n_1+n_2}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right)^T{\bf S}^{-1}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right),
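where n_1 and n_2 are the two sample sizes, \bar{{\bf X}}_1 and \bar{{\bf X}}_2 are the two sample mean vectors, and {\bf S} is the pooled covariance matrix, calculated from the sample covariance matrices {\bf S}_1 and {\bf S}_2 of the two groups under the assumption of equal covariance matrices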
{\bf S}=\frac{\left(n_1-1\right){\bf S}_1+\left(n_2-1\right){\bf S}_2}{n_1+n_2-2}.
Under H_0 the statistic F given by
F=\frac{\left( n_1+n_2-p-1 \right)T^2}{\left(n_1+n_2-2 \right)p}
follows the F distribution with p and n_1+n_2-p-1 degrees of freedom, where p is the number of variables. Similar to the one-sample test, an extra argument (R) indicates whether bootstrap calibration should be used or not. If R=1, the classical p-value based on the F distribution is returned; if R>1, the bootstrap p-value is returned and the number of re-samples is equal to R. To transform the data so that the null hypothesis holds, the bootstrap uses the mean vector of the combined sample, that is, of all the observations, as the estimate of the common mean.
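The formulas and the bootstrap description above can be sketched in base R as follows. This is a minimal illustration, not the code of hotel2T2: the function names hotelling_t2 and boot_t2 are hypothetical, and the bootstrap p-value formula (count + 1)/(R + 1) is an assumption about how the calibration might be done.

```r
# Hypothetical helper: pooled-covariance two-sample Hotelling T^2 and its F version.
hotelling_t2 <- function(x1, x2) {
  n1 <- nrow(x1); n2 <- nrow(x2); p <- ncol(x1)
  d <- colMeans(x1) - colMeans(x2)                    # difference of the mean vectors
  S <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / (n1 + n2 - 2)  # pooled covariance
  T2 <- n1 * n2 / (n1 + n2) * sum(d * solve(S, d))    # scaled quadratic form d' S^{-1} d
  Fstat <- (n1 + n2 - p - 1) * T2 / ((n1 + n2 - 2) * p)
  df <- c(p, n1 + n2 - p - 1)
  list(T2 = T2, Fstat = Fstat, df = df,
       pvalue = pf(Fstat, df[1], df[2], lower.tail = FALSE))
}

# Hypothetical helper: bootstrap calibration of T^2.
# Assumption: p-value computed as (number of bootstrap T^2 > observed + 1) / (R + 1).
boot_t2 <- function(x1, x2, R = 999) {
  obs <- hotelling_t2(x1, x2)$T2
  mc <- colMeans(rbind(x1, x2))                       # mean vector of the combined sample
  # shift each group so that both have mean mc, i.e. the null hypothesis holds
  y1 <- sweep(x1, 2, colMeans(x1) - mc)
  y2 <- sweep(x2, 2, colMeans(x2) - mc)
  n1 <- nrow(y1); n2 <- nrow(y2)
  tb <- replicate(R, {
    b1 <- y1[sample.int(n1, n1, replace = TRUE), , drop = FALSE]
    b2 <- y2[sample.int(n2, n2, replace = TRUE), , drop = FALSE]
    hotelling_t2(b1, b2)$T2
  })
  (sum(tb > obs) + 1) / (R + 1)
}

x1 <- as.matrix(iris[1:25, 1:4])
x2 <- as.matrix(iris[26:50, 1:4])
res <- hotelling_t2(x1, x2)   # classical F-based result
set.seed(1)
pb <- boot_t2(x1, x2, R = 199)  # bootstrap p-value
```

The resampling is done within each group after the shift, so the bootstrap samples keep each group's own covariance structure while sharing a common mean.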
The built-in command manova does exactly the same thing. Try it; the F test in its output is the one to look at. In addition, this command allows testing the equality of mean vectors for more than two groups. I noticed this command after I had written my function; nevertheless, as I mention in the introduction, this document has an educational character as well.
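A short check of this claim, using base R only. The choice of test = "Hotelling-Lawley" in summary() and the variable names are mine; the F statistic is recomputed by hand from the formulas in the Details so that the two values can be compared.

```r
x1 <- as.matrix(iris[1:25, 1:4])
x2 <- as.matrix(iris[26:50, 1:4])

# F statistic computed by hand from the pooled-covariance T^2
n1 <- nrow(x1); n2 <- nrow(x2); p <- ncol(x1)
d <- colMeans(x1) - colMeans(x2)
S <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / (n1 + n2 - 2)
T2 <- n1 * n2 / (n1 + n2) * sum(d * solve(S, d))
F_manual <- (n1 + n2 - p - 1) * T2 / ((n1 + n2 - 2) * p)

# Same test through manova: stack the two groups and use a group factor
y <- rbind(x1, x2)
g <- factor(rep(c("g1", "g2"), c(n1, n2)))
fit <- manova(y ~ g)
tab <- summary(fit, test = "Hotelling-Lawley")$stats
F_manova <- tab["g", "approx F"]

# TRUE if the two F statistics agree up to numerical tolerance
same_F <- isTRUE(all.equal(F_manual, F_manova, check.attributes = FALSE))
print(same_F)
```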
Value
A list including:
mesoi |
The two mean vectors. |
info |
The test statistic, the p-value, the critical value and the degrees of freedom of the F distribution (numerator and denominator). This is given if no bootstrap calibration is employed. |
pvalue |
The bootstrap p-value, if bootstrap calibration is employed. |
note |
A message informing the user that bootstrap calibration has been employed. |
runtime |
The runtime of the bootstrap calibration. |
Author(s)
Michail Tsagris.
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.
References
Everitt B. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.
Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.
Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.
See Also
james, maov, el.test2, eel.test2
Examples
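## bootstrap calibration with the default R = 999 re-samples
hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]) )
## classical p-value via the F distribution, no bootstrap
hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 1 )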