get_hotellings {disprofas} | R Documentation |
Hotelling's statistics (for two independent (small) samples)
Description
The function get_hotellings() estimates the parameters for Hotelling's
two-sample T^2 statistic for small samples.
Usage
get_hotellings(m1, m2, signif)
Arguments
m1 |
A matrix with the data of the reference group. |
m2 |
A matrix with the same dimensions as matrix m1, with the data of the test group. |
signif |
A positive numeric value between 0 and 1 that specifies the significance level. |
Details
The two-sample Hotelling's T^2 test statistic is given by
T^2 = \left( \bar{\bm{x}}_1 - \bar{\bm{x}}_2 \right)^{\top}
\left( \bm{S}_p \left( \frac{1}{n_1} + \frac{1}{n_2} \right) \right)^{-1}
\left( \bar{\bm{x}}_1 - \bar{\bm{x}}_2 \right) .
For large samples, this test statistic will be approximately chi-square
distributed with p degrees of freedom. However, this approximation
does not take into account the variation due to the variance-covariance
matrix estimation. Therefore, Hotelling's T^2 statistic
is transformed into an F-statistic using the expression
F = \frac{n_1 + n_2 - p - 1}{(n_1 + n_2 - 2) p} T^2 ,
where n_1 and n_2 are the sample sizes of the two samples being
compared and p is the number of variables.
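As a rough illustration of the two expressions above, the following sketch computes T^2 and the F-statistic by hand in base R. The simulated matrices m1 and m2 and the helper names d, S_p, T2 and F_obs are assumptions made for this example only. The pooled variance-covariance matrix is computed with the usual weighted-average definition, which this page does not spell out. This is not the internal code of get_hotellings().

```r
# Hypothetical data: two groups with 6 observations on p = 2 variables each
set.seed(1)
m1 <- matrix(rnorm(12, mean = 70, sd = 2), nrow = 6)
m2 <- matrix(rnorm(12, mean = 75, sd = 2), nrow = 6)

n1 <- nrow(m1)
n2 <- nrow(m2)
p <- ncol(m1)

# Difference between the two mean vectors
d <- colMeans(m1) - colMeans(m2)

# Pooled variance-covariance matrix (assumed standard definition)
S_p <- ((n1 - 1) * cov(m1) + (n2 - 1) * cov(m2)) / (n1 + n2 - 2)

# Hotelling's two-sample T^2 statistic
T2 <- as.numeric(t(d) %*% solve(S_p * (1 / n1 + 1 / n2)) %*% d)

# Transformation of T^2 into the F-statistic
F_obs <- (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p) * T2
```

The name F_obs is used instead of F because F is also R's shorthand for FALSE.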
Under the null hypothesis, H_0: \bm{\mu}_1 = \bm{\mu}_2, this F-statistic
will be F-distributed with p and n_1 + n_2 - p - 1 degrees of freedom.
H_0 is rejected at significance level \alpha if the F-value exceeds the
critical value from the F-table evaluated at \alpha, i.e.
F > F_{p, n_1 + n_2 - p - 1, \alpha}. The null hypothesis is satisfied
if, and only if, the population means are identical for all variables. The
alternative is that at least one pair of these means is different.
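The decision rule can be sketched with the base R functions qf() and pf(). The input values are copied from the expected results in the Examples section of this page. The variable names F_obs, F_crit, p_value and reject_H0 are chosen for this illustration and are not part of the package.

```r
# Values taken from the expected results in the Examples section
F_obs <- 147.1540  # observed F value
df1 <- 2           # number of variables p
df2 <- 9           # n_1 + n_2 - p - 1 = 6 + 6 - 2 - 1
alpha <- 0.1       # significance level

# Critical value: upper alpha quantile of the F distribution
F_crit <- qf(1 - alpha, df1, df2)

# p value: probability of an F value at least as large as the observed one
p_value <- pf(F_obs, df1, df2, lower.tail = FALSE)

# H_0 is rejected if the observed F value exceeds the critical value
reject_H0 <- F_obs > F_crit
```

With these inputs F_crit is about 3.006 and p_value is about 1.3e-07, in line with the F.crit and p.F entries listed in the Examples section, so H_0 is rejected.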
The following assumptions concerning the data are made:
The data from population i is a sample from a population with mean vector \bm{\mu}_i. In other words, it is assumed that there are no sub-populations.
The data from both populations have common variance-covariance matrix \bm{\Sigma}.
The subjects from both populations are independently sampled.
Both populations are normally distributed.
Value
A list with the following elements is returned:
Parameters |
Parameters determined for the estimation of
Hotelling's T^2. |
S.pool |
Pooled variance-covariance matrix. |
covs |
A list with the elements |
means |
A list with the elements |
The Parameters
element contains the following information:
DM |
Mahalanobis distance of the samples. |
df1 |
Degrees of freedom (number of variables or time points). |
df2 |
Degrees of freedom (number of rows of both matrices combined - number of variables - 1). |
alpha |
Provided significance level. |
K |
Scaling factor for F. |
k |
Scaling factor for the squared Mahalanobis distance to obtain
the T^2 statistic. |
T2 |
Hotelling's T^2 statistic. |
F |
Observed F value. |
F.crit |
Critical F value. |
p.F |
p value of the observed F value. |
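The relation between DM, k, K, T2 and F can be checked numerically against the expected results in the Examples section. The sketch below assumes that k = n_1 n_2 / (n_1 + n_2), which follows from the T^2 formula in the Details section when the Mahalanobis distance is based on the pooled variance-covariance matrix, and that K is k multiplied by the F conversion factor. Both assumptions reproduce the listed values k = 3 and K = 1.35, but they are inferred from the formulas and numbers on this page, not from the package source.

```r
# Sample sizes and number of variables from the Examples section
n1 <- 6
n2 <- 6
p <- 2

# Mahalanobis distance reported in the expected results
DM <- 10.44045

# Assumed scaling factor from squared Mahalanobis distance to T^2
k <- n1 * n2 / (n1 + n2)

# Assumed scaling factor from squared Mahalanobis distance to F
K <- k * (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p)

T2 <- k * DM^2     # approximately 327.009
F_obs <- K * DM^2  # approximately 147.154
```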
References
Hotelling, H. (1931) The generalization of Student's ratio. Ann Math Stat. 2(3): 360-378.
Hotelling, H. (1947) Multivariate quality control illustrated by air testing of sample bombsights. In: Eisenhart, C., Hastay, M.W., and Wallis, W.A., Eds., Techniques of Statistical Analysis, McGraw Hill, New York, 111-184.
See Also
Examples
# Dissolution data of one reference batch and one test batch of n = 6
# tablets each:
str(dip1)
# 'data.frame': 12 obs. of 10 variables:
# $ type : Factor w/ 2 levels "R","T": 1 1 1 1 1 1 2 2 2 2 ...
# $ tablet: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6 1 2 3 4 ...
# $ t.5 : num 42.1 44.2 45.6 48.5 50.5 ...
# $ t.10 : num 59.9 60.2 55.8 60.4 61.8 ...
# $ t.15 : num 65.6 67.2 65.6 66.5 69.1 ...
# $ t.20 : num 71.8 70.8 70.5 73.1 72.8 ...
# $ t.30 : num 77.8 76.1 76.9 78.5 79 ...
# $ t.60 : num 85.7 83.3 83.9 85 86.9 ...
# $ t.90 : num 93.1 88 86.8 88 89.7 ...
# $ t.120 : num 94.2 89.6 90.1 93.4 90.8 ...
# Estimation of the parameters for Hotelling's two-sample T2 statistic
# (for small samples)
res <-
  get_hotellings(m1 = as.matrix(dip1[dip1$type == "R", c("t.15", "t.90")]),
                 m2 = as.matrix(dip1[dip1$type == "T", c("t.15", "t.90")]),
                 signif = 0.1)
res$S.pool
res$Parameters
# Expected results in res$S.pool
#          t.15     t.90
# t.15 3.395808 1.029870
# t.90 1.029870 4.434833
# Expected results in res$Parameters
#           DM          df1          df2       signif            K
# 1.044045e+01 2.000000e+00 9.000000e+00 1.000000e-01 1.350000e+00
#            k           T2            F       F.crit          p.F
# 3.000000e+00 3.270089e+02 1.471540e+02 3.006452e+00 1.335407e-07