jt.test {kSamples}    R Documentation

Jonckheere-Terpstra k-Sample Test for Increasing Alternatives

Description

The Jonckheere-Terpstra k-sample test statistic JT is defined as $JT = \sum_{i<j} W_{ij}$, where $W_{ij}$ is the Mann-Whitney statistic comparing samples $i$ and $j$, indexed in the order of the stipulated increasing alternative. There may be ties in the pooled samples.
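The definition above can be sketched in base R. This is only an illustration, not the package's implementation: the helper names `mw_count` and `jt_stat` are hypothetical, and counting a tied pair as 1/2 is an assumed convention that may differ from how `jt.test` treats ties.

```r
# Mann-Whitney count for samples a (earlier) and b (later) in the stipulated
# order: number of pairs with a < b; tied pairs count 1/2 (assumed convention).
mw_count <- function(a, b) {
  d <- outer(a, b, "-")
  sum(d < 0) + 0.5 * sum(d == 0)
}

# JT = sum of W_ij over all pairs i < j; `samples` is a list of numeric vectors
# given in the order of the increasing alternative.
jt_stat <- function(samples) {
  k <- length(samples)
  total <- 0
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      total <- total + mw_count(samples[[i]], samples[[j]])
    }
  }
  total
}

x1 <- c(1, 2)
x2 <- c(1.5, 2.1)
x3 <- c(1.9, 3.1)
jt_obs <- jt_stat(list(x1, x2, x3))
```

Large values of `jt_obs` relative to its null distribution point towards the increasing alternative, since each $W_{ij}$ grows when sample $j$ tends to exceed sample $i$.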

Usage

jt.test(..., data = NULL, method = c("asymptotic", "simulated", "exact"),
        dist = FALSE, Nsim = 10000)

Arguments

...

Either several sample vectors, say $x_1, \ldots, x_k$, with $x_i$ containing $n_i$ sample values. $n_i > 4$ is recommended for reasonable asymptotic $P$-value calculation. The pooled sample size is denoted by $N = n_1 + \ldots + n_k$. The order of samples should be as stipulated under the alternative,

or a list of such sample vectors,

or a formula y ~ g, where y contains the pooled sample values and g (same length as y) is a factor whose levels identify the samples to which the elements of y belong. The order of the factor levels must reflect the order stipulated under the alternative.

data

= an optional data frame providing the variables in formula y ~ g.

method

= c("asymptotic","simulated","exact"), where

"asymptotic" uses only an asymptotic normal $P$-value approximation.

"simulated" uses Nsim simulated $JT$ statistics based on random splits of the pooled samples into samples of sizes $n_1, \ldots, n_k$, to estimate the $P$-value.

"exact" uses full enumeration of all sample splits with resulting $JT$ statistics to obtain the exact $P$-value. It is used only when Nsim is at least as large as the number

$$ncomb = \frac{N!}{n_1! \ldots n_k!}$$

of full enumerations. Otherwise, method reverts to "simulated" using the given Nsim. It also reverts to "simulated" when $ncomb > 1e8$ and dist = TRUE.

dist

= FALSE (default) or TRUE. If TRUE, the simulated or fully enumerated distribution vector null.dist is returned for the JT test statistic. Otherwise, NULL is returned. When dist = TRUE then Nsim <- min(Nsim, 1e8), to limit object size.

Nsim

= 10000 (default), number of simulation sample splits to use. It is only used when method = "simulated", or when method = "exact" reverts to method = "simulated", as previously explained.
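The reversion rule for method = "exact" can be checked before calling the function. The sketch below restates the documented rule in base R; `n_splits` and `effective_method` are hypothetical helper names, not part of kSamples, and the code only mirrors the description above rather than the package source.

```r
# Number of distinct splits of the pooled sample into groups of sizes ns,
# i.e. N! / (n_1! ... n_k!), computed on the log scale to avoid overflow.
n_splits <- function(ns) {
  round(exp(lfactorial(sum(ns)) - sum(lfactorial(ns))))
}

# Method that would effectively be used according to the documented rule:
# "exact" needs Nsim >= ncomb, and additionally ncomb <= 1e8 when dist = TRUE.
effective_method <- function(method, ns, Nsim = 10000, dist = FALSE) {
  if (method != "exact") return(method)
  nc <- n_splits(ns)
  if (Nsim < nc || (dist && nc > 1e8)) "simulated" else "exact"
}

n_splits(c(2, 2, 2))                          # three samples of size 2
effective_method("exact", c(2, 2, 2), Nsim = 90)
effective_method("exact", c(5, 5, 5))         # default Nsim = 10000
```

For three samples of size 2 there are 6!/(2! 2! 2!) = 90 splits, which is why Nsim = 90 suffices for full enumeration with samples of that size. With three samples of size 5 the count is already 756756, so the default Nsim = 10000 leads to the simulated method.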

Details

The JT statistic is used to test the hypothesis that the samples all come from the same but unspecified continuous distribution function $F(x)$. It is specifically aimed at alternatives where the sampled distributions are stochastically increasing.

NA values are removed and the user is alerted with the total NA count. It is up to the user to judge whether the removal of NA's is appropriate.

The continuity assumption can be dispensed with, if we deal with independent random samples, or if randomization was used in allocating subjects to samples or treatments, and if we view the simulated or exact $P$-values conditionally, given the tie pattern in the pooled samples. Of course, under such randomization any conclusions are valid only with respect to the group of subjects that were randomly allocated to their respective samples. The asymptotic $P$-value calculation is valid provided all sample sizes become large.
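The conditional view of the exact and simulated $P$-values can be illustrated with a small base R sketch that re-splits the pooled sample. It is a minimal illustration under stated assumptions, not the package's algorithm: the helper names are hypothetical, tied pairs count 1/2, and the $P$-value is taken as the proportion of splits whose JT is at least the observed one.

```r
# Mann-Whitney count (pairs with a < b, ties counted 1/2) and JT statistic.
mw_count <- function(a, b) {
  d <- outer(a, b, "-")
  sum(d < 0) + 0.5 * sum(d == 0)
}
jt_stat <- function(samples) {
  k <- length(samples)
  total <- 0
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) total <- total + mw_count(samples[[i]], samples[[j]])
  }
  total
}

# Recursively enumerate every split of `pool` into groups of sizes `ns`
# and return the JT statistic of each split.
enumerate_jt <- function(pool, ns, assigned = list()) {
  if (length(ns) == 1) return(jt_stat(c(assigned, list(pool))))
  idx <- combn(length(pool), ns[1])   # each column: indices for the next group
  unlist(lapply(seq_len(ncol(idx)), function(col) {
    take <- idx[, col]
    enumerate_jt(pool[-take], ns[-1], c(assigned, list(pool[take])))
  }))
}

samples <- list(c(1, 2), c(1.5, 2.1), c(1.9, 3.1))
ns <- lengths(samples)
pool <- unlist(samples)
jt_obs <- jt_stat(samples)

# Exact: all 90 splits of 6 values into groups of sizes 2, 2, 2.
null_dist <- enumerate_jt(pool, ns)
p_exact <- mean(null_dist >= jt_obs)

# Simulated: random splits of the same pooled values.
set.seed(1)
sim <- replicate(2000, jt_stat(split(sample(pool), rep(seq_along(ns), ns))))
p_sim <- mean(sim >= jt_obs)
```

Because only the pooled values are permuted, any ties in the data reappear in every split, which is the sense in which the resulting $P$-values are conditional on the tie pattern.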

Value

A list of class kSamples with components

test.name

"Jonckheere-Terpstra"

k

number of samples being compared

ns

vector $(n_1, \ldots, n_k)$ of the $k$ sample sizes

N

size of the pooled sample $= n_1 + \ldots + n_k$

n.ties

number of ties in the pooled sample

qn

vector of length 4 (or 5) containing the observed $JT$, its mean and standard deviation, and its asymptotic $P$-value (and its simulated or exact $P$-value)

warning

logical indicator, warning = TRUE when at least one $n_i < 5$

null.dist

simulated or enumerated null distribution of the test statistic. It is NULL when dist = FALSE or when method = "asymptotic".

method

the method used.

Nsim

the number of simulations used.
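For orientation on the mean, standard deviation and asymptotic $P$-value, the following base R sketch uses the standard no-ties null moments of JT, $E(JT) = (N^2 - \sum n_i^2)/4$ and $Var(JT) = [N^2(2N+3) - \sum n_i^2(2n_i+3)]/72$. It assumes no ties and an upper-tail normal approximation without continuity correction; the values the package reports may differ, in particular when ties are present. The function names are hypothetical.

```r
# Null mean and standard deviation of JT, assuming no ties in the pooled sample.
jt_null_moments <- function(ns) {
  N <- sum(ns)
  mu <- (N^2 - sum(ns^2)) / 4
  sigma2 <- (N^2 * (2 * N + 3) - sum(ns^2 * (2 * ns + 3))) / 72
  c(mean = mu, sd = sqrt(sigma2))
}

# Upper-tail normal approximation (no continuity correction; an assumption).
jt_asymptotic_p <- function(jt, ns) {
  m <- jt_null_moments(ns)
  unname(pnorm((jt - m["mean"]) / m["sd"], lower.tail = FALSE))
}

# JT = 9 arises for three samples of size 2 when 3 of the 4 pairs in each
# pairwise comparison are in increasing order.
m <- jt_null_moments(c(2, 2, 2))
p_asym <- jt_asymptotic_p(9, c(2, 2, 2))
```

Samples of size 2 are far below the recommended $n_i > 4$, so the approximation here is shown only to make the arithmetic easy to follow.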

References

Harding, E.F. (1984), An Efficient, Minimal-storage Procedure for Calculating the Mann-Whitney U, Generalized U and Similar Distributions, Appl. Statist. 33 No. 1, 1-6.

Jonckheere, A.R. (1954), A Distribution Free k-sample Test against Ordered Alternatives, Biometrika, 41, 133-145.

Lehmann, E.L. (2006), Nonparametrics, Statistical Methods Based on Ranks, Revised First Edition, Springer Verlag.

Terpstra, T.J. (1952), The Asymptotic Normality and Consistency of Kendall's Test against Trend, when Ties are Present in One Ranking, Indagationes Math. 14, 327-333.

Examples

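# Three small samples, in the order of the stipulated increasing alternative
x1 <- c(1, 2)
x2 <- c(1.5, 2.1)
x3 <- c(1.9, 3.1)
# Pooled values and group factor for the formula interface
yy <- c(x1, x2, x3)
gg <- as.factor(c(1, 1, 2, 2, 3, 3))
jt.test(x1, x2, x3, method = "exact", Nsim = 90)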
# or 
# jt.test(list(x1, x2, x3), method = "exact", Nsim = 90)
# or
# jt.test(yy ~ gg, method = "exact", Nsim = 90)
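A further sketch shows the formula interface with the data argument and how the documented components of the result might be inspected. It assumes the kSamples package is installed and uses only arguments and component names listed on this page; the data frame `df` and the variable `out` are names introduced here for illustration.

```r
# Hypothetical data frame holding the pooled values and the ordered groups.
df <- data.frame(
  y = c(1, 2, 1.5, 2.1, 1.9, 3.1),
  g = factor(c(1, 1, 2, 2, 3, 3))   # level order = stipulated alternative
)

if (requireNamespace("kSamples", quietly = TRUE)) {
  out <- kSamples::jt.test(y ~ g, data = df, method = "exact",
                           Nsim = 90, dist = TRUE)
  out$ns          # sample sizes
  out$qn          # observed JT, mean, sd, asymptotic and exact P-values
  out$method      # the method actually used
  out$warning     # TRUE here, since every sample has fewer than 5 values
  length(out$null.dist)
}
```

Checking `out$method` is a quick way to see whether the request for "exact" was honoured or reverted to "simulated".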

[Package kSamples version 1.2-10 Index]