AFStest {HDLSSkST}    R Documentation

k-Sample AFS Test of Equal Distributions

Description

Performs the distribution-free exact k-sample test for equality of multivariate distributions in the HDLSS (high dimension, low sample size) regime. This is an aggregate test of the two-sample versions of the FS test over the \frac{k(k-1)}{2} pairwise two-sample comparisons, and the test statistic is the minimum of these two-sample FS test statistics. Holm's step-down procedure (1979) and the Benjamini-Hochberg procedure (1995) are applied for multiple testing.
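To illustrate the aggregation over pairs described above, the following base R sketch enumerates the k(k-1)/2 comparisons for k = 4 and applies the Holm and Benjamini-Hochberg adjustments with p.adjust from the stats package. The p-values are made-up placeholders, not output of the FS test, and the package's internal implementation may differ.

```r
# Minimal sketch in base R; the p-values below are hypothetical placeholders.
k <- 4
pairs <- t(combn(k, 2))            # one row per two-sample comparison
colnames(pairs) <- c("Population.1", "Population.2")
n_pairs <- nrow(pairs)             # equals k * (k - 1) / 2

p_raw <- c(0.001, 0.020, 0.004, 0.300, 0.045, 0.010)  # assumed values

# Holm's step-down procedure (controls the family-wise error rate)
p_holm <- p.adjust(p_raw, method = "holm")
# Benjamini-Hochberg procedure (controls the false discovery rate)
p_bh <- p.adjust(p_raw, method = "BH")

alpha <- 0.05
data.frame(pairs, holm_reject = p_holm <= alpha, bh_reject = p_bh <= alpha)
```

The Benjamini-Hochberg adjusted p-values are never larger than the Holm adjusted ones, so the former procedure rejects at least as many pairs at the same level.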

Usage

AFStest(M, sizes, randomization = TRUE, clust_alg = "knwClustNo", kmax = 4,
multTest = "Holm", s_psi = 1, s_h = 1, lb = 1, n_sts = 1000, alpha = 0.05)

Arguments

M

n\times d observation matrix of the pooled sample; the observations should be grouped by their respective classes

sizes

vector of sample sizes

randomization

logical; if TRUE (default), the randomized test is performed; if FALSE, the non-randomized test

clust_alg

"knwClustNo" (default) or "estclustNo"; modified K-means algorithm used for clustering

kmax

maximum number of clusters considered when estimating the total number of clusters in a two-sample comparison, default: 4

multTest

"Holm" (default) or "BenHoch"; multiple testing procedure to apply

s_psi

function required for clustering, 1 for t^2, 2 for 1-\exp(-t), 3 for 1-\exp(-t^2), 4 for \log(1+t), 5 for t

s_h

function required for clustering, 1 for \sqrt t, 2 for t

lb

each observation is partitioned into smaller vectors of equal length lb, default: 1

n_sts

number of simulations of the test statistic, default: 1000

alpha

numeric, significance level \alpha, default: 0.05
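The numeric codes documented for s_psi and s_h can be read as a lookup of functions, as in the sketch below. This is only an illustration: the names psi_choices and h_choices are hypothetical and not part of the package, which selects these functions internally.

```r
# Hypothetical lookup tables mirroring the codes documented for s_psi and s_h.
psi_choices <- list(
  function(t) t^2,             # s_psi = 1
  function(t) 1 - exp(-t),     # s_psi = 2
  function(t) 1 - exp(-t^2),   # s_psi = 3
  function(t) log(1 + t),      # s_psi = 4
  function(t) t                # s_psi = 5
)
h_choices <- list(
  function(t) sqrt(t),         # s_h = 1
  function(t) t                # s_h = 2
)

# Evaluate the default choices (s_psi = 1, s_h = 1) at t = 4
psi_choices[[1]](4)
h_choices[[1]](4)
```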

Value

AFStest returns a list containing the following items:

AFSStat

value of the observed test statistic

AFCutoff

cut-off of the test

randomGamma

randomized coefficient of the test

decisionAFS

1 if the null hypothesis is rejected, 0 if the test fails to reject the null hypothesis

multipleTest

indicates which pairs of populations differ according to the multiple testing procedure
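As a sketch of how the multipleTest item might be inspected, the snippet below builds a small data frame with the same column names as the output shown in the Examples section and extracts the pairs flagged as different. The values are invented for illustration, and mt is a stand-in for results$multipleTest.

```r
# Stand-in for results$multipleTest; values are invented for illustration.
mt <- data.frame(
  Population.1 = c(1, 1, 2),
  Population.2 = c(2, 3, 3),
  rejected     = c(TRUE, FALSE, TRUE),
  pvalues      = c(0.001, 0.200, 0.010)
)

# Pairs of populations declared different by the multiple testing procedure
different <- mt[mt$rejected, c("Population.1", "Population.2")]
different
```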

Author(s)

Biplab Paul, Shyamal K. De and Anil K. Ghosh

Maintainer: Biplab Paul <paul.biplab497@gmail.com>

References

Biplab Paul, Shyamal K De and Anil K Ghosh (2021). Some clustering based exact distribution-free k-sample tests applicable to high dimension, low sample size data, Journal of Multivariate Analysis, doi:10.1016/j.jmva.2021.104897.

Cyrus R Mehta and Nitin R Patel (1983). A network algorithm for performing Fisher's exact test in r x c contingency tables, Journal of the American Statistical Association, 78(382):427-434, doi:10.2307/2288652.

Sture Holm (1979). A simple sequentially rejective multiple test procedure, Scandinavian Journal of Statistics, 65-70, doi:10.2307/4615733.

Yoav Benjamini and Yosef Hochberg (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society: Series B (Methodological), 57(1):289-300, doi:10.2307/2346101.

Examples

  library(HDLSSkST)
  # multivariate normal distribution:
  # generate data with dimension d = 500
  set.seed(151)
  n1=n2=n3=n4=10
  d = 500
  I1 <- matrix(rnorm(n1*d,mean=0,sd=1),n1,d)
  I2 <- matrix(rnorm(n2*d,mean=0.5,sd=1),n2,d) 
  I3 <- matrix(rnorm(n3*d,mean=1,sd=1),n3,d) 
  I4 <- matrix(rnorm(n4*d,mean=1.5,sd=1),n4,d) 
  X <- as.matrix(rbind(I1,I2,I3,I4)) 
  #AFS test:
  results <- AFStest(M=X, sizes = c(n1,n2,n3,n4))
  
   ## outputs:
   results$AFSStat
   #[1] 5.412544e-06

   results$AFCutoff
   #[1] 0.0109604

   results$randomGamma
   #[1] 0

   results$decisionAFS
   #[1] 1

   results$multipleTest
   #  Population.1 Population.2 rejected pvalues
   #1            1            2     TRUE       0
   #2            1            3     TRUE       0
   #3            1            4     TRUE       0
   #4            2            3     TRUE       0
   #5            2            4     TRUE       0
   #6            3            4     TRUE       0


[Package HDLSSkST version 2.1.0 Index]