FStest {HDLSSkST} | R Documentation |
k-Sample FS Test of Equal Distributions
Description
Performs the distribution-free exact k-sample test for equality of multivariate distributions in the high dimension, low sample size (HDLSS) regime.
Usage
FStest(M, labels, sizes, n_clust, randomization = TRUE, clust_alg = "knwClustNo",
kmax = 2 * n_clust, s_psi = 1, s_h = 1, lb = 1, n_sts = 1000, alpha = 0.05)
Arguments
M |
|
labels |
length |
sizes |
vector of sample sizes |
n_clust |
number of populations |
randomization |
logical; if TRUE (default), the randomized test is performed; if FALSE, the non-randomized test |
clust_alg |
|
kmax |
maximum number of clusters considered when estimating the total number of clusters in the whole set of observations, default: 2*n_clust |
s_psi |
function required for clustering, 1 for |
s_h |
function required for clustering, 1 for |
lb |
each observation is partitioned into a number of smaller vectors of the same length |
n_sts |
number of simulations of the test statistic, default: 1000 |
alpha |
numeric, significance level of the test, default: 0.05 |
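The Examples section passes the pooled samples as one matrix, with one label per row and one sample size per group. A minimal sketch of assembling arguments in that layout is given below. The helper make_fs_inputs is hypothetical and not part of HDLSSkST; it only mirrors what the Examples do by hand, assuming every group is a matrix with observations in rows.

```r
# Hypothetical helper (not part of HDLSSkST): stack per-group matrices and
# derive labels and sizes in the layout used in the Examples section.
make_fs_inputs <- function(groups) {
  stopifnot(is.list(groups), length(groups) >= 2)
  d <- vapply(groups, ncol, integer(1))
  stopifnot(length(unique(d)) == 1)        # all groups share one dimension
  sizes <- vapply(groups, nrow, integer(1))
  list(
    M = do.call(rbind, groups),            # pooled data, one observation per row
    labels = rep(seq_along(groups) - 1, times = sizes),  # 0, 1, ..., k - 1
    sizes = sizes,
    n_clust = length(groups)
  )
}

set.seed(1)
inp <- make_fs_inputs(list(
  matrix(rnorm(3 * 5), 3, 5),
  matrix(rnorm(4 * 5, mean = 1), 4, 5)
))
dim(inp$M)       # rows = total sample size, columns = dimension
inp$labels       # one label per row of inp$M
```

The resulting list elements could then be passed as the M, labels, sizes and n_clust arguments.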
Value
FStest returns a list containing the following items:
estClustLabel |
a vector of length |
obsCtyTab |
observed contingency table |
ObservedProb |
value of the observed test statistic |
FCutoff |
cut-off of the test |
randomGamma |
randomized coefficient of the test |
estPvalue |
estimated p-value of the test |
decisionF |
if returns |
estClustNo |
total number of estimated classes |
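To see how ObservedProb, FCutoff and randomGamma could fit together, the sketch below shows a generic randomized decision rule. It is illustrative only and is not taken from the package source: the function randomized_decision is hypothetical, and it assumes that small values of the statistic count as evidence against the null hypothesis and that a return value of 1 means rejection.

```r
# Illustrative only: a generic randomized decision rule, not HDLSSkST source.
# Assumption: small values of the statistic are evidence against H0.
randomized_decision <- function(stat, cutoff, gamma, u = runif(1)) {
  if (stat < cutoff) return(1L)                 # strictly inside the rejection region
  if (stat == cutoff && u < gamma) return(1L)   # on the boundary: reject with probability gamma
  0L                                            # otherwise do not reject
}

# Values reported in the Examples section of this page
randomized_decision(2.125236e-22, 1.115958e-07, gamma = 0)
```

With the values from the Examples section the statistic lies below the cut-off, so the sketch returns 1 regardless of the random draw.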
Author(s)
Biplab Paul, Shyamal K. De and Anil K. Ghosh
Maintainer: Biplab Paul <paul.biplab497@gmail.com>
References
Biplab Paul, Shyamal K De and Anil K Ghosh (2021). Some clustering based exact distribution-free k-sample tests applicable to high dimension, low sample size data, Journal of Multivariate Analysis, doi:10.1016/j.jmva.2021.104897.
Cyrus R Mehta and Nitin R Patel (1983). A network algorithm for performing Fisher's exact test in r x c contingency tables, Journal of the American Statistical Association, 78(382):427-434, doi:10.2307/2288652.
Examples
# multivariate normal distribution:
# generate data with dimension d = 500
set.seed(151)
n1=n2=n3=n4=10
k = 4
d = 500
I1 <- matrix(rnorm(n1*d,mean=0,sd=1),n1,d)
I2 <- matrix(rnorm(n2*d,mean=0.5,sd=1),n2,d)
I3 <- matrix(rnorm(n3*d,mean=1,sd=1),n3,d)
I4 <- matrix(rnorm(n4*d,mean=1.5,sd=1),n4,d)
levels <- c(rep(0,n1), rep(1,n2), rep(2,n3), rep(3,n4))
X <- as.matrix(rbind(I1,I2,I3,I4))
#FS test:
results <- FStest(M=X, labels=levels, sizes = c(n1,n2,n3,n4), n_clust = k)
## outputs:
results$estClustLabel
#[1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
results$obsCtyTab
#     [,1] [,2] [,3] [,4]
#[1,]   10    0    0    0
#[2,]    0   10    0    0
#[3,]    0    0   10    0
#[4,]    0    0    0   10
results$ObservedProb
#[1] 2.125236e-22
results$FCutoff
#[1] 1.115958e-07
results$randomGamma
#[1] 0
results$estPvalue
#[1] 0
results$decisionF
#[1] 1
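As a cross-check on the output above, the printed ObservedProb appears to coincide with the probability of the observed contingency table under the generalized hypergeometric distribution with fixed margins, the quantity used in Fisher's exact test for r x c tables. The sketch below recomputes it in base R; the function table_prob is a hypothetical helper written for this illustration and is not part of HDLSSkST.

```r
# Probability of an r x c table under the generalized hypergeometric
# distribution with fixed row and column totals, computed on the log scale:
#   prod(row totals!) * prod(column totals!) / (n! * prod(cell counts!))
table_prob <- function(tab) {
  n <- sum(tab)
  log_p <- sum(lfactorial(rowSums(tab))) + sum(lfactorial(colSums(tab))) -
    lfactorial(n) - sum(lfactorial(tab))
  exp(log_p)
}

tab <- diag(10, 4)   # the 4 x 4 table printed by results$obsCtyTab
table_prob(tab)      # compare with results$ObservedProb
```

For this table the expression reduces to (10!)^4 / 40!, which agrees with the value 2.125236e-22 shown for results$ObservedProb to the printed precision.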