DiProPerm {diproperm}    R Documentation

Conducts a DiProPerm test

Description

This function conducts a Direction-Projection-Permutation (DiProPerm) test, a two-sample hypothesis test for comparing two high-dimensional distributions. The test is exact: the type I error is controlled at the nominal level for any sample size. For more details, see Wei et al. (2016).
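
The three steps named in the test (direction, projection, permutation) can be sketched in base R. This is an illustration on simulated data using the mean-difference direction, not the package's implementation; the helper `md_stat` and all data below are hypothetical.

```r
# Minimal base-R sketch of a DiProPerm-style test with the mean-difference
# (MD) direction. Illustration only; not the diproperm implementation.
set.seed(1)
n <- 40
p <- 100
X <- matrix(rnorm(n * p), nrow = n)      # simulated n x p data matrix
y <- rep(c(-1, 1), each = n / 2)         # class labels coded -1 and 1
X[y == 1, ] <- X[y == 1, ] + 0.5         # shift class 1 so the classes differ

# Hypothetical helper: direction, projection, and univariate statistic
md_stat <- function(X, y) {
  w <- colMeans(X[y == 1, , drop = FALSE]) - colMeans(X[y == -1, , drop = FALSE])
  w <- w / sqrt(sum(w^2))                # unit-length direction
  xw <- as.vector(X %*% w)               # projection scores
  mean(xw[y == 1]) - mean(xw[y == -1])   # mean difference of the scores
}

obs <- md_stat(X, y)                         # observed statistic
B <- 200
perm <- replicate(B, md_stat(X, sample(y)))  # direction is re-fit for each permutation
pvalue <- mean(perm >= obs)                  # proportion of permuted statistics >= observed
```

The direction is recomputed for every permuted labelling, so the permuted statistics reflect how much separation the classifier finds by chance alone.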

Usage

DiProPerm(
  X,
  y,
  B = 1000,
  classifier = "dwd",
  univ.stat = "md",
  balance = TRUE,
  alpha = 0.05,
  cores = 2
)

Arguments

X

An n x p data matrix, with one row per observation.

y

A vector of n binary class labels, coded as -1 and 1.

B

The number of permutations for the DiProPerm test. The default is 1000.

classifier

A string designating the binary linear classifier. The default, classifier="dwd", is distance weighted discrimination (DWD). It fits a generalized DWD model using the genDWD function in the DWDLargeR package. The penalty parameter, C, passed to genDWD is calculated using the penaltyParameter function in DWDLargeR. The genDWD and penaltyParameter functions have several arguments, which are set to recommended default values. More details on the algorithm used to calculate the DWD solution can be found in Lam et al. (2018). The other options are "md", the mean difference direction, and "svm", a support vector machine. The "svm" option uses the default implementation from svm.

univ.stat

A string indicating the univariate statistic used for the projection step. univ.stat="md", the mean difference, is the default.

balance

A logical indicator for whether a balanced permutation design should be implemented. The default is TRUE.

alpha

A numeric value between 0 and 1 indicating the level of significance. The default is 0.05.

cores

An integer indicating the number of cores to be used for parallel processing. The default is 2. Note that parallel processing is currently available only on macOS and Ubuntu. Windows users will default to using 1 core.
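
The argument requirements above can be sketched as a small input-preparation step. The `group` variable, its level names, and the simulated matrix are hypothetical; the core-count logic only mirrors the note that Windows falls back to a single core.

```r
# Hypothetical two-level grouping variable, recoded to the -1/1 labels
group <- factor(c("case", "control", "control", "case"))
y <- ifelse(group == "case", 1, -1)

# X must be an n x p matrix with one row per label (simulated here)
X <- matrix(rnorm(length(y) * 5), nrow = length(y))
stopifnot(nrow(X) == length(y), all(y %in% c(-1, 1)))

# Request 2 cores except on Windows, where only 1 is used
cores <- if (.Platform$OS.type == "windows") 1 else 2
```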

Value

A list containing:

X

The observed n x p data matrix.

y

The observed vector of n binary class labels -1 and 1.

obs_teststat

The observed univariate test statistic.

xw

Projection scores used to compute the specified univariate statistic.

w

The loadings (direction vector) of the binary classifier.

Z

The Z score of the observed test statistic.

cutoff_value

The cutoff value of the test statistic needed to achieve significance at level alpha.

pvalue

The p-value from the permutation test.

perm_dist

A list containing the permuted projection scores and permuted class labels for each permutation.

perm_stats

A vector of length B containing the univariate test statistic from each permutation.
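
As a rough guide to how the summary values relate to the permuted statistics, the sketch below computes a Z score, cutoff, and p-value from a simulated permutation distribution. The formulas are common choices for permutation tests and are assumptions here; the package may compute these quantities differently. The values of `perm_stats` and `obs_teststat` are simulated stand-ins, not real output.

```r
set.seed(2)
perm_stats <- rnorm(1000)   # simulated stand-in for the B permuted statistics
obs_teststat <- 2.5         # simulated stand-in for the observed statistic
alpha <- 0.05

# Z score: observed statistic standardised by the permutation distribution (assumed formula)
Z <- (obs_teststat - mean(perm_stats)) / sd(perm_stats)

# Cutoff: upper (1 - alpha) quantile of the permuted statistics (assumed formula)
cutoff_value <- unname(quantile(perm_stats, 1 - alpha))

# p-value: proportion of permuted statistics at least as large as the observed one
pvalue <- mean(perm_stats >= obs_teststat)
```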

Author(s)

Andrew G. Allmon, J.S. Marron, Michael G. Hudgens

References

Lam, X. Y., Marron, J. S., Sun, D., & Toh, K.-C. (2018). Fast Algorithms for Large-Scale Generalized Distance Weighted Discrimination. Journal of Computational and Graphical Statistics, 27(2), 368–379. doi: 10.1080/10618600.2017.1366915

Wei, S., Lee, C., Wichers, L., & Marron, J. S. (2016). Direction-Projection-Permutation for High-Dimensional Hypothesis Tests. Journal of Computational and Graphical Statistics, 25(2), 549–569. doi: 10.1080/10618600.2015.1027773

Examples

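library(diproperm)
data(mushrooms)
# Transpose so that rows are observations (n x p)
X <- Matrix::t(mushrooms$X)
y <- mushrooms$y
# A small B keeps the example fast; use more permutations in practice
dpp <- DiProPerm(X = X, y = y, B = 10)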
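
The returned list can then be inspected by element name. The helper below is a hypothetical convenience, not part of the package; it relies only on the element names documented under Value, and is demonstrated on a stand-in list with made-up values so that it runs without fitting a model.

```r
# Hypothetical helper: collects the headline results from a DiProPerm() return value
summarise_dpp <- function(dpp, alpha = 0.05) {
  data.frame(
    statistic = dpp$obs_teststat,
    Z         = dpp$Z,
    pvalue    = dpp$pvalue,
    reject    = dpp$pvalue < alpha   # TRUE if significant at level alpha
  )
}

# Stand-in list using the documented element names (values are made up)
mock <- list(obs_teststat = 3.1, Z = 4.2, pvalue = 0.01)
out <- summarise_dpp(mock)
print(out)
```

With a real fit, `summarise_dpp(dpp)` would be called on the object created above.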




[Package diproperm version 0.2.0 Index]