R: Inference for four multivariate coefficients of variation

GFDmcv {GFDmcv}

R Documentation

Inference for four multivariate coefficients of variation

Description

The function GFDmcv() calculates the Wald-type statistic for global null hypotheses and max-type statistics for multiple local null hypotheses, both formulated in terms of the four variants of the multivariate coefficient of variation (MCV). The p-values are obtained by a \chi^2-approximation, a pooled bootstrap strategy and, for the Wald-type statistic only, a pooled permutation approach.

Usage

GFDmcv(
  x,
  h_mct,
  h_wald,
  alpha = 0.05,
  n_perm = 1000,
  n_boot = 1000,
  parallel = FALSE,
  n_cores = NULL
)

Arguments

`x`	a list of length `k` whose elements are `n_i\times d` matrices of data, `i=1,\dots,k`.
`h_mct`	an `r\times k` contrast matrix `\mathbf{H}` of full row rank for the multiple contrast tests. Its columns must follow the order of the elements of the list `x`.
`h_wald`	a `q\times k` contrast matrix `\mathbf{H}` of full row rank for the Wald-type tests. Its columns must follow the order of the elements of the list `x`.
`alpha`	the significance level (so `1-alpha` is the confidence level).
`n_perm`	the number of permutation replicates.
`n_boot`	the number of bootstrap replicates.
`parallel`	a logical indicating whether to use parallelization.
`n_cores`	if `parallel = TRUE`, the number of processes used in the parallel computation. The default value `NULL` means that the number of cores of the computer in use is taken.
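
For orientation, the number of cores that the default `n_cores = NULL` refers to can be inspected with the base package parallel. Whether GFDmcv() determines it in exactly this way is an assumption made here, and the variable names are only illustrative.

```r
# Number of logical cores reported by the system; detectCores() may
# return NA on some platforms, hence the fallback to 1.
n_available <- parallel::detectCores()
if (is.na(n_available)) n_available <- 1L

# Optional: leave one core free on a shared machine and pass the result
# explicitly, e.g. GFDmcv(..., parallel = TRUE, n_cores = n_cores).
n_cores <- max(1L, n_available - 1L)
n_cores
```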

Details

The function GFDmcv() calculates the Wald-type statistic for global null hypotheses of the form

\mathcal H_0: \mathbf{H} (C_1,\ldots,C_k)^\top = \mathbf{0}\,\,\text{and}\,\,\mathcal H_0: \mathbf{H} (B_1,\ldots,B_k)^\top = \mathbf{0},

where \mathbf{H} is a contrast matrix reflecting the research question of interest and C_i (B_i) are the subgroup-specific MCVs (their reciprocals) in the variant of Reyment (1960, RR), Van Valen (1974, VV), Voinov and Nikulin (1996, VN) or Albert and Zhang (2010, AZ). We refer to the function e_mcv() for the detailed definitions of the different variants. The p-value of the Wald-type statistic is based on a \chi^2-approximation, a (pooled) bootstrap approach or a (pooled) permutation approach.
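
To make the role of \mathbf{H} concrete, the following base R sketch builds two common contrast matrices by hand and applies them to a vector of equal MCVs. The helpers centering_matrix() and all_pairs_contrasts() are hypothetical names introduced only for illustration. In the package, contr_mat() (used in the Examples) is the intended way to obtain such matrices, and its row order or sign convention may differ from this sketch.

```r
# Hypothetical helpers for illustration only; they are not part of GFDmcv.

# Centering matrix P_k = I_k - J_k / k, where J_k is the k x k matrix of ones.
# P_k %*% C = 0 holds exactly when all entries of C are equal.
centering_matrix <- function(k) {
  diag(k) - matrix(1 / k, nrow = k, ncol = k)
}

# All-pairs (Tukey-type) contrasts: one row per pair (i, j) with i < j,
# holding -1 in column i and 1 in column j.
all_pairs_contrasts <- function(k) {
  pairs <- utils::combn(k, 2)
  h <- matrix(0, nrow = ncol(pairs), ncol = k)
  for (l in seq_len(ncol(pairs))) {
    h[l, pairs[1, l]] <- -1
    h[l, pairs[2, l]] <- 1
  }
  h
}

p3 <- centering_matrix(3)
h3 <- all_pairs_contrasts(3)

# Made-up vector of three equal MCVs: both matrices map it to zero
# (up to floating-point error), i.e. the null hypothesis holds.
c_equal <- c(0.2, 0.2, 0.2)
p3 %*% c_equal
h3 %*% c_equal

# Two-way layout with 2 x 2 groups: the Kronecker product of two centering
# matrices yields rows proportional to the interaction contrast (1, -1, -1, 1).
p2 <- centering_matrix(2)
kronecker(p2, p2)
```

Every row of these matrices sums to zero, which is the defining property of a contrast matrix.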

Furthermore, the function GFDmcv() calculates a max-type test statistic for the multiple comparison of r local null hypotheses, one for each row \mathbf{h}_\ell^\top of the contrast matrix h_mct:

\mathcal H_{0,\ell}: \mathbf{h}_\ell^\top \mathbf{C} = 0\,\, \text{or}\,\,\mathcal H_{0,\ell}: \mathbf{h}_\ell^\top \mathbf{B} = 0, \,\,\ell=1,\ldots,r,

where \mathbf{C}=(C_1,\ldots,C_k)^\top and \mathbf{B}=(B_1,\ldots,B_k)^\top. The p-values are determined by a Gaussian approximation and by a bootstrap approach. In addition to the local test decisions, multiplicity-adjusted (simultaneous) confidence intervals for the contrasts \mathbf{h}_\ell^\top \mathbf{C} and \mathbf{h}_\ell^\top \mathbf{B} are calculated.
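
The general idea of a max-type multiple contrast test with a Gaussian approximation can be sketched in base R as follows. This is a generic illustration and not the code of GFDmcv(): the estimates and standard errors are made-up toy values, the groups are assumed to be independent, and the critical value is obtained by plain Monte Carlo simulation rather than by whatever method the package uses internally.

```r
# Generic sketch of a max-type multiple contrast test; NOT the code of GFDmcv.
set.seed(1)

h <- rbind(c(-1, 1, 0), c(-1, 0, 1), c(0, -1, 1))  # Tukey-type contrasts
c_hat <- c(0.10, 0.12, 0.20)      # toy estimates of C_1, C_2, C_3
se_hat <- c(0.020, 0.022, 0.025)  # toy standard errors (independent groups)
alpha <- 0.05

# Contrast estimates, their standard errors and the local test statistics.
est <- drop(h %*% c_hat)
se_contr <- sqrt(drop(h^2 %*% se_hat^2))
t_stat <- est / se_contr
max_stat <- max(abs(t_stat))

# Simulate the joint Gaussian limit of the standardized contrasts under the
# null hypothesis and record the maximum absolute value of each draw.
n_sim <- 1e5
y <- matrix(rnorm(n_sim * 3), ncol = 3) %*% diag(se_hat)
z <- sweep(y %*% t(h), 2, se_contr, "/")
max_abs_z <- pmax(abs(z[, 1]), abs(z[, 2]), abs(z[, 3]))

# Common critical value, adjusted p-values and simultaneous intervals.
crit <- unname(quantile(max_abs_z, 1 - alpha))
p_adj <- vapply(abs(t_stat), function(t) mean(max_abs_z >= t), numeric(1))
ci <- cbind(estimate = est,
            lower = est - crit * se_contr,
            upper = est + crit * se_contr)
reject <- abs(t_stat) > crit

ci
p_adj
reject
```

Because one critical value is shared by all contrasts, a local hypothesis is rejected exactly when zero lies outside its simultaneous confidence interval.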

Please have a look at the plot and summary functions designed for this package, which can be used to simplify the output of GFDmcv().

Value

A list of class gfdmcv containing the following components:

`overall_res`	a list of two elements representing the results for testing the global null hypothesis. The first one is a matrix `test_stat` of the test statistics, while the second is a matrix `p_values` of the respective `p`-values.
`mct_res`	all results of the multiple contrast tests (MCT) for the particular hypotheses in `h_mct`, i.e., the estimators and simultaneous confidence intervals for `\mathbf{h}_\ell^\top \mathbf{C}` and for `\mathbf{h}_\ell^\top \mathbf{B}`, the test statistics and critical values as well as the test decisions.
`h_mct`	the argument `h_mct`.
`h_wald`	the argument `h_wald`.
`alpha`	the argument `alpha`.
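
As a hedged illustration of how these components can be accessed, the sketch below runs GFDmcv() on the iris data with a reduced number of replicates (chosen here only to keep the run short) and inspects the returned list. The component names are those listed above; the internal layout of `mct_res` is not described on this page, so it is only shown with str().

```r
# Runs only if the package is installed.
if (requireNamespace("GFDmcv", quietly = TRUE)) {
  library(GFDmcv)

  # One matrix per species, in the order setosa, versicolor, virginica.
  data_set <- unname(lapply(split(iris[, 1:3], iris$Species), as.matrix))
  k <- length(data_set)

  res <- GFDmcv(data_set,
                h_mct = contr_mat(k, type = "Tukey"),
                h_wald = contr_mat(k, type = "center"),
                n_perm = 200, n_boot = 200)

  print(class(res))                  # class of the returned object
  print(res$overall_res$test_stat)   # global test statistics
  print(res$overall_res$p_values)    # corresponding p-values
  str(res$mct_res, max.level = 1)    # results of the multiple contrast tests
  print(res$alpha)                   # significance level that was used
}
```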

References

Albert A., Zhang L. (2010) A novel definition of the multivariate coefficient of variation. Biometrical Journal 52:667-675.

Ditzhaus M., Smaga L. (2022) Permutation test for the multivariate coefficient of variation in factorial designs. Journal of Multivariate Analysis 187, 104848.

Ditzhaus M., Smaga L. (2023) Inference for all variants of the multivariate coefficient of variation in factorial designs. Preprint https://arxiv.org/abs/2301.12009.

Reyment R.A. (1960) Studies on Nigerian Upper Cretaceous and Lower Tertiary Ostracoda: part 1. Senonian and Maastrichtian Ostracoda, Stockholm Contributions in Geology, vol 7.

Van Valen L. (1974) Multivariate structural statistics in natural history. Journal of Theoretical Biology 45:235-247.

Voinov V., Nikulin M. (1996) Unbiased Estimators and Their Applications, Vol. 2, Multivariate Case. Kluwer, Dordrecht.

Examples

# Some of the examples may take some time to run.

# one-way analysis for MCV and CV
# d > 1 (MCV)
data_set <- lapply(list(iris[iris$Species == "setosa", 1:3],
                        iris[iris$Species == "versicolor", 1:3],
                        iris[iris$Species == "virginica", 1:3]),
                   as.matrix)
# estimators and confidence intervals of MCVs and their reciprocals
lapply(data_set, e_mcv)
# contrast matrices
k <- length(data_set)
# Tukey's contrast matrix
h_mct <- contr_mat(k, type = "Tukey")
# centering matrix P_k
h_wald <- contr_mat(k, type = "center")

# testing without parallel computing
res <- GFDmcv(data_set, h_mct, h_wald)
summary(res, digits = 3)
oldpar <- par(mar = c(4, 5, 2, 0.3))
plot(res)
par(oldpar)

# testing with parallel computing
library(doParallel)
res <- GFDmcv(data_set, h_mct, h_wald, parallel = TRUE, n_cores = 2)
summary(res, digits = 3)
oldpar <- par(mar = c(4, 5, 2, 0.3))
plot(res)
par(oldpar)

# two-way analysis for CV (based on the example in Ditzhaus and Smaga, 2022)
library(HSAUR)
data_set <- lapply(list(BtheB$bdi.pre[BtheB$drug == "No" & BtheB$length == "<6m"],
                        BtheB$bdi.pre[BtheB$drug == "No" & BtheB$length == ">6m"],
                        BtheB$bdi.pre[BtheB$drug == "Yes" & BtheB$length == "<6m"],
                        BtheB$bdi.pre[BtheB$drug == "Yes" & BtheB$length == ">6m"]), 
                   as.matrix)
# estimators and confidence intervals of CV and its reciprocal
lapply(data_set, e_mcv)

# interaction
h_mct <- contr_mat(4, type = "Tukey")
h_wald <- kronecker(contr_mat(2, type = "center"), 
                    contr_mat(2, type = "center"))
res <- GFDmcv(data_set, h_mct, h_wald)
summary(res, digits = 3)
oldpar <- par(mar = c(4, 6, 2, 0.1))
plot(res)
par(oldpar)

# main effect drug
h_mct <- matrix(c(1, 1, -1, -1), nrow = 1)
h_wald <- kronecker(contr_mat(2, type = "center"), 0.5 * matrix(1, 1, 2))
res <- GFDmcv(data_set, h_mct, h_wald)
summary(res, digits = 3)
oldpar <- par(mar = c(4, 6, 2, 0.1))
plot(res)
par(oldpar)

# main effect length
h_mct <- matrix(c(1, -1, 1, -1), nrow = 1)
h_wald <- kronecker(0.5 * matrix(1, 1, 2), contr_mat(2, type = "center"))
res <- GFDmcv(data_set, h_mct, h_wald)
summary(res, digits = 3)
oldpar <- par(mar = c(4, 6, 2, 0.1))
plot(res)
par(oldpar)

[Package GFDmcv version 0.1.0 Index]