scMANOVA {semicontMANOVA}    R Documentation

Multivariate ANalysis Of VAriance Inference and Test with Ridge Regularization for Semicontinuous High-Dimensional Data

Description

scMANOVA performs Multivariate ANalysis Of VAriance (MANOVA) inference and testing with ridge regularization in the presence of semicontinuous high-dimensional data. The test is based on a Likelihood Ratio Test statistic, and the p-value can be computed either from the asymptotic distribution (p.value.perm = FALSE) or via a permutation procedure (p.value.perm = TRUE). The regularization parameters can either be supplied as input or be chosen through an optimization procedure.
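The permutation route to a p-value can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper perm_p_value, the toy statistic mean_diff_stat, and the "+1" correction are all assumptions made for illustration, and the real test uses the regularized Likelihood Ratio Test statistic instead of the toy statistic shown here.

```r
# Generic permutation p-value (illustrative; hypothetical helper, not package code).
perm_p_value <- function(x, n, stat_fun, B = 500) {
  groups <- rep(seq_along(n), times = n)   # group label for each row of x
  observed <- stat_fun(x, groups)
  # Recompute the statistic after randomly reassigning the group labels.
  permuted <- replicate(B, stat_fun(x, sample(groups)))
  # Share of permuted statistics at least as large as the observed one;
  # the +1 terms count the observed labelling as one permutation (an assumption).
  (sum(permuted >= observed) + 1) / (B + 1)
}

# Toy two-group statistic (hypothetical): squared distance between group means.
mean_diff_stat <- function(x, groups) {
  m1 <- colMeans(x[groups == 1, , drop = FALSE])
  m2 <- colMeans(x[groups == 2, , drop = FALSE])
  sum((m1 - m2)^2)
}

set.seed(1)
x_toy <- matrix(rnorm(10 * 4), nrow = 10)  # 10 units, 4 variables
p_val <- perm_p_value(x_toy, n = c(5, 5), stat_fun = mean_diff_stat, B = 200)
p_val
```

Larger values of B give a finer-grained p-value at a proportionally higher computational cost.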

Usage

scMANOVA(x, n, lambda = NULL, lambda0 = NULL, lambda.step = 0.1,
  ident = FALSE, tol = 1e-08, penalty = function(n, p) log(n),
  B = 500, p.value.perm = FALSE, fixed.lambda = FALSE, ...)

Arguments

x

data.frame or matrix of data with units on the rows and variables on the columns

n

vector. The length corresponds to the number of groups, the elements to the number of observations in each group

lambda

NULL, a scalar or a vector of length 2. Ridge regularization parameter. If NULL, the optimal value of lambda is searched in the interval [0,100]; if a vector of length 2, it is searched in the specified interval; if a scalar, it is used directly as the optimal value

lambda0

NULL, a scalar or a vector of length 2. Ridge regularization parameter under the null hypothesis. If NULL, the optimal value of lambda0 is searched in the interval [0,100]; if a vector of length 2, it is searched in the specified interval; if a scalar, it is used directly as the optimal value

lambda.step

scalar. Step size used in the optimization procedure to find the smallest value of lambda (and lambda0) that makes the covariance matrices, under the alternative and under the null hypotheses, nonsingular

ident

logical. If TRUE, lambda times the identity matrix is added to the raw estimated covariance matrix; if FALSE, the diagonal values of the raw estimated covariance matrix are used instead

tol

scalar. Used in the optimization procedure to find the smallest value of lambda (and lambda0) that makes the covariance matrices, under the alternative and under the null hypotheses, nonsingular

penalty

function with two arguments: sample size (n) and number of variables (p) used as penalty function in the definition of the Information Criterion to select the optimal values for lambda and lambda0

B

scalar. Number of permutations to run in the permutation test

p.value.perm

logical. If TRUE, a p-value from a permutation test is computed; otherwise, an asymptotic p-value is reported

fixed.lambda

logical. If TRUE, the optimal values for lambda and lambda0 are evaluated just once for the observed dataset and kept fixed during the permutation test; otherwise, optimal values are evaluated for each permuted dataset

...

further parameters passed to function scMANOVApermTest
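The two regularization forms selected by ident can be sketched in base R as follows. This is one reading of the argument description, not the package's code: the helper ridge_cov is hypothetical, and the ident = FALSE branch assumes that lambda times the diagonal of the raw covariance estimate is added.

```r
# Illustrative ridge regularization of a covariance matrix (hypothetical helper).
ridge_cov <- function(sigma, lambda, ident = FALSE) {
  p <- nrow(sigma)
  # ident = TRUE: add lambda * I; ident = FALSE: add lambda * diag(sigma) (assumed).
  target <- if (ident) diag(p) else diag(diag(sigma), nrow = p)
  sigma + lambda * target
}

set.seed(1)
x_toy <- matrix(rnorm(5 * 8), nrow = 5)   # 5 observations, 8 variables
s <- cov(x_toy)                           # rank at most 4, hence singular
s_ident <- ridge_cov(s, lambda = 0.5, ident = TRUE)
s_diag  <- ridge_cov(s, lambda = 0.5, ident = FALSE)

# Numerical rank before and after regularization
c(raw = qr(s)$rank, ident = qr(s_ident)$rank, diag = qr(s_diag)$rank)
```

With more variables than observations the raw estimate is singular, which is why a strictly positive lambda is needed before the likelihood can be evaluated.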

Value

An object of class scMANOVA, which is a list with the following components:

pi

matrix. Estimated proportion of missing values for each group

mu

matrix. Estimated mean vector for each group

sigmaRidge

matrix. Estimated covariance matrix with ridge regularization

sigma

matrix. Estimated covariance matrix by maximum likelihood

pi0

vector. Estimated proportion of missing values under the null hypothesis

mu0

vector. Estimated mean vector under the null hypothesis

sigma0Ridge

matrix. Estimated covariance matrix with ridge regularization under null hypothesis

sigma0

matrix. Estimated covariance matrix by maximum likelihood under null hypothesis

removed.vars

vector or NULL. Columns removed from the continuous part of the log-likelihood due to an insufficient number of observations in each group

logLikPi

scalar. Log-likelihood for the discrete part of the model

logLik

scalar. Log-likelihood

logLikPi0

scalar. Log-likelihood for the discrete part of the model under the null hypothesis

logLik0

scalar. Log-likelihood under null hypothesis

statistic

scalar. Wilks statistic

lambda

scalar. Regularization parameter

lambda0

scalar. Regularization parameter under null hypothesis

df

scalar. Model degrees of freedom

df0

scalar. Model degrees of freedom under null hypothesis

aic

scalar. Information criterion

aic0

scalar. Information criterion under null hypothesis

p.value

scalar. p-value of the Wilks statistic
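As a hedged usage sketch, the scalar components listed above can be collected from a fitted object with the usual list accessor. The component names are those documented here; the requireNamespace guard and the variable summary_row are illustrative additions, and the block does nothing if the package is not installed.

```r
# Runs only when semicontMANOVA is available; otherwise the block is skipped.
if (requireNamespace("semicontMANOVA", quietly = TRUE)) {
  set.seed(1234)
  n <- c(5, 5)
  x <- semicontMANOVA::scMANOVAsimulation(n = n, p = 20, pmiss = 0.1)
  res <- semicontMANOVA::scMANOVA(x = x, n = n)
  # Gather the documented scalar components into one named vector.
  summary_row <- c(statistic = res$statistic, p.value = res$p.value,
                   lambda = res$lambda, lambda0 = res$lambda0)
  print(summary_row)
}
```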

Author(s)

Elena Sabbioni, Claudio Agostinelli and Alessio Farcomeni

References

Elena Sabbioni, Claudio Agostinelli and Alessio Farcomeni (2024) A regularized MANOVA test for semicontinuous high-dimensional data. arXiv: http://arxiv.org/abs/2401.04036

See Also

scMANOVAestimation and scMANOVApermTest

Examples

  set.seed(1234)
  n <- c(5,5)
  p <- 20
  pmiss <- 0.1
  x <- scMANOVAsimulation(n=n, p=p, pmiss=pmiss)
  res.asy <- scMANOVA(x=x, n=n) # Asymptotic p-value
  res.asy

  res.perm <- scMANOVA(x=x, n=n, p.value.perm=TRUE) # p-value by permutation test
  res.perm

[Package semicontMANOVA version 0.1-8 Index]