ACDCmod {modACDC}    R Documentation

ACDCmod

Description

ACDCmod detects differential co-expression between a set of genes, such as a module of co-expressed genes, and a set of external features (exposures or responses) by using canonical correlation analysis (CCA) on the external features and module co-expression values. Modules are provided by the user.

Usage

ACDCmod(
  fullData,
  modules,
  externalVar,
  identifierList = colnames(fullData),
  numNodes = 1
)

Arguments

fullData

data frame or matrix with samples as rows, all probes as columns; each entry should be numeric gene expression or other molecular data values

modules

list in which each element contains the column indices in fullData of the features that make up one module

externalVar

data frame, matrix, or vector containing the external variable data to be used for CCA, with samples as rows; all elements must be numeric

identifierList

optional vector of identifiers (e.g., HUGO symbols for genes) with the same length and order as the columns in fullData; default is the column names of fullData

numNodes

number of available compute nodes for parallelization; default is 1
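As a rough illustration of the input shapes described above, the following sketch builds toy inputs in base R. The data are simulated and every object name and value is hypothetical. Representing modules as a plain list of integer index vectors is an assumption based on the Examples section. The call to ACDCmod is left commented out so the block runs without the package installed.

```r
# Toy inputs only: simulated values, hypothetical names.
set.seed(42)
n_samples <- 12

# samples as rows, probes as columns, all numeric
fullData <- matrix(rnorm(n_samples * 6), nrow = n_samples,
                   dimnames = list(NULL, paste0("probe", 1:6)))

# assumed format: one integer vector of column indices per module
modules <- list(c(1, 2, 3), c(4, 5))

# external variables must be numeric, so a factor is converted to its
# integer codes here (the same conversion the Examples section uses)
diet <- factor(sample(c("a", "b", "c"), n_samples, replace = TRUE))
externalVar <- data.frame(diet = as.numeric(diet),
                          age  = rnorm(n_samples, mean = 50, sd = 5))

# basic sanity checks before running the analysis
stopifnot(
  is.numeric(fullData),
  all(unlist(modules) %in% seq_len(ncol(fullData))),
  nrow(externalVar) == nrow(fullData),
  all(vapply(externalVar, is.numeric, logical(1)))
)

# number of feature pairs per module, assuming one co-expression
# value per pair of features
n_pairs <- choose(lengths(modules), 2)

# ACDCmod(fullData = fullData, modules = modules, externalVar = externalVar)
```

Under the pairwise assumption, a module with two features yields a single co-expression value, which is the special case discussed in Details.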

Details

For more information about how the co-expression features are calculated, see the coVar documentation.

For each module, CCA determines the linear combinations of the co-expression features and of the external features that are maximally correlated with each other. A Wilks-Lambda test is then performed to determine whether the correlation between these linear combinations is significant; a significant result implies differential co-expression.

If a module has only one co-expression value (i.e., two features in the module) and there is a single external variable, CCA reduces to a simple correlation test, and the t-distribution is used to test for significant correlation (Widmann, 2005).

If the number of co-expression features in a module is larger than the number of samples, CCA will return correlation coefficients of 1, and p-values and BH FDR q-values will not be calculated. See ACDChighdim for our solution to this case.
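The testing step can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: wilks_test is a hypothetical helper, the data are simulated stand-ins for co-expression features, and the use of Rao's F approximation to the Wilks' lambda statistic is an assumption about which approximation is meant. Only stats::cancor, pf, and cor.test from base R are used.

```r
# Simulated stand-ins: 3 "co-expression" features and 2 external variables.
set.seed(1)
n <- 40
coexp <- matrix(rnorm(n * 3), nrow = n)
ext   <- matrix(rnorm(n * 2), nrow = n)
ext[, 1] <- ext[, 1] + 0.5 * coexp[, 1]   # induce some association

# Hypothetical helper: joint test that all canonical correlations are zero,
# using Wilks' lambda with Rao's F approximation (assumed, see lead-in).
wilks_test <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x); p <- ncol(x); q <- ncol(y)

  rho    <- cancor(x, y)$cor        # canonical correlations
  lambda <- prod(1 - rho^2)         # Wilks' lambda

  m     <- n - 1.5 - (p + q) / 2
  denom <- p^2 + q^2 - 5
  # s equals 1 whenever min(p, q) == 1; the guard avoids 0/0 for (1, 2)
  s     <- if (denom > 0) sqrt((p^2 * q^2 - 4) / denom) else 1
  df1   <- p * q
  df2   <- m * s - df1 / 2 + 1

  f_stat <- (1 - lambda^(1 / s)) / lambda^(1 / s) * df2 / df1
  list(cor = rho, lambda = lambda, F = f_stat, df1 = df1, df2 = df2,
       p_value = pf(f_stat, df1, df2, lower.tail = FALSE))
}

full <- wilks_test(coexp, ext)

# One co-expression value and one external variable: the F test above
# coincides with the usual t-test for a Pearson correlation.
single <- wilks_test(coexp[, 1], ext[, 1])
t_test <- cor.test(coexp[, 1], ext[, 1])
```

In the single-pair case the statistic has 1 and n - 2 degrees of freedom and equals the squared t statistic, so single$p_value and t_test$p.value agree up to floating-point error.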

Value

Tibble, sorted by ascending BH FDR value, with columns:

moduleNum

module identifier

colNames

list of column names from fullData of the features in the module

features

list of identifiers from input parameter "identifierList" for all features in the module

CCA_corr

list of CCA canonical correlation coefficients

CCA_pval

Wilks-Lambda F-test p-value; t-test p-value if there are only 2 features in the module and a single external variable

BHFDR_qval

Benjamini-Hochberg false discovery rate q-value
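To illustrate how the q-value column relates to the p-value column, the sketch below applies the Benjamini-Hochberg adjustment with base R's p.adjust and sorts the result in ascending order. The p-values are made up for illustration, and a plain data frame stands in for the returned tibble; only the column names are taken from the description above.

```r
# Hypothetical per-module p-values (not real results)
res <- data.frame(moduleNum = 1:4,
                  CCA_pval  = c(0.01, 0.04, 0.03, 0.20))

# Benjamini-Hochberg adjustment across modules
res$BHFDR_qval <- p.adjust(res$CCA_pval, method = "BH")

# sort by ascending q-value, mirroring the documented ordering
res <- res[order(res$BHFDR_qval), ]

# modules passing an illustrative 10% FDR threshold
sig <- res[res$BHFDR_qval < 0.10, ]
```

The 10% threshold is only an example; the appropriate cutoff depends on the study.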

Author(s)

Katelyn Queen, kjqueen@usc.edu

References

Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 57 (1995) 289–300.

Martin P, et al. Novel aspects of PPARalpha-mediated regulation of lipid and xenobiotic metabolism revealed through a nutrigenomic study. Hepatology, in press, 2007.

Millstein J, Battaglin F, Barrett M, Cao S, Zhang W, Stintzing S, et al. Partition: a surjective mapping approach for dimensionality reduction. Bioinformatics 36 (2019) 676–681. doi:10.1093/bioinformatics/btz661.

Queen K, Nguyen MN, Gilliland F, Chun S, Raby BA, Millstein J. ACDC: a general approach for detecting phenotype or exposure associated co-expression. Frontiers in Medicine (2023) 10. doi:10.3389/fmed.2023.1118824.

Widmann M. One-Dimensional CCA and SVD, and Their Relationship to Regression Maps. Journal of Climate 18 (2005) 2785–2792. doi:10.1175/jcli3424.1.

Examples

# load CCA package for example dataset
library(CCA)

# load dataset
data("nutrimouse")

# partition dataset and save modules
library(partition)
part <- partition(nutrimouse$lipid, threshold = 0.50)
mods <- part$mapping_key[which(grepl("reduced_var_", part$mapping_key$variable)), ]$mapping

# run function for diet and genotype
ACDCmod(fullData = nutrimouse$lipid,
        modules = mods,
        externalVar = data.frame(diet=as.numeric(nutrimouse$diet), 
                                  genotype=as.numeric(nutrimouse$genotype)))


[Package modACDC version 2.0.1 Index]