R: ACDCmod

ACDCmod {modACDC}

R Documentation

ACDCmod

Description

ACDCmod detects differential co-expression between a set of genes, such as a module of co-expressed genes, and a set of external features (exposures or responses) by using canonical correlation analysis (CCA) on the external features and module co-expression values. Modules are provided by the user.

Usage

ACDCmod(
  fullData,
  modules,
  externalVar,
  identifierList = colnames(fullData),
  numNodes = 1
)

Arguments

`fullData`	data frame or matrix with samples as rows, all probes as columns; each entry should be numeric gene expression or other molecular data values
`modules`	vector of lists where each list contains indices of column locations in fullData that specify features in each module
`externalVar`	data frame, matrix, or vector containing external variable data to be used for CCA, rows are samples; all elements must be numeric
`identifierList`	optional row vector of identifiers with the same length and order as the columns of fullData (e.g., HUGO symbols for genes); defaults to the column names of fullData
`numNodes`	number of available compute nodes for parallelization; default is 1
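
The input shapes described above can be sketched with simulated data. This is a minimal illustration, not output from the package: the matrix `expr`, the module groupings in `mods`, the variable `ext`, and the identifiers in `ids` are all made up, and the sketch assumes that a plain list of integer index vectors is an acceptable form for `modules`.

```r
set.seed(1)

# hypothetical expression matrix: 20 samples (rows) x 6 probes (columns)
expr <- matrix(rnorm(20 * 6), nrow = 20,
               dimnames = list(paste0("sample", 1:20), paste0("probe", 1:6)))

# hypothetical modules, given as column positions in expr
mods <- list(c(1, 2, 3), c(4, 5))

# one numeric external variable per sample (rows in the same order as expr)
ext <- data.frame(exposure = rnorm(20))

# identifiers in the same order as the columns of expr
ids <- paste0("GENE", seq_len(ncol(expr)))

# basic checks on the shapes the arguments are documented to have
stopifnot(
  is.numeric(expr),
  nrow(ext) == nrow(expr),
  all(vapply(ext, is.numeric, logical(1))),
  length(ids) == ncol(expr),
  all(unlist(mods) %in% seq_len(ncol(expr)))
)
```

With the package loaded, these objects would be passed as `ACDCmod(fullData = expr, modules = mods, externalVar = ext, identifierList = ids)`.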

Details

For more information about how the co-expression features are calculated, see the coVar documentation.

Following CCA, which for each module finds linear combinations of the co-expression features and of the external features that are maximally correlated with each other, a Wilks' lambda test is performed to determine whether the correlation between these linear combinations is significant. A significant result implies differential co-expression. If a module has only one co-expression value (i.e., two features in the module) and there is a single external variable, CCA reduces to a simple correlation test, and the t-distribution is used to test for significant correlation (Widmann, 2005). If the number of co-expression features in a module is larger than the number of samples, CCA will return correlation coefficients of 1, and p-values and BH FDR q-values will not be calculated. See ACDChighdim for our solution.
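
The testing logic described above can be illustrated with base R. The following is a sketch under stated assumptions, not the package's implementation: `coexp` is simulated stand-in data rather than real co-expression values from coVar, and `stats::cancor` and `stats::cor.test` are used in place of whatever routines ACDCmod calls internally.

```r
set.seed(42)
n <- 30

# stand-in for three co-expression features of one module (simulated)
coexp <- matrix(rnorm(n * 3), nrow = n)
# two simulated external variables; the first is related to the first feature
ext <- cbind(coexp[, 1] + rnorm(n), rnorm(n))

# canonical correlations between the two sets of variables
rho <- cancor(coexp, ext)$cor

# Wilks' lambda statistic: product of (1 - rho_i^2); values near 0
# indicate a strong association between the two sets
wilks_lambda <- prod(1 - rho^2)

# special case: one co-expression value and one external variable;
# the single canonical correlation equals |Pearson r|, tested with a t-test
x1 <- coexp[, 1]
y1 <- ext[, 1]
rho_1d <- cancor(x1, y1)$cor
r_test <- cor.test(x1, y1)  # t-distribution with n - 2 degrees of freedom

# high-dimensional case: more features (12) than samples (8)
# yields a canonical correlation of 1 regardless of the data
wide <- matrix(rnorm(8 * 12), nrow = 8)
rho_wide <- cancor(wide, rnorm(8))$cor
```

The last step shows why no p-value is meaningful when features outnumber samples: the linear combination can fit any external variable perfectly, even pure noise.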

Value

Tibble, sorted by ascending BH FDR value, with columns

moduleNum: module identifier
colNames: list of column names from fullData of the features in the module
features: list of identifiers from input parameter "identifierList" for all features in the module
CCA_corr: list of CCA canonical correlation coefficients
CCA_pval: Wilks' lambda F-test p-value; t-test p-value if there are only 2 features in the module and a single external variable
BHFDR_qval: Benjamini-Hochberg false discovery rate q-value
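
The relationship between the CCA_pval and BHFDR_qval columns, and the sort order of the output, can be sketched with `stats::p.adjust`. The p-values below are invented for illustration, a plain data frame stands in for the returned tibble, and the sketch assumes the Benjamini-Hochberg adjustment is applied across all modules in a single call.

```r
# hypothetical per-module p-values (not real output)
res <- data.frame(moduleNum = 1:4,
                  CCA_pval  = c(0.2, 0.001, 0.04, 0.03))

# Benjamini-Hochberg adjustment across modules
res$BHFDR_qval <- p.adjust(res$CCA_pval, method = "BH")

# sort by ascending q-value, matching the documented order of the output
res <- res[order(res$BHFDR_qval), ]

# modules passing a 10% FDR threshold (the threshold is the analyst's choice)
sig <- res[res$BHFDR_qval < 0.10, ]
```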

Author(s)

Katelyn Queen, kjqueen@usc.edu

References

Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 57 (1995) 289–300.

Martin P, et al. Novel aspects of PPARalpha-mediated regulation of lipid and xenobiotic metabolism revealed through a nutrigenomic study. Hepatology, in press, 2007.

Millstein J, Battaglin F, Barrett M, Cao S, Zhang W, Stintzing S, et al. Partition: a surjective mapping approach for dimensionality reduction. Bioinformatics 36 (2019) 676–681. doi:10.1093/bioinformatics/btz661.

Queen K, Nguyen MN, Gilliland F, Chun S, Raby BA, Millstein J. ACDC: a general approach for detecting phenotype or exposure associated co-expression. Frontiers in Medicine (2023) 10. doi:10.3389/fmed.2023.1118824.

Widmann M. One-Dimensional CCA and SVD, and Their Relationship to Regression Maps. Journal of Climate 18 (2005) 2785–2792. doi:10.1175/jcli3424.1.

Examples

# load CCA package for example dataset
library(CCA)

# load dataset
data("nutrimouse")

# partition dataset and save modules
library(partition)
part <- partition(nutrimouse$lipid, threshold = 0.50)
mods <- part$mapping_key[which(grepl("reduced_var_", part$mapping_key$variable)), ]$mapping

# run function for diet and genotype
ACDCmod(fullData = nutrimouse$lipid,
        modules = mods,
        externalVar = data.frame(diet=as.numeric(nutrimouse$diet), 
                                  genotype=as.numeric(nutrimouse$genotype)))

[Package modACDC version 2.0.1 Index]