network.test.edges {GeneNet}

R Documentation

Graphical Gaussian Models: Assess Significance of Edges (and Directions)

Description

network.test.edges returns a data frame containing all edges, listed in order of the magnitude of the partial correlation associated with each edge. If fdr=TRUE, the p-values, q-values and posterior probabilities (= 1 - local fdr) are additionally computed for each potential edge.

extract.network returns a data frame with a subset of significant edges.

Usage

network.test.edges(r.mat, fdr=TRUE, direct=FALSE, plot=TRUE, ...)
extract.network(network.all, method.ggm=c("prob", "qval","number"), 
      cutoff.ggm=0.8, method.dir=c("prob","qval","number", "all"), 
      cutoff.dir=0.8, verbose=TRUE)

Arguments

`r.mat`	matrix of partial correlations
`fdr`	estimate q-values and local fdr
`direct`	compute additional statistics for obtaining a partially directed network
`plot`	plot density and distribution function and (local) fdr values
`...`	parameters passed on to `fdrtool`
`network.all`	data frame with partial correlations and fdr values for all potential edges (i.e. the output of `network.test.edges`)
`method.ggm`	determines which criterion is used to select significant partial correlations (default: prob)
`cutoff.ggm`	default cutoff for significant partial correlations
`method.dir`	determines which criterion is used to select significant directions (default: prob)
`cutoff.dir`	default cutoff for significant directions
`verbose`	print information on the number of significant edges etc.
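
The three values of `method.ggm` can be compared with a short sketch like the one below. It uses only the functions and arguments listed above plus the `ecoli` data from the Examples section. The cutoffs (0.9, 0.05 and 50) are illustrative choices, and the comments describe the assumed meaning of `cutoff.ggm` for each criterion (a lower bound on `prob`, an upper bound on `qval`, or the number of top-ranked edges to keep).

```r
library("GeneNet")
data(ecoli)

# Partial correlations and edge statistics; plot=FALSE skips the diagnostic plots
pcor  <- ggm.estimate.pcor(ecoli)
edges <- network.test.edges(pcor, plot=FALSE)

# Default criterion: posterior probability (assumed: keep edges with prob > cutoff)
net.prob <- extract.network(edges, method.ggm="prob", cutoff.ggm=0.9, verbose=FALSE)

# q-value criterion (assumed: keep edges with qval < cutoff)
net.qval <- extract.network(edges, method.ggm="qval", cutoff.ggm=0.05, verbose=FALSE)

# Fixed number of edges (assumed: keep the 50 strongest partial correlations)
net.top <- extract.network(edges, method.ggm="number", cutoff.ggm=50, verbose=FALSE)

# Compare how many edges each criterion retains
c(prob=nrow(net.prob), qval=nrow(net.qval), number=nrow(net.top))
```

Because the rows of the input are sorted by absolute partial correlation, the "number" criterion simply takes the leading rows, whereas the other two depend on the fitted fdr values.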

Details

For assessing the significance of edges in the GGM, a mixture model is fitted to the partial correlations using fdrtool. This results in (i) two-sided p-values for the test of non-zero correlation, (ii) corresponding posterior probabilities (= 1 - local fdr), and (iii) tail area-based q-values. See Schäfer and Strimmer (2005) for details.
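
To make the relation between the mixture model, the p-values and the posterior probabilities concrete, here is a simplified base-R sketch. It is not the algorithm used by fdrtool: the mixture parameters `eta0` and `kappa` are made-up values rather than estimates, the alternative component is assumed to be uniform on [-1, 1], and the helper names `dcor.null`, `dalt` and `local.fdr` are hypothetical.

```r
# Null density of a (partial) correlation coefficient with kappa degrees of freedom
dcor.null <- function(r, kappa) {
  (1 - r^2)^((kappa - 3) / 2) / beta(0.5, (kappa - 1) / 2)
}

# Illustrative mixture parameters (assumed, not estimated from data)
eta0  <- 0.9   # proportion of null (absent) edges
kappa <- 20    # degrees of freedom of the null density

# Alternative component: uniform on [-1, 1] (simplifying assumption)
dalt <- function(r) rep(0.5, length(r))

# Local fdr = null part of the mixture density divided by the full mixture density
local.fdr <- function(r) {
  f0 <- eta0 * dcor.null(r, kappa)
  f0 / (f0 + (1 - eta0) * dalt(r))
}

r <- c(0.05, 0.3, 0.6, 0.9)

# Posterior probability that an edge is present
prob <- 1 - local.fdr(r)

# Two-sided p-value: null probability of a correlation at least as large in magnitude
pval <- sapply(abs(r), function(a) 2 * integrate(dcor.null, a, 1, kappa=kappa)$value)

data.frame(pcor=r, pval=pval, prob=prob)
```

In this toy setting, larger absolute partial correlations yield smaller p-values and larger posterior probabilities, which is the ordering that the `pval` and `prob` columns follow.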

For determining putative directions on this GGM, log-ratios of standardized partial variances are estimated, and subsequently the corresponding (local) fdr values are computed; see Opgen-Rhein and Strimmer (2007).
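
A minimal sketch of requesting the direction statistics is shown below. It relies only on the arguments documented on this page and on the `ecoli` data from the Examples section; the cutoff of 0.9 for both edges and directions is an arbitrary illustrative choice.

```r
library("GeneNet")
data(ecoli)

# Partial correlations, as in the Examples section
pcor <- ggm.estimate.pcor(ecoli)

# direct=TRUE adds the log.spvar, pval.dir, qval.dir and prob.dir columns
edges.dir <- network.test.edges(pcor, direct=TRUE, plot=FALSE)
colnames(edges.dir)

# Select edges and directions, both by posterior probability
net.dir <- extract.network(edges.dir,
                           method.ggm="prob", cutoff.ggm=0.9,
                           method.dir="prob", cutoff.dir=0.9)
head(net.dir)
```

An edge that passes `cutoff.ggm` but not `cutoff.dir` would remain in the network as an undirected edge, which is why the result is described as partially directed.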

Value

network.test.edges returns a data frame with the following columns:

`pcor`	partial correlation (from r.mat)
`node1`	first node connected to edge
`node2`	second node connected to edge
`pval`	p-value
`qval`	q-value
`prob`	probability that edge is nonzero (= 1 - local fdr)
`log.spvar`	log ratio of standardized partial variance (determines direction)
`pval.dir`	p-value (directions)
`qval.dir`	q-value (directions)
`prob.dir`	1-local fdr (directions)

Each row in the data frame corresponds to one edge, and the rows are sorted according to the absolute strength of the partial correlation (from strongest to weakest).

extract.network processes the above data frame containing all potential edges, and returns a data frame with a subset of edges. If applicable, a final column (11) holds additional information on the directionality of each edge.

Author(s)

Rainer Opgen-Rhein, Juliane Schäfer, Korbinian Strimmer (https://strimmerlab.github.io).

References

Schäfer, J., and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754-764.

Opgen-Rhein, R., and Strimmer, K. (2007). From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Syst. Biol. 1:37.

Examples

# load GeneNet library
library("GeneNet")
 
# ecoli data 
data(ecoli)

# estimate partial correlation matrix 
inferred.pcor <- ggm.estimate.pcor(ecoli)

# p-values, q-values and posterior probabilities for each potential edge
test.results <- network.test.edges(inferred.pcor)

# show best 20 edges (strongest correlation)
test.results[1:20,]

# extract network containing edges with prob > 0.9 (i.e. local fdr < 0.1)
net <- extract.network(test.results, cutoff.ggm=0.9)
net

# how many are significant based on FDR cutoff Q=0.05 ?
num.significant.1 <- sum(test.results$qval <= 0.05)
num.significant.1
# logical indexing also works when no edge is significant (1:0 would not)
test.results[test.results$qval <= 0.05,]

# how many are significant based on "local fdr" cutoff (prob > 0.9) ?
num.significant.2 <- sum(test.results$prob > 0.9)
num.significant.2
test.results[test.results$prob > 0.9,]

# parameters of the mixture distribution used to compute p-values etc.
# (the result is not named "c", to avoid masking base::c)
fdr.out <- fdrtool(sm2vec(inferred.pcor), statistic="correlation")
fdr.out$param

[Package GeneNet version 1.2.16 Index]