dnapath {dnapath}    R Documentation

Differential Network Analysis Using Gene Pathways

Description

Integrates pathways into the differential network analysis of gene expression data (Grimes et al. 2019).

Usage

dnapath(
  x,
  pathway_list,
  group_labels = NULL,
  network_inference = run_pcor,
  n_perm = 100,
  lp = 2,
  seed = NULL,
  verbose = FALSE,
  mc.cores = 1,
  ...
)

Arguments

x

The gene expression data to be analyzed. This can be either (1) a list of two matrices or data frames that contain the gene expression profile from each of two populations (groups) – with rows corresponding to samples and columns to genes – or (2) a single matrix or data frame that contains the expression profiles for both groups. For case (2), the group_labels argument must be specified to identify which rows belong to which group.
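The two accepted forms of x can be sketched with simulated data. The object names (expr, groups, x_list) and the values are illustrative assumptions, not part of the package.

```r
# Simulated expression matrix: 6 samples (rows) by 3 genes (columns).
set.seed(1)
expr <- matrix(rnorm(18), nrow = 6,
               dimnames = list(NULL, c("geneA", "geneB", "geneC")))
groups <- rep(c("control", "treated"), each = 3)

# Form (2): the single matrix `expr` together with `groups` as labels.
# Form (1): a list of two matrices, one per group, built from the same data.
x_list <- list(expr[groups == "control", , drop = FALSE],
               expr[groups == "treated", , drop = FALSE])

# Each element keeps samples in rows and genes in columns.
sapply(x_list, dim)
```

Either x_list alone, or expr with group_labels = groups, would describe the same two-group data set.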

pathway_list

A single vector or list of vectors containing gene names to indicate pathway membership. The vectors are used to subset the columns of the matrices in x. A pathway list can be obtained using get_reactome_pathways. If NULL, then the entire expression dataset is analyzed as a single network (this approach is not recommended unless there are only a small number of genes).

group_labels

If x is a single matrix or data frame, group_labels must be specified to label each row. group_labels is a matrix with each row corresponding to a row in x. This matrix may either (1) have a single column containing the group label for each observation, or (2) have one column per group, with values in [0, 1] giving the probability that the observation in that row belongs to each group. In the latter case, if the rows do not sum to 1, then each entry will be divided by its row sum.
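The row-sum rescaling described for probabilistic labels can be sketched in base R. The matrix group_probs and its values are hypothetical, and the snippet only mirrors the documented behavior; it is not the package's internal code.

```r
# Hypothetical probabilistic labels: 3 observations (rows), 2 groups (columns).
group_probs <- matrix(c(2, 2,
                        1, 3,
                        4, 0),
                      ncol = 2, byrow = TRUE,
                      dimnames = list(NULL, c("group1", "group2")))

# Divide each entry by its row sum so that every row sums to 1.
normalized <- group_probs / rowSums(group_probs)
normalized
```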

network_inference

A function used to infer the pathway network. It should take in an n by p matrix and return a p by p matrix of association scores. (Built-in options include: run_aracne, run_bc3net, run_c3net, run_clr, run_corr, run_dwlasso, run_genie3, run_glasso, run_mrnet, run_pcor, and run_silencer.) Defaults to run_pcor for partial correlations.
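A user-defined inference function only needs to follow the contract stated above: take an n by p matrix and return a p by p matrix of association scores. The sketch below uses absolute Pearson correlation; the name run_abs_cor and the choice to zero the diagonal are assumptions made for illustration, not requirements documented by the package.

```r
# Hypothetical custom inference function: n-by-p matrix in, p-by-p scores out.
run_abs_cor <- function(x, ...) {
  scores <- abs(stats::cor(x, ...))  # absolute pairwise correlations
  diag(scores) <- 0                  # ignore self-associations (an assumption)
  scores
}

set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)  # 10 samples, 4 genes
scores <- run_abs_cor(x)
dim(scores)
```

A function of this shape could then be supplied through the network_inference argument.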

n_perm

The number of random permutations to perform during permutation testing. If n_perm == 1, the permutation tests are not performed. If n_perm is larger than the number of possible permutations, n_perm is set to that number and a warning is issued.
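The cap on n_perm can be illustrated with a small calculation. This sketch assumes that the number of possible permutations is the number of ways to split the pooled samples into two groups of the observed sizes; the package may count them differently, so the figures are indicative only.

```r
# Assumed group sizes (hypothetical values).
n1 <- 4
n2 <- 4

# Assumption: distinct relabelings = ways to choose which samples form group 1.
n_possible <- choose(n1 + n2, n1)

# A requested n_perm above that count would be reduced to it.
n_perm <- 100
n_perm_used <- min(n_perm, n_possible)
c(n_possible = n_possible, n_perm_used = n_perm_used)
```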

lp

The lp value used to compute differential connectivity scores. (Note: If a vector is provided, then the results are returned as a list of dnapath_list objects, one result for each value of lp. This option is available so that network inference methods only need to be run once for each pathway when multiple values of lp are being considered. This may be useful when conducting simulation studies.)
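To show why several lp values can share one round of network inference, the sketch below computes a generic l_p-norm summary of the differences between two sets of association scores. The function lp_score and the example scores are hypothetical, and the package's actual score definition, including any normalization, may differ.

```r
# Hypothetical helper: l_p norm of the absolute differences in edge scores.
lp_score <- function(s1, s2, lp) {
  d <- abs(s1 - s2)
  sum(d^lp)^(1 / lp)
}

# Association scores for three edges in each group (made-up values).
s1 <- c(0.5, 0.1, 0.0)
s2 <- c(0.2, 0.1, 0.4)

# The same inferred scores are reused for every lp value.
scores_by_lp <- sapply(c(1, 2), function(p) lp_score(s1, s2, p))
scores_by_lp
```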

seed

(Optional) Passed to set.seed prior to the permutation test for each pathway. This allows results for individual pathways to be easily reproduced.
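The effect of reseeding before a permutation step can be seen with base R alone. Here sample() stands in for the permutation of sample labels; the variable names are illustrative.

```r
seed <- 123

# Reseeding before each draw yields the same permutation both times.
set.seed(seed)
perm_a <- sample(10)
set.seed(seed)
perm_b <- sample(10)

identical(perm_a, perm_b)
```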

verbose

Set to TRUE to turn on messages.

mc.cores

Used in mclapply to run the differential network analysis in parallel across pathways. Must be set to 1 if on a Windows machine.
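A portable way to choose a core count is to check the platform first. The sketch below applies parallel::mclapply to a toy list of pathways; the pathway contents and the choice of 2 cores on non-Windows systems are arbitrary assumptions.

```r
library(parallel)

# mclapply relies on forking, so fall back to a single core on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Toy pathway list (hypothetical gene names).
pathways <- list(p1 = c("a", "b"), p2 = c("b", "c", "d"))

# Process each pathway independently, here by counting its genes.
sizes <- mclapply(pathways, length, mc.cores = cores)
unlist(sizes)
```

The resulting value of cores could be passed as the mc.cores argument.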

...

Additional arguments are passed into the network inference function.

Value

A 'dnapath_list' or 'dnapath' object containing results for each pathway in pathway_list.

References

Grimes T, Potter SS, Datta S (2019). “Integrating Gene Regulatory Pathways into Differential Network Analysis of Gene Expression Data.” Scientific Reports, 9(1), 5479.

See Also

filter_pathways, summary.dnapath_list, subset.dnapath_list, sort.dnapath_list, plot.dnapath, rename_genes

Examples

data(meso)
data(p53_pathways)
set.seed(0)
results <- dnapath(x = meso$gene_expression, pathway_list = p53_pathways,
                   group_labels = meso$groups, n_perm = 10)
results
summary(results) # Summary over all pathways in the pathway list.
# Remove results for pathways with p-values above 0.2.
top_results <- filter_pathways(results, 0.2)
# Sort the top results by the pathway DC score.
top_results <- sort(top_results, by = "dc_score")
top_results
summary(top_results[[1]])  # Summary of pathway 1.
plot(results[[1]]) # Plot of the differential network for pathway 1.

# Use ... to adjust arguments in the network inference function.
# For example, using run_corr() with method = "spearman":
results <- dnapath(x = meso$gene_expression, pathway_list = p53_pathways,
                   group_labels = meso$groups, n_perm = 10,
                   network_inference = run_corr,
                   method = "spearman")
results

[Package dnapath version 0.7.4 Index]