bdgraph.mpl {BDgraph}

R Documentation

Search algorithm in graphical models using marginal pseudo-likelihood

Description

This function provides several sampling algorithms for Bayesian model determination in undirected graphical models based on marginal pseudo-likelihood. To speed up the computations, the birth-death MCMC sampling algorithms are implemented in parallel using OpenMP in C++.

Usage

bdgraph.mpl( data, n = NULL, method = "ggm", transfer = TRUE, 
             algorithm = "bdmcmc", iter = 5000, burnin = iter / 2, 
             g.prior = 0.2, g.start = "empty", 
             jump = NULL, alpha = 0.5, save = FALSE, 
             cores = NULL, operator = "or", verbose = TRUE )

Arguments

`data`	there are two options: (1) an (`n \times p`) `matrix` or a `data.frame` containing the data, (2) a (`p \times p`) covariance matrix `S=X'X`, where `X` is the data matrix (`n` is the sample size and `p` is the number of variables). It can also be an object of class "`sim`", from function `bdgraph.sim`. The type of input matrix is identified automatically by checking its symmetry.
`n`	number of observations. It is required when `data` is a covariance matrix.
`method`	character with three options "`ggm`" (default), "`dgm`" and "`dgm-binary`". Option "`ggm`" is for Gaussian graphical models, based on the Gaussianity assumption. Option "`dgm`" is for discrete graphical models for count data. Option "`dgm-binary`" is for discrete graphical models for binary data.
`transfer`	logical; used only for count data, that is, when `method` = "`dgm`" or `method` = "`dgm-binary`".
`algorithm`	character with three options "`bdmcmc`" (default), "`rjmcmc`" and "`hc`". Option "`bdmcmc`" is based on the birth-death MCMC algorithm. Option "`rjmcmc`" is based on the reversible jump MCMC algorithm. Option "`hc`" is based on the hill-climbing algorithm; this algorithm is only for count data, that is, when `method` = "`dgm`" or `method` = "`dgm-binary`".
`iter`	number of iterations for the sampling algorithm.
`burnin`	number of burn-in iterations for the sampling algorithm.
`g.prior`	determines the prior distribution of each edge in the graph. There are two options: a single value between `0` and `1` (e.g. `0.5` as a noninformative prior) or a (`p \times p`) matrix with elements between `0` and `1`.
`g.start`	starting point of the graph. It can be a (`p \times p`) matrix, "`empty`" (default), or "`full`". Option "`empty`" means the initial graph is an empty graph and "`full`" means a full graph. It can also be an object with `S3` class "`bdgraph`" of `R` package `BDgraph` or with class `"ssgraph"` as returned by `ssgraph::ssgraph()`; this option can be used to continue the sampling algorithm from the last state of a previous run (see examples).
`jump`	only for the BDMCMC algorithm (`algorithm` = "`bdmcmc`"). It controls how many links are updated simultaneously in each graph update of the BDMCMC algorithm.
`alpha`	value of the hyperparameter of the Dirichlet distribution, which is used as a prior distribution.
`save`	logical: if FALSE (default), the adjacency matrices are NOT saved. If TRUE, the adjacency matrices after burn-in are saved.
`cores`	number of cores to use for parallel execution. Setting `cores` = "`all`" uses all CPU cores for parallel execution.
`operator`	character with two options "`or`" (default) and "`and`". It applies to the hill-climbing algorithm.
`verbose`	logical: if TRUE (default), report/print the MCMC running time.
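As a minimal sketch of the two `data` input forms and of a matrix-valued `g.prior`, the base R code below builds hypothetical inputs; the sizes, the simulated values and the names `X`, `S` and `g_prior` are assumptions made for illustration. The call to `bdgraph.mpl` at the end runs only if `BDgraph` is installed.

```r
# Sketch only: hypothetical inputs built with base R.
set.seed(1)
n <- 50   # sample size (assumed for illustration)
p <- 4    # number of variables (assumed for illustration)

# Option (1): an n x p data matrix
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Option (2): the p x p matrix S = X'X; n must then be passed separately
S <- crossprod(X)            # same as t(X) %*% X
stopifnot(isSymmetric(S))    # symmetry is what marks the input as option (2)

# A p x p matrix of prior edge probabilities instead of a single value
g_prior <- matrix(0.2, nrow = p, ncol = p)
g_prior[1, 2] <- g_prior[2, 1] <- 0.8   # favour the link between variables 1 and 2

# The call itself runs only if BDgraph is installed
if (requireNamespace("BDgraph", quietly = TRUE)) {
  fit <- BDgraph::bdgraph.mpl(data = S, n = n, method = "ggm",
                              g.prior = g_prior, iter = 500, verbose = FALSE)
}
```

Passing `X` directly (without `n`) should be the more common choice; the covariance form is mainly useful when only `S` and the sample size are available.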

Value

An object with S3 class "bdgraph" is returned:

`p_links`	upper triangular matrix containing the estimated posterior probabilities of all possible links.

When `save = TRUE`, the following components are returned:

`sample_graphs`	vector of strings which includes the adjacency matrices of visited graphs after burn-in.
`graph_weights`	vector which includes the waiting times of visited graphs after burn-in.
`all_graphs`	vector which includes the identity of the adjacency matrices for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.
`all_weights`	vector which includes the waiting times for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.
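The returned components can be post-processed with base R. The sketch below uses made-up values in place of a real fit: the matrix `p_links`, the vector `graph_weights` and the 0.5 cut-off are all assumptions for illustration, not output of `bdgraph.mpl`. It thresholds the link probabilities to obtain an adjacency matrix and, for the BDMCMC case, normalises the waiting times so that they can be read as relative weights of the visited graphs.

```r
# Hypothetical upper triangular matrix of posterior link probabilities (p = 4)
p <- 4
p_links <- matrix(0, nrow = p, ncol = p)
p_links[upper.tri(p_links)] <- c(0.9, 0.1, 0.7, 0.2, 0.6, 0.05)

# Keep links whose estimated probability exceeds an assumed cut-off of 0.5
selected <- (p_links > 0.5) * 1

# Mirror the upper triangle to get a symmetric adjacency matrix
adjacency <- selected + t(selected)

# Hypothetical waiting times of three visited graphs
graph_weights <- c(12.5, 30.0, 7.5)

# Normalised waiting times, usable as estimated graph probabilities
post_prob <- graph_weights / sum(graph_weights)
best <- which.max(post_prob)   # index of the graph with the largest weight
```

With these made-up numbers, three links are selected and the second graph carries the largest weight.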

Author(s)

Reza Mohammadi a.mohammadi@uva.nl, Adrian Dobra, and Johan Pensar

References

Dobra, A. and Mohammadi, R. (2018). Loglinear Model Selection and Human Mobility, Annals of Applied Statistics, 12(2):815-845, doi:10.1214/18-AOAS1164

Mohammadi, A. and Wit, E. C. (2015). Bayesian Structure Learning in Sparse Gaussian Graphical Models, Bayesian Analysis, 10(1):109-138, doi:10.1214/14-BA889

Mohammadi, A. and Dobra, A. (2017). The R Package BDgraph for Bayesian Structure Learning in Graphical Models, ISBA Bulletin, 24(4):11-16

Pensar, J. et al. (2017). Marginal pseudo-likelihood learning of discrete Markov network structures, Bayesian Analysis, 12(4):1195-1215, doi:10.1214/16-BA1032

Mohammadi, R. and Wit, E. C. (2019). BDgraph: An R Package for Bayesian Structure Learning in Graphical Models, Journal of Statistical Software, 89(3):1-30, doi:10.18637/jss.v089.i03

Examples

# Generating multivariate normal data from a 'random' graph
data.sim <- bdgraph.sim( n = 70, p = 5, size = 7, vis = TRUE )
   
bdgraph.obj <- bdgraph.mpl( data = data.sim, iter = 500 )
  
summary( bdgraph.obj )
   
# To compare the result with true graph
compare( bdgraph.obj, data.sim, main = c( "Target", "BDgraph" ) )
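A further sketch, under the assumption that `BDgraph` is installed, shows a run with `save = TRUE` so that the visited graphs and their waiting times are kept. The object names `fit` and `w` are illustrative, and the code is skipped entirely when the package is not available.

```r
# Run only when BDgraph is available
has_bdgraph <- requireNamespace("BDgraph", quietly = TRUE)

if (has_bdgraph) {
  # Simulate data as in the example above, without plotting
  data.sim <- BDgraph::bdgraph.sim(n = 70, p = 5, size = 7, vis = FALSE)

  # Keep the adjacency matrices visited after burn-in
  fit <- BDgraph::bdgraph.mpl(data = data.sim, iter = 500,
                              save = TRUE, verbose = FALSE)

  # Normalised waiting times of the visited graphs, largest first
  w <- fit$graph_weights
  print(head(sort(w / sum(w), decreasing = TRUE)))
}
```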

[Package BDgraph version 2.72 Index]