bdgraph.mpl {BDgraph}    R Documentation

Search algorithm in graphical models using marginal pseudo-likelihood

Description

This function provides several sampling algorithms for Bayesian model determination in undirected graphical models based on marginal pseudo-likelihood. To speed up the computations, the birth-death MCMC sampling algorithms are implemented in parallel using OpenMP in C++.

Usage

bdgraph.mpl( data, n = NULL, method = "ggm", transfer = TRUE, 
             algorithm = "bdmcmc", iter = 5000, burnin = iter / 2, 
             g.prior = 0.2, g.start = "empty", 
             jump = NULL, alpha = 0.5, save = FALSE, 
             cores = NULL, operator = "or", verbose = TRUE )

Arguments

data

there are two options: (1) an (n × p) matrix or a data.frame corresponding to the data, (2) a (p × p) covariance matrix S = X'X, where X is the data matrix (n is the sample size and p is the number of variables). It can also be an object of class "sim" from the function bdgraph.sim. The type of input matrix is identified automatically by checking its symmetry.
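As a minimal sketch of the two input forms, the base-R code below builds a hypothetical data matrix and the corresponding matrix S = X'X. The names X and S and the simulated values are illustrative only, and the symmetry check shows the idea rather than the package's internal code.

```r
# Hypothetical data: n = 50 observations on p = 4 variables
set.seed(1)
n <- 50
p <- 4
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Option (2): the (p x p) matrix S = X'X
S <- crossprod(X)   # equivalent to t(X) %*% X

# A symmetric (p x p) input can be told apart from an (n x p) data matrix
isSymmetric(S)      # TRUE
isSymmetric(X)      # FALSE, because X is not square

# Assumed usage (requires the BDgraph package):
# fit_data <- bdgraph.mpl(data = X, iter = 500)
# fit_cov  <- bdgraph.mpl(data = S, n = n, iter = 500)
```

When S is supplied, the sample size cannot be recovered from the matrix itself, which is why n has to be passed explicitly.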

n

number of observations. It is required if data is a covariance matrix.

method

character with three options: "ggm" (default), "dgm", and "dgm-binary". Option "ggm" is for Gaussian graphical models, based on the Gaussianity assumption. Option "dgm" is for discrete graphical models for count data. Option "dgm-binary" is for discrete graphical models for binary data.

transfer

used only for count data, that is, when method = "dgm" or method = "dgm-binary".

algorithm

character with three options: "bdmcmc" (default), "rjmcmc", and "hc". Option "bdmcmc" is based on the birth-death MCMC algorithm. Option "rjmcmc" is based on the reversible jump MCMC algorithm. Option "hc" is based on the hill-climbing algorithm; this algorithm is only for count data, that is, when method = "dgm" or method = "dgm-binary".

iter

number of iterations for the sampling algorithm.

burnin

number of burn-in iterations for the sampling algorithm.

g.prior

determines the prior distribution of each edge in the graph. There are two options: a single value between 0 and 1 (e.g. 0.5 as a noninformative prior) or a (p × p) matrix with elements between 0 and 1.
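A short sketch of the matrix form of g.prior, using base R only. The object names and the choice of which edge to favour are hypothetical; the matrix is kept symmetric here as a simple convention for an undirected graph.

```r
p <- 5

# Single value: the same prior inclusion probability for every edge
g_prior_scalar <- 0.2

# Matrix form: 0.2 for every edge, with a higher prior for the edge
# between variables 1 and 2 (an arbitrary illustrative choice)
g_prior_mat <- matrix(0.2, nrow = p, ncol = p)
g_prior_mat[1, 2] <- 0.8
g_prior_mat[2, 1] <- 0.8

# Assumed usage (requires the BDgraph package and a data set `data`
# with p columns):
# fit <- bdgraph.mpl(data = data, g.prior = g_prior_mat, iter = 500)
```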

g.start

starting point of the graph. It can be a (p × p) matrix, "empty" (default), or "full". Option "empty" means the initial graph is an empty graph and "full" means a full graph. It can also be an object with S3 class "bdgraph" from the R package BDgraph or class "ssgraph" from the R package ssgraph (see ssgraph::ssgraph()); this option can be used to continue the sampling algorithm from the last state of a previous run (see examples).
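The sketch below illustrates two of these choices: a hand-built starting matrix and a restart from a previous fit. The chain graph, the object names, and the simulation settings are illustrative assumptions; the BDgraph calls run only if the package is installed.

```r
p <- 5

# A (p x p) 0/1 adjacency matrix as the starting graph: the chain 1-2-3-4-5
g_start <- matrix(0, nrow = p, ncol = p)
for (i in 1:(p - 1)) {
  g_start[i, i + 1] <- 1
  g_start[i + 1, i] <- 1
}

if (requireNamespace("BDgraph", quietly = TRUE)) {
  data.sim <- BDgraph::bdgraph.sim(n = 70, p = p, size = 7)

  # First run, started from the chain graph
  fit1 <- BDgraph::bdgraph.mpl(data = data.sim, g.start = g_start, iter = 500)

  # Second run, started from the "bdgraph" object of the first run
  fit2 <- BDgraph::bdgraph.mpl(data = data.sim, g.start = fit1, iter = 500)
}
```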

jump

used only with the BDMCMC algorithm (algorithm = "bdmcmc"). It allows multiple links to be updated simultaneously in each graph update of the BDMCMC algorithm.

alpha

value of the hyperparameter of the Dirichlet prior distribution.

save

logical: if FALSE (default), the adjacency matrices are NOT saved. If TRUE, the adjacency matrices after burn-in are saved.

cores

number of cores to use for parallel execution. Setting cores = "all" uses all available CPU cores.

operator

character with two options: "or" (default) and "and". It is used by the hill-climbing algorithm.

verbose

logical: if TRUE (default), report/print the MCMC running time.

Value

An object with S3 class "bdgraph" is returned:

p_links

upper triangular matrix containing the estimated posterior probabilities of all possible links.

If save = TRUE, the following components are also returned:

sample_graphs

vector of strings which includes the adjacency matrices of visited graphs after burn-in.

graph_weights

vector which includes the waiting times of visited graphs after burn-in.

all_graphs

vector which includes the identity of the adjacency matrices for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.

all_weights

vector which includes the waiting times for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.
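As a hedged illustration of how the saved components might be used, the helper below normalises graph_weights to obtain estimated posterior probabilities for the visited graphs and sorts them. The function graph_probs and the toy object are hypothetical and not part of the package; the toy strings only stand in for encoded adjacency matrices.

```r
# Hypothetical helper: estimated posterior probabilities of visited graphs,
# obtained by normalising the waiting times in graph_weights
graph_probs <- function(fit) {
  w <- fit$graph_weights
  probs <- w / sum(w)
  ord <- order(probs, decreasing = TRUE)
  data.frame(graph = fit$sample_graphs[ord],
             prob  = probs[ord],
             stringsAsFactors = FALSE)
}

# Toy stand-in for a fitted object (illustrative values only)
toy <- list(sample_graphs = c("000", "100", "110"),
            graph_weights = c(10, 30, 60))
toy_probs <- graph_probs(toy)

# With the package installed, the same helper can be applied to a real fit
if (requireNamespace("BDgraph", quietly = TRUE)) {
  data.sim <- BDgraph::bdgraph.sim(n = 70, p = 5, size = 7)
  fit <- BDgraph::bdgraph.mpl(data = data.sim, iter = 500, save = TRUE)
  head(graph_probs(fit))
}
```

The first row of the result is the most frequently visited graph in terms of total waiting time.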

Author(s)

Reza Mohammadi a.mohammadi@uva.nl, Adrian Dobra, and Johan Pensar

References

Dobra, A. and Mohammadi, R. (2018). Loglinear Model Selection and Human Mobility, Annals of Applied Statistics, 12(2):815-845, doi:10.1214/18-AOAS1164

Mohammadi, A. and Wit, E. C. (2015). Bayesian Structure Learning in Sparse Gaussian Graphical Models, Bayesian Analysis, 10(1):109-138, doi:10.1214/14-BA889

Mohammadi, A. and Dobra, A. (2017). The R Package BDgraph for Bayesian Structure Learning in Graphical Models, ISBA Bulletin, 24(4):11-16

Pensar, J. et al. (2017). Marginal pseudo-likelihood learning of discrete Markov network structures, Bayesian Analysis, 12(4):1195-1215, doi:10.1214/16-BA1032

Mohammadi, R. and Wit, E. C. (2019). BDgraph: An R Package for Bayesian Structure Learning in Graphical Models, Journal of Statistical Software, 89(3):1-30, doi:10.18637/jss.v089.i03

See Also

bdgraph, bdgraph.dw, bdgraph.sim, summary.bdgraph, compare

Examples

# Generating multivariate normal data from a 'random' graph
data.sim <- bdgraph.sim( n = 70, p = 5, size = 7, vis = TRUE )
   
bdgraph.obj <- bdgraph.mpl( data = data.sim, iter = 500 )
  
summary( bdgraph.obj )
   
# To compare the result with true graph
compare( bdgraph.obj, data.sim, main = c( "Target", "BDgraph" ) )

[Package BDgraph version 2.72 Index]