bdgraph.mpl {BDgraph}    R Documentation

Search algorithm in graphical models using marginal pseudo-likelihood

Description

This function implements several sampling algorithms for Bayesian model determination in undirected graphical models based on marginal pseudo-likelihood. To speed up the computations, the birth-death MCMC sampling algorithms are implemented in parallel using OpenMP in C++.

Usage

bdgraph.mpl( data, n = NULL, method = "ggm", transfer = TRUE, 
             algorithm = "bdmcmc", iter = 5000, burnin = iter / 2, 
             g.prior = 0.2, g.start = "empty", 
             jump = NULL, alpha = 0.5, save = FALSE, 
             cores = NULL, operator = "or", verbose = TRUE )

Arguments

data

there are two options: (1) an (n \times p) matrix or a data.frame corresponding to the data, (2) a (p \times p) covariance matrix S = X'X, where X is the data matrix (n is the sample size and p is the number of variables). It can also be an object of class "sim" from function bdgraph.sim. The type of input matrix is identified automatically by checking its symmetry.

n

number of observations. It is needed if "data" is a covariance matrix.

method

character with three options: "ggm" (default), "dgm", and "dgm-binary". Option "ggm" is for Gaussian graphical models, based on the Gaussianity assumption. Option "dgm" is for discrete graphical models for count data. Option "dgm-binary" is for discrete graphical models for binary data.

transfer

used only for count data, i.e. when method = "dgm" or method = "dgm-binary".

algorithm

character with three options: "bdmcmc" (default), "rjmcmc", and "hc". Option "bdmcmc" is based on the birth-death MCMC algorithm. Option "rjmcmc" is based on the reversible jump MCMC algorithm. Option "hc" is based on the hill-climbing algorithm; this algorithm is only for count data, i.e. when method = "dgm" or method = "dgm-binary".

iter

number of iterations for the sampling algorithm.

burnin

number of burn-in iterations for the sampling algorithm.

g.prior

determines the prior distribution of each edge in the graph. There are two options: a single value between 0 and 1 (e.g. 0.5 as a noninformative prior) or a (p \times p) matrix with elements between 0 and 1.

g.start

the starting point of the graph. It can be a (p \times p) matrix, "empty" (default), or "full". Option "empty" means the initial graph is an empty graph and "full" means a full graph. It can also be an object with S3 class "bdgraph" of the R package BDgraph, or with class "ssgraph" as returned by ssgraph::ssgraph() of the R package ssgraph; this option can be used to continue the sampling algorithm from the last state of a previous run (see examples).

jump

used only by the BDMCMC algorithm (algorithm = "bdmcmc"). It sets the number of links that are updated simultaneously at each step of the BDMCMC algorithm.

alpha

value of the hyperparameter of the Dirichlet distribution, which is used as a prior distribution.

save

logical: if FALSE (default), the adjacency matrices are NOT saved. If TRUE, the adjacency matrices after burn-in are saved.

cores

number of cores to use for parallel execution. Setting cores = "all" uses all available CPU cores.

operator

character with two options "or" (default) and "and". It is used only by the hill-climbing algorithm (algorithm = "hc").

verbose

logical: if TRUE (default), report/print the MCMC running time.
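
As an illustration of the "data", "n" and "g.prior" arguments described above, the following is a minimal sketch, not a definitive recipe. The names X, S and g_prior and the simulated values are assumptions made for this example only. It builds a covariance-type input S = X'X, which is symmetric and therefore needs n to be supplied, together with an edge-specific (p \times p) prior matrix. The call to bdgraph.mpl runs only if the BDgraph package is installed.

```r
set.seed(1)
n <- 70   # sample size
p <- 5    # number of variables

# Hypothetical data matrix X (n x p) with column-centered entries
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X <- sweep(X, 2, colMeans(X))

# Covariance-type input S = X'X; crossprod(X) computes t(X) %*% X
S <- crossprod(X)

# The input type is detected by symmetry, so S must be symmetric
stopifnot(isSymmetric(S))

# Edge-specific prior: 0.2 for every link, 0.8 for the link between nodes 1 and 2
g_prior <- matrix(0.2, nrow = p, ncol = p)
g_prior[1, 2] <- g_prior[2, 1] <- 0.8

# Run the sampler only when the package is available
if (requireNamespace("BDgraph", quietly = TRUE)) {
  fit <- BDgraph::bdgraph.mpl(data = S, n = n, method = "ggm",
                              g.prior = g_prior, iter = 500,
                              verbose = FALSE)
}
```

Passing the raw matrix X instead of S would not require n, since the sample size can then be read from the number of rows.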

Value

An object with S3 class "bdgraph" is returned:

p_links

upper triangular matrix containing the estimated posterior probabilities of all possible links.

If save = TRUE, the following components are returned:

sample_graphs

vector of strings which includes the adjacency matrices of visited graphs after burn-in.

graph_weights

vector which includes the waiting times of visited graphs after burn-in.

all_graphs

vector which includes the identifiers of the adjacency matrices for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.

all_weights

vector which includes the waiting times for all iterations after burn-in. It is needed for monitoring the convergence of the BD-MCMC algorithm.
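
The returned components can be post-processed with base R. The sketch below uses small hypothetical stand-ins for p_links and graph_weights (the numbers are invented for illustration and do not come from a real run). It assumes a 0.5 cut-off for selecting links and assumes that normalized waiting times are read as estimated posterior probabilities of the visited graphs; both are choices made for this example.

```r
# Hypothetical stand-in for the p_links component (upper triangular, p = 4)
p_links <- matrix(0, nrow = 4, ncol = 4)
p_links[upper.tri(p_links)] <- c(0.9, 0.1, 0.7, 0.2, 0.6, 0.05)

# Keep links whose estimated posterior probability exceeds the cut-off
cut <- 0.5
selected <- (p_links > cut) * 1

# Symmetric adjacency matrix of the selected undirected graph
adj <- selected + t(selected)

# Hypothetical stand-in for graph_weights (waiting times of visited graphs)
graph_weights <- c(12.5, 30, 7.5)

# Normalized waiting times, read here as estimated posterior probabilities
post_prob <- graph_weights / sum(graph_weights)

# Index of the visited graph with the largest estimated probability
best <- which.max(post_prob)
```

With a real fitted object, the stand-ins would be replaced by the corresponding components of the object returned with save = TRUE.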

Author(s)

Reza Mohammadi a.mohammadi@uva.nl, Adrian Dobra, and Johan Pensar

References

Dobra, A. and Mohammadi, R. (2018). Loglinear Model Selection and Human Mobility, Annals of Applied Statistics, 12(2):815-845, doi:10.1214/18-AOAS1164

Mohammadi, A. and Wit, E. C. (2015). Bayesian Structure Learning in Sparse Gaussian Graphical Models, Bayesian Analysis, 10(1):109-138, doi:10.1214/14-BA889

Mohammadi, A. and Dobra, A. (2017). The R Package BDgraph for Bayesian Structure Learning in Graphical Models, ISBA Bulletin, 24(4):11-16

Pensar, J. et al. (2017). Marginal pseudo-likelihood learning of discrete Markov network structures, Bayesian Analysis, 12(4):1195-1215, doi:10.1214/16-BA1032

Mohammadi, R. and Wit, E. C. (2019). BDgraph: An R Package for Bayesian Structure Learning in Graphical Models, Journal of Statistical Software, 89(3):1-30, doi:10.18637/jss.v089.i03

See Also

bdgraph, bdgraph.dw, bdgraph.sim, summary.bdgraph, compare

Examples

# Generating multivariate normal data from a 'random' graph
data.sim <- bdgraph.sim( n = 70, p = 5, size = 7, vis = TRUE )
   
bdgraph.obj <- bdgraph.mpl( data = data.sim, iter = 500 )
  
summary( bdgraph.obj )
   
# To compare the result with true graph
compare( bdgraph.obj, data.sim, main = c( "Target", "BDgraph" ) )
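
The g.start argument accepts an object of class "bdgraph", so a run can be continued from the last state of a previous one. The following is a hedged sketch of that usage; the object names first.run and second.run are chosen for this example, the settings are illustrative, and the code runs only if the BDgraph package is installed.

```r
if (requireNamespace("BDgraph", quietly = TRUE)) {
  set.seed(10)

  # Simulate data as in the example above, without plotting
  data.sim <- BDgraph::bdgraph.sim(n = 70, p = 5, size = 7, vis = FALSE)

  # First run, starting from the default empty graph
  first.run <- BDgraph::bdgraph.mpl(data = data.sim, iter = 500,
                                    verbose = FALSE)

  # Second run, started from the state reached by the first run
  second.run <- BDgraph::bdgraph.mpl(data = data.sim, iter = 500,
                                     g.start = first.run,
                                     verbose = FALSE)

  # Estimated posterior link probabilities from the continued run
  print(round(second.run$p_links, 2))
}
```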

[Package BDgraph version 2.72 Index]