bdgraph.sim {BDgraph}    R Documentation

Graph data simulation

Description

Simulates multivariate distributions with different types of underlying graph structures, including "random", "cluster", "scale-free", "lattice", "hub", "star", "circle", "AR(1)", and "AR(2)". Based on the underlying graph structure, it generates different types of multivariate data, including multivariate Gaussian, non-Gaussian, count, mixed, binary, or discrete Weibull data. With the default n = 0, the function simulates only the graph and generates no data.

Usage

bdgraph.sim( p = 10, graph = "random", n = 0, type = "Gaussian", prob = 0.2, 
             size = NULL, mean = 0, class = NULL, cut = 4, b = 3,
             D = diag( p ), K = NULL, sigma = NULL, 
             q = exp(-1), beta = 1, vis = FALSE )

Arguments

p

The number of variables (nodes).

graph

The graph structure with options "random", "cluster", "scale-free", "lattice", "hub", "star", "circle", "AR(1)", and "AR(2)". It can also be an adjacency matrix corresponding to a graph structure (an upper triangular matrix in which g_{ij}=1 if there is a link between nodes i and j, otherwise g_{ij}=0).
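As an illustrative sketch (not taken from the package sources), an upper triangular adjacency matrix of the kind described above can be built in base R. The object name adj and the chosen links are hypothetical.

```r
# Hypothetical 4-node graph with links 1-2, 1-3 and 3-4.
p <- 4
adj <- matrix(0, nrow = p, ncol = p)
adj[1, 2] <- 1
adj[1, 3] <- 1
adj[3, 4] <- 1

# Links are stored only above the diagonal, so the matrix is upper triangular.
is_upper <- all(adj[lower.tri(adj, diag = TRUE)] == 0)

# Each entry equal to 1 is one link, so the sum is the graph size.
n_links <- sum(adj)
```

With BDgraph installed, such a matrix should be usable as the graph argument, for example bdgraph.sim( p = 4, graph = adj, n = 20 ).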

n

The number of samples required. Note that for the case n = 0, only the graph is generated.

type

Type of data with six options "Gaussian" (default), "non-Gaussian", "count", "mixed", "binary", and "dw". For the option "Gaussian", data are generated from a multivariate normal distribution. For the option "non-Gaussian", data are transformed from a multivariate normal distribution to a continuous multivariate non-Gaussian distribution. For the option "count", data are transformed from a multivariate normal distribution to multivariate count data. For the option "mixed", data are transformed from a multivariate normal distribution to a mixture of 'count', 'ordinal', 'non-Gaussian', 'binary' and 'Gaussian', respectively. For the option "binary", data are generated directly from the joint distribution; in this case p must be less than 17. For the option "dw", data are transformed from a multivariate normal distribution to the discrete Weibull distribution with parameters q and beta.
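To make the idea of transforming multivariate normal data concrete, the following base-R sketch discretizes Gaussian columns into a small number of ordered categories using quantile cut points. It is only an assumed illustration of this kind of transformation, not the code used inside bdgraph.sim; the helper to_categories and the choice of equally spaced quantiles are hypothetical.

```r
set.seed(1)

# Stand-in for data drawn from a multivariate normal distribution
# (independent columns here, purely for illustration).
z <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3)

# Hypothetical helper: map one continuous column to integer
# categories 0, 1, ..., n_cat - 1 using equally spaced quantiles.
to_categories <- function(x, n_cat = 4) {
  probs  <- seq(0, 1, length.out = n_cat + 1)
  breaks <- quantile(x, probs = probs)
  breaks[1] <- -Inf             # make sure the minimum is included
  breaks[n_cat + 1] <- Inf      # make sure the maximum is included
  as.integer(cut(x, breaks = breaks)) - 1L
}

# Apply the transformation column by column.
counts <- apply(z, 2, to_categories, n_cat = 4)
```

Because the transformation is monotone within each column, the ordering of observations is preserved, which is the property a copula-style transformation relies on.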

prob

If graph="random", it is the probability that a pair of nodes has a link.
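As a rough sketch of what a link probability means, the base-R code below draws each possible link independently with probability prob. This is an assumed illustration of an independent-link random graph, not the package's implementation; the object names are hypothetical.

```r
set.seed(42)
p    <- 10
prob <- 0.2

# One Bernoulli(prob) indicator for each of the p * (p - 1) / 2 node pairs,
# stored in the upper triangle of the adjacency matrix.
G <- matrix(0, nrow = p, ncol = p)
G[upper.tri(G)] <- rbinom(p * (p - 1) / 2, size = 1, prob = prob)

n_pairs        <- p * (p - 1) / 2   # number of possible links
expected_links <- prob * n_pairs    # average graph size under this model
observed_links <- sum(G)            # graph size of this particular draw
```

For p = 10 and prob = 0.2 there are 45 possible links, so about 9 links are expected on average.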

size

The number of links in the true graph (graph size).

mean

A vector specifying the means of the variables.

class

If graph="cluster", it is the number of classes.

cut

If type="count", it is the number of categories for simulating count data.

b

The degrees of freedom for the G-Wishart distribution, W_G(b, D).

D

The positive definite (p \times p) "scale" matrix for the G-Wishart distribution, W_G(b, D). The default is an identity matrix.

K

If graph="fixed", it is a positive-definite symmetric matrix specifying the true precision matrix.

sigma

If graph="fixed", it is a positive-definite symmetric matrix specifying the true covariance matrix.

q, beta

If type="dw", they are the parameters of the discrete Weibull distribution with probability mass function

p(x; q, \beta) = q^{x^{\beta}} - q^{(x+1)^{\beta}}, \quad x \in \{ 0, 1, 2, \ldots \}.
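The discrete Weibull formula above can be evaluated directly in base R. The following is a minimal sketch; the function names ddw_sketch and rdw_sketch are hypothetical and are not part of BDgraph, and the sampler uses a standard inverse-CDF argument rather than any method confirmed by the package.

```r
# Discrete Weibull probability mass function:
# p(x) = q^(x^beta) - q^((x + 1)^beta) for x = 0, 1, 2, ...
ddw_sketch <- function(x, q = exp(-1), beta = 1) {
  q^(x^beta) - q^((x + 1)^beta)
}

# Inverse-CDF sampler. Since P(X >= x) = q^(x^beta), the smallest x with
# CDF(x) >= u is ceiling((log(1 - u) / log(q))^(1 / beta)) - 1.
rdw_sketch <- function(n, q = exp(-1), beta = 1) {
  u <- runif(n)
  pmax(ceiling((log(1 - u) / log(q))^(1 / beta)) - 1, 0)
}

# The probabilities telescope, so they sum to 1 over the support.
total_mass <- sum(ddw_sketch(0:200))

set.seed(3)
draws <- rdw_sketch(1000)
```

With the default values q = exp(-1) and beta = 1, the distribution reduces to a geometric distribution with success probability 1 - exp(-1).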

vis

Visualize the true graph structure.

Value

An object with S3 class "sim" is returned:

data

Generated data as an (n \times p) matrix.

sigma

The covariance matrix of the generated data.

K

The precision matrix of the generated data.

G

The adjacency matrix corresponding to the true graph structure.
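The relationship between the returned components can be sketched in base R for the Gaussian case: the covariance matrix is the inverse of the precision matrix, and zero off-diagonal entries of the precision matrix correspond to missing links. The matrix K below is a hypothetical example, and storing G as upper triangular is a choice made for this sketch only.

```r
set.seed(7)

# Hypothetical 3 x 3 precision matrix with no link between nodes 1 and 3.
K <- matrix(c( 2, -1,  0,
              -1,  2, -1,
               0, -1,  2), nrow = 3, byrow = TRUE)

# The covariance matrix is the inverse of the precision matrix.
sigma <- solve(K)

# Nonzero off-diagonal entries of K mark the links of the graph.
G <- 1 * (abs(K) > 1e-8)
G[lower.tri(G, diag = TRUE)] <- 0   # keep the upper triangle only

# Draw n samples from N(0, sigma) using the Cholesky factor of sigma.
n <- 500
data <- matrix(rnorm(n * 3), nrow = n) %*% chol(sigma)
```

Here chol(sigma) returns the upper triangular factor R with t(R) %*% R equal to sigma, so multiplying standard normal rows by R gives rows with covariance sigma.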

Author(s)

Reza Mohammadi a.mohammadi@uva.nl, Pariya Behrouzi, Veronica Vinciotti, and Ernst Wit

References

Mohammadi, R. and Wit, E. C. (2019). BDgraph: An R Package for Bayesian Structure Learning in Graphical Models, Journal of Statistical Software, 89(3):1-30.

Mohammadi, A. and Wit, E. C. (2015). Bayesian Structure Learning in Sparse Gaussian Graphical Models, Bayesian Analysis, 10(1):109-138.

Mohammadi, A. et al. (2017). Bayesian modelling of Dupuytren disease by using Gaussian copula graphical models, Journal of the Royal Statistical Society: Series C, 66(3):629-645.

Dobra, A. and Mohammadi, R. (2018). Loglinear Model Selection and Human Mobility, Annals of Applied Statistics, 12(2):815-845.

Letac, G., Massam, H. and Mohammadi, R. (2018). The Ratio of Normalizing Constants for Bayesian Graphical Gaussian Model Selection, arXiv preprint arXiv:1706.04416v2.

Pensar, J. et al. (2017). Marginal pseudo-likelihood learning of discrete Markov network structures, Bayesian Analysis, 12(4):1195-1215.

See Also

graph.sim, bdgraph, bdgraph.mpl

Examples

## Not run: 
# Generating multivariate normal data from a 'random' graph
data.sim <- bdgraph.sim( p = 10, n = 50, prob = 0.3, vis = TRUE )
print( data.sim )
     
# Generating multivariate normal data from a 'hub' graph
data.sim <- bdgraph.sim( p = 6, n = 3, graph = "hub", vis = FALSE )
round( data.sim $ data, 2 )
     
# Generating mixed data from a 'hub' graph 
data.sim <- bdgraph.sim( p = 8, n = 10, graph = "hub", type = "mixed" )
round( data.sim $ data, 2 )

# Generating only a 'scale-free' graph (with no data) 
graph.sim <- bdgraph.sim( p = 8, graph = "scale-free" )
plot( graph.sim )
graph.sim $ G

## End(Not run)

[Package BDgraph version 2.64 Index]