R: Structure learning with an iterative order MCMC algorithm on...

iterativeMCMC {BiDAG}

R Documentation

Structure learning with an iterative order MCMC algorithm on an expanded search space

Description

This function implements an iterative search for the maximum a posteriori (MAP) DAG, by means of order MCMC (arXiv:1803.07859v3). At each iteration, the current search space is expanded by allowing each node to have up to one additional parent not already included in the search space. By default the initial search space is obtained through the PC-algorithm (using the functions skeleton and pc from the ‘pcalg’ package [Kalisch et al, 2012]). At each iteration order MCMC is employed to search for the MAP DAG. The edges in the MAP DAG are added to the initial search space to provide the search space for the next iteration. The algorithm iterates until no further score improvements can be achieved by expanding the search space. The final search space may be used for the sampling versions of orderMCMC and partitionMCMC.

Usage

iterativeMCMC(
  scorepar,
  MAP = TRUE,
  posterior = 0.5,
  softlimit = 9,
  hardlimit = 12,
  alpha = 0.05,
  gamma = 1,
  verbose = TRUE,
  chainout = FALSE,
  scoreout = FALSE,
  cpdag = FALSE,
  mergetype = "skeleton",
  iterations = NULL,
  moveprobs = NULL,
  stepsave = NULL,
  startorder = NULL,
  accum = FALSE,
  compress = TRUE,
  plus1it = NULL,
  startspace = NULL,
  blacklist = NULL,
  addspace = NULL,
  scoretable = NULL,
  alphainit = NULL
)

## S3 method for class 'iterativeMCMC'
plot(
  x,
  ...,
  main = "iterative MCMC, DAG scores",
  xlab = "MCMC step",
  ylab = "DAG logscore",
  type = "l",
  col = "blue"
)

## S3 method for class 'iterativeMCMC'
print(x, ...)

## S3 method for class 'iterativeMCMC'
summary(object, ...)

Arguments

`scorepar`	an object of class `scoreparameters`, containing the data and scoring parameters; see constructor function `scoreparameters`
`MAP`	logical, if TRUE (default) the search targets the MAP DAG (a DAG with maximum score), if FALSE at each MCMC step a DAG is sampled from the order proportionally to its score; when expanding a search space when MAP=TRUE all edges from the maximum scoring DAG are added to the new space, when MAP=FALSE only edges with posterior probability higher than defined by parameter `posterior` are added to the search space
`posterior`	logical, when `MAP` set to FALSE defines posterior probability threshold for adding the edges to the search space
`softlimit`	integer, limit on the size of parent sets beyond which adding undirected edges is restricted; below this limit edges are added to expand the parent sets based on the undirected skeleton of the MAP DAG (or from its CPDAG, depending on the parameter `mergecp`), above the limit only the directed edges are added from the MAP DAG; the limit is 9 by default
`hardlimit`	integer, limit on the size of parent sets beyond which the search space is not further expanded to prevent long runtimes; the limit is 12 by default
`alpha`	numerical significance value in `{0,1}` for the conditional independence tests in the PC-stage
`gamma`	tuning parameter which transforms the score by raising it to this power, 1 by default
`verbose`	logical, if TRUE (default) prints messages on the progress of execution
`chainout`	logical, if TRUE the saved MCMC steps are returned, FALSE by default
`scoreout`	logical, if TRUE the search space from the last plus1 iterations and the corresponding score tables are returned, FALSE by default
`cpdag`	logical, if set to TRUE the equivalence class (CPDAG) found by the PC algorithm is used as a search space, when FALSE (default) the undirected skeleton used as a search space
`mergetype`	defines which edges are added to the search space at each expansion iteration; three options are available 'dag', 'cpdag', 'skeleton'; 'skeleton' by default
`iterations`	integer, the number of MCMC steps, the default value is `3.5n^{2}\log{n}`
`moveprobs`	a numerical vector of 4 values in `{0,1}` corresponding to the probabilities of the following MCMC moves in the order space: exchanging 2 random nodes in the order exchanging 2 adjacent nodes in the order placing a single node elsewhere in the order staying still
`stepsave`	integer, thinning interval for the MCMC chain, indicating the number of steps between two output iterations, the default is `iterations`/1000
`startorder`	integer vector of length n, which will be used as the starting order in the MCMC algorithm, the default order is random
`accum`	logical, when TRUE at each search step expansion new edges are added to the current search space; when FALSE (default) the new edges are added to the starting space
`compress`	logical, if TRUE adjacency matrices representing sampled graphs will be stored as a sparse Matrix (recommended); TRUE by default
`plus1it`	(optional) integer, a number of iterations of search space expansion; by default the algorithm iterates until no score improvement can be achieved by further expanding the search space
`startspace`	(optional) a square matrix, of dimensions equal to the number of nodes, which defines the search space for the order MCMC in the form of an adjacency matrix; if NULL, the skeleton obtained from the PC-algorithm will be used; if `startspace[i,j]` equals to 1 (0) it means that the edge from node `i` to node `j` is included (excluded) from the search space; to include an edge in both directions, both `startspace[i,j]` and `startspace[j,i]` should be 1
`blacklist`	(optional) a square matrix, of dimensions equal to the number of nodes, which defines edges to exclude from the search space; if `blacklist[i,j]` equals to 1 it means that the edge from node `i` to node `j` is excluded from the search space "dag", then edges from maximum scoring DAG are added; "cpdag", then the maximum scoring DAG is first converted to the CPDAG, from which all edges are added to the search space; "skeleton", then the maximum scoring DAG is first converted to the skeleton, from which all edges are added to the search space
`addspace`	(optional) a square matrix, of dimensions equal to the number of nodes, which defines the edges, which are added at to the search space only at the first iteration of iterative seach and do not necessarily stay afterwards; defined in the form of an adjacency matrix; if `addspace[i,j]` equals to 1 (0) it means that the edge from node `i` to node `j` is included (excluded) from the search space; to include an edge in both directions, both `addspace[i,j]` and `addspace[j,i]` should be 1
`scoretable`	(optional) object of class `scorespace`. When not NULL, parameters `startspace` and `addspace` are ignored.
`alphainit`	(optional) numerical, defines alpha that is used by the PC algorithm to learn initial structure of a DBN, ignored in static case
`x`	object of class 'iterativeMCMC'
`...`	ignored
`main`	name of the graph; "iterative MCMC, DAG scores" by default
`xlab`	name of x-axis; "MCMC step"
`ylab`	name of y-axis; "DAG logscore"
`type`	type of line in the plot; "l" by default
`col`	colour of line in the plot; "blue" by default
`object`	object of class 'iterativeMCMC'

Value

Object of class iterativeMCMC, which contains log-score trace as well as adjacency matrix of the maximum scoring DAG, its score and the order score. The output can optionally include DAGs sampled in MCMC iterations and the score tables. Optional output is regulated by the parameters chainout and scoreout. See iterativeMCMC class for a detailed class structure.

Note

see also extractor functions getDAG, getTrace, getSpace, getMCMCscore.

Author(s)

Polina Suter, Jack Kuipers

References

P. Suter, J. Kuipers, G. Moffa, N.Beerenwinkel (2023) <doi:10.18637/jss.v105.i09>

Kuipers J, Super P and Moffa G (2020). Efficient Sampling and Structure Learning of Bayesian Networks. (arXiv:1803.07859v3)

Friedman N and Koller D (2003). A Bayesian approach to structure discovery in bayesian networks. Machine Learning 50, 95-125.

Kalisch M, Maechler M, Colombo D, Maathuis M and Buehlmann P (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software 47, 1-26.

Geiger D and Heckerman D (2002). Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. The Annals of Statistics 30, 1412-1440.

Kuipers J, Moffa G and Heckerman D (2014). Addendum on the scoring of Gaussian directed acyclic graphical models. The Annals of Statistics 42, 1689-1691.

Spirtes P, Glymour C and Scheines R (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press.

Examples

## Not run: 
Bostonpar<-scoreparameters("bge",Boston)
itfit<-iterativeMCMC(Bostonpar, chainout=TRUE, scoreout=TRUE)
plot(itfit)

## End(Not run)

[Package BiDAG version 2.1.4 Index]