learnBN {BiDAG} | R Documentation |
Bayesian network structure learning
Description
This function can be used to find the maximum a posteriori (MAP) DAG using stochastic search relying on MCMC schemes. Because the size of the search space grows superexponentially with the number of nodes, it
must be reduced. By default the search space is limited to the skeleton found through the PC algorithm by means of conditional independence tests
(using the functions skeleton and pc from the ‘pcalg’ package [Kalisch et al, 2012]).
It is also possible to define an arbitrary search space by inputting an adjacency matrix, for example one estimated by partial correlations or other network algorithms. The order MCMC scheme (algorithm="order")
searches for a maximum scoring order and selects a maximum scoring DAG from this order as the MAP. To avoid discovering a suboptimal graph because some of the true positive edges are absent
from the search space, the function can expand the default or input search space by allowing each node in the network to have one additional parent (plus1=TRUE).
The iterative MCMC scheme (algorithm="orderIter") allows for iterative expansions of the search space.
This is useful when the initial search space is poor in the sense that it contains only a limited number of true positive edges. Iterative expansions of the search space
address this issue, at the cost of longer runtimes because multiple consecutive MCMC chains must be run.
This function is a wrapper for the individual structure learning functions that implement each of the described algorithms; for details see orderMCMC and iterativeMCMC.
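As an illustration of supplying a custom search space, the following sketch derives an adjacency matrix from partial correlations in base R and passes it as startspace. The data set (mtcars), the cutoff of 0.2 and the variable names are assumptions made for this example only; the learnBN call is skipped when the BiDAG package is not installed.

```r
# Stand-in data (assumption): any numeric data set would do.
X <- as.matrix(mtcars)

# Partial correlations from the inverse covariance (precision) matrix:
# pcor[i, j] = -P[i, j] / sqrt(P[i, i] * P[j, j])
P <- solve(cov(X))
pcor <- -P / sqrt(outer(diag(P), diag(P)))
diag(pcor) <- 0

# Keep an undirected edge wherever the absolute partial correlation exceeds
# an arbitrary illustrative cutoff (0.2 is an assumption, not a recommendation).
cutoff <- 0.2
space <- (abs(pcor) > cutoff) * 1
dimnames(space) <- list(colnames(X), colnames(X))

# Use the matrix as the search space, if BiDAG is available.
if (requireNamespace("BiDAG", quietly = TRUE)) {
  score <- BiDAG::scoreparameters("bge", mtcars)
  fit <- BiDAG::learnBN(score, algorithm = "order", startspace = space)
  print(summary(fit))
}
```

A sparser matrix (higher cutoff) keeps the score tables small but risks excluding true edges; with plus1=TRUE each node may still gain one parent from outside this space.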
Usage
learnBN(
scorepar,
algorithm = c("order", "orderIter"),
chainout = FALSE,
scoreout = ifelse(algorithm == "orderIter", TRUE, FALSE),
alpha = 0.05,
moveprobs = NULL,
iterations = NULL,
stepsave = NULL,
gamma = 1,
verbose = FALSE,
compress = TRUE,
startspace = NULL,
blacklist = NULL,
scoretable = NULL,
startpoint = NULL,
plus1 = TRUE,
iterpar = list(softlimit = 9, mergetype = "skeleton", accum = FALSE, plus1it = NULL,
addspace = NULL, alphainit = NULL),
cpdag = FALSE,
hardlimit = 12
)
Arguments
scorepar |
an object of class |
algorithm |
MCMC scheme to be used for MAP structure learning; possible options are "order" ( |
chainout |
logical, if TRUE the saved MCMC steps are returned; FALSE by default |
scoreout |
logical, if TRUE the search space and score tables are returned; FALSE by default for "order", TRUE for "orderIter" |
alpha |
numerical significance value in |
moveprobs |
a numerical vector of 4 (for "order" and "orderIter" algorithms) or 5 values (for "partition" algorithm) representing the probabilities of the different moves in the space of
orders and partitions, respectively. The moves are described in the corresponding algorithm-specific functions |
iterations |
integer, the number of MCMC steps, the default value is |
stepsave |
integer, thinning interval for the MCMC chain, indicating the number of steps between two output iterations, the default is |
gamma |
tuning parameter which transforms the score by raising it to this power, 1 by default |
verbose |
logical, if TRUE messages about the algorithm's progress will be printed, FALSE by default |
compress |
logical, if TRUE adjacency matrices representing sampled graphs will be stored as a sparse Matrix (recommended); TRUE by default |
startspace |
(optional) a square sparse or ordinary matrix, of dimensions equal to the number of nodes, which defines the search space for the order MCMC in the form of an adjacency matrix. If NULL, the skeleton obtained from the PC-algorithm will be used. If |
blacklist |
(optional) a square sparse or ordinary matrix, of dimensions equal to the number of nodes, which defines edges to exclude from the search space. If |
scoretable |
(optional) object of class |
startpoint |
(optional) integer vector of length n (representing an order when |
plus1 |
logical, if TRUE (default) the search is performed on the extended search space; only changeable for orderMCMC; for other algorithms it is fixed to TRUE |
iterpar |
additional list of parameters for the MCMC scheme implementing iterative expansions of the search space; for more details see |
cpdag |
logical, if TRUE the CPDAG returned by the PC algorithm will be used as the search space, if FALSE (default) the full undirected skeleton will be used as the search space |
hardlimit |
integer, limit on the size of parent sets in the search space; 12 by default |
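A blacklist of the kind described above can be built as an ordinary square 0/1 matrix. The sketch below uses hypothetical node names and assumes that entry [i, j] refers to the directed edge from node i to node j; this convention should be checked against the package documentation before use.

```r
# Hypothetical node names, for illustration only.
nodes <- c("A", "B", "C", "D")
n <- length(nodes)

# Start from an empty blacklist: 0 = edge allowed, 1 = edge excluded.
blacklist <- matrix(0, nrow = n, ncol = n, dimnames = list(nodes, nodes))

# Assumption: entry [i, j] refers to the directed edge i -> j.
# Exclude every edge pointing into "A", so that "A" can have no parents.
blacklist[, "A"] <- 1
diag(blacklist) <- 0

# Exclude one specific direction, D -> B, while leaving B -> D allowed.
blacklist["D", "B"] <- 1

print(blacklist)
```

Such a matrix would then be passed as learnBN(scorepar, blacklist = blacklist), where scorepar is a score object defined over the same nodes in the same order.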
Value
Depending on the value of the parameter algorithm, returns an object of class orderMCMC or iterativeMCMC, which contains the log-score trace of sampled DAGs as well
as the adjacency matrix of the maximum scoring DAG(s), its score and the order or partition score. The output can optionally include DAGs sampled in MCMC iterations and the score tables.
Optional output is regulated by the parameters chainout and scoreout. See orderMCMC class, iterativeMCMC class for a detailed description of the classes' structures.
Note
See also the extractor functions getDAG, getTrace, getSpace, getMCMCscore.
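A minimal sketch of how the extractor functions might be applied to a fitted object follows. It assumes that each extractor accepts the fitted object as its only required argument and that scoreout=TRUE is needed for the search space to be stored; the comments on return values follow the Value section above. The code is skipped when BiDAG is not installed.

```r
extracted <- NULL  # stays NULL when BiDAG is unavailable

if (requireNamespace("BiDAG", quietly = TRUE)) {
  data("Boston", package = "BiDAG")
  score <- BiDAG::scoreparameters("bge", Boston)

  # scoreout = TRUE so that the search space is kept in the result (assumption).
  fit <- BiDAG::learnBN(score, algorithm = "order", scoreout = TRUE)

  extracted <- list(
    dag   = BiDAG::getDAG(fit),        # maximum scoring DAG
    trace = BiDAG::getTrace(fit),      # trace of the sampled DAGs
    space = BiDAG::getSpace(fit),      # search space used by the MCMC
    score = BiDAG::getMCMCscore(fit)   # score of the maximum scoring DAG
  )
  str(extracted, max.level = 1)
}
```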
Author(s)
Polina Suter, Jack Kuipers; the code is partly derived from the order MCMC implementation from Kuipers J, Moffa G (2017) <doi:10.1080/01621459.2015.1133426>
References
Suter P, Kuipers J, Moffa G and Beerenwinkel N (2023). <doi:10.18637/jss.v105.i09>
Friedman N and Koller D (2003). A Bayesian approach to structure discovery in Bayesian networks. Machine Learning 50, 95-125.
Kalisch M, Maechler M, Colombo D, Maathuis M and Buehlmann P (2012). Causal inference using graphical models with the R package pcalg. Journal of Statistical Software 47, 1-26.
Geiger D and Heckerman D (2002). Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. The Annals of Statistics 30, 1412-1440.
Kuipers J, Moffa G and Heckerman D (2014). Addendum on the scoring of Gaussian acyclic graphical models. The Annals of Statistics 42, 1689-1691.
Spirtes P, Glymour C and Scheines R (2000). Causation, Prediction, and Search, 2nd edition. The MIT Press.
Examples
## Not run:
myScore <- scoreparameters("bge", Boston)
mapfit <- learnBN(myScore, "orderIter")
summary(mapfit)
plot(mapfit)
## End(Not run)