R: Structure learning from missing data

structural.em {bnlearn}

R Documentation

Description

Learn the structure of a Bayesian network from a data set containing missing values using Structural EM.
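The alternation between the "expectation" and "maximization" steps can be illustrated by hand with other bnlearn functions. The following is only a conceptual sketch, not the internal implementation of `structural.em()`: the fixed three iterations, the empty starting network and the `"parents"` imputation method are assumptions made for illustration.

```r
library(bnlearn)
data(learning.test)

# Introduce some missing values in a single variable.
incomplete.data <- learning.test
incomplete.data[1:100, "A"] <- NA

# Starting point (assumption): an empty network whose parameters are
# estimated from the observations available for each node.
dag <- empty.graph(names(incomplete.data))
fitted <- bn.fit(dag, incomplete.data)

for (iter in seq_len(3)) {
  # "Expectation" step: complete the data using the current fitted network.
  imputed <- impute(fitted, data = incomplete.data, method = "parents")
  # "Maximization" step: relearn the structure from the completed data,
  # then refit the parameters of the new structure.
  dag <- hc(imputed, start = dag)
  fitted <- bn.fit(dag, imputed)
}

modelstring(dag)
```

In practice `structural.em()` should be preferred, since it takes care of argument checking and of the stopping criteria controlled by `max.iter`.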

Usage

structural.em(x, maximize = "hc", maximize.args = list(), fit,
    fit.args = list(), impute, impute.args = list(), return.all = FALSE,
    start = NULL, max.iter = 5, debug = FALSE)

Arguments

`x`	a data frame containing the variables in the model.
`maximize`	a character string, the score-based algorithm to be used in the “maximization” step. See `structure learning` for details.
`maximize.args`	a list of arguments to be passed to the algorithm specified by `maximize`, such as `restart` for hill-climbing or `tabu` for tabu search.
`fit`	a character string, the parameter learning method to be used in the “maximization” step. See `bn.fit` for details.
`fit.args`	a list of arguments to be passed to the parameter learning method specified by `fit`.
`impute`	a character string, the imputation method to be used in the “expectation” step. See `impute` for details.
`impute.args`	a list of arguments to be passed to the imputation method specified by `impute`.
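`return.all`	a boolean value. If `TRUE`, the learned network is returned together with the imputed data and the fitted network from the last iteration; if `FALSE`, only the learned network is returned. See the Value section for details.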
`start`	a `bn` or `bn.fit` object, the network used to perform the first imputation and as a starting point for the score-based algorithm specified by `maximize`.
`max.iter`	an integer, the maximum number of iterations.
`debug`	a boolean value. If `TRUE` a lot of debugging output is printed; otherwise the function is completely silent.
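As an illustration of how the `*.args` lists pair with the corresponding method arguments, the call below selects tabu search for structure learning, Bayesian parameter estimation and parent-based imputation. The specific values (`tabu = 50`, `iss = 1`, `max.iter = 3`) are arbitrary choices for this sketch, not recommendations.

```r
library(bnlearn)
data(learning.test)

incomplete.data <- learning.test
incomplete.data[1:100, "A"] <- NA

dag <- structural.em(incomplete.data,
  maximize = "tabu",                 # score-based algorithm
  maximize.args = list(tabu = 50),   # passed on to tabu()
  fit = "bayes",                     # parameter learning method
  fit.args = list(iss = 1),          # passed on to bn.fit()
  impute = "parents",                # imputation method
  max.iter = 3)

dag
```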

Value

If `return.all` is `FALSE`, `structural.em()` returns an object of class `bn`. (See `bn-class` for details.)

If `return.all` is `TRUE`, `structural.em()` returns a list with three elements named `dag` (an object of class `bn`), `imputed` (a data frame containing the imputed data from the last iteration) and `fitted` (an object of class `bn.fit`, again from the last iteration; see `bn.fit-class` for details).
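A short sketch of working with the list returned when `return.all = TRUE`; the variable names `result` and `incomplete.data` are chosen here for illustration.

```r
library(bnlearn)
data(learning.test)

incomplete.data <- learning.test
incomplete.data[1:100, "A"] <- NA

result <- structural.em(incomplete.data, return.all = TRUE)

# The three elements described above.
names(result)
class(result$dag)      # the learned structure
class(result$fitted)   # the fitted network from the last iteration
head(result$imputed)   # the imputed data from the last iteration
```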

Note

If at least one of the variables in the data `x` does not contain any observed value, the `start` network must be specified and it must be a `bn.fit` object. Otherwise, `structural.em()` is unable to complete the first maximization step because it cannot fit the corresponding local distribution(s).

If `impute` is set to `"bayes-lw"`, each call to `structural.em()` may produce a different model because the imputation is based on a stochastic simulation.
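When reproducible results are needed with `"bayes-lw"`, one option is to fix the state of R's random number generator before each call. The sketch below assumes that the simulation draws from R's standard random number generator, so that `set.seed()` controls it; the seed value is arbitrary.

```r
library(bnlearn)
data(learning.test)

incomplete.data <- learning.test
incomplete.data[1:100, "A"] <- NA

# Same seed before each call, so the simulated imputations coincide.
set.seed(42)
dag1 <- structural.em(incomplete.data, impute = "bayes-lw")
set.seed(42)
dag2 <- structural.em(incomplete.data, impute = "bayes-lw")

# Compare the two learned structures.
identical(modelstring(dag1), modelstring(dag2))
```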

Author(s)

Marco Scutari

References

Friedman N (1997). "Learning Belief Networks in the Presence of Missing Values and Hidden Variables". Proceedings of the 14th International Conference on Machine Learning, 125–133.

Examples

data(learning.test)

# learn with incomplete data.
incomplete.data = learning.test
incomplete.data[1:100, 1] = NA
incomplete.data[101:200, 2] = NA
incomplete.data[1:200, 5] = NA
structural.em(incomplete.data)

## Not run: 
# learn with a latent variable.
incomplete.data = learning.test
incomplete.data[seq(nrow(incomplete.data)), 1] = NA
start = bn.fit(empty.graph(names(learning.test)), learning.test)
wl = data.frame(from = c("A", "A"), to = c("B", "D"))
structural.em(incomplete.data, start = start,
  maximize.args = list(whitelist = wl))

## End(Not run)

[Package bnlearn version 5.0]