learn_params {bnclassify}    R Documentation

Learn the parameters of a Bayesian network structure.

Description

Learn parameters with maximum likelihood or Bayesian estimation, or with the weighting attributes to alleviate naive Bayes' independence assumption (WANBIA), attribute-weighted naive Bayes (AWNB), or model-averaged naive Bayes (MANB) methods. Returns a bnc_bn object.

Usage

lp(
  x,
  dataset,
  smooth,
  awnb_trees = NULL,
  awnb_bootstrap = NULL,
  manb_prior = NULL,
  wanbia = NULL
)

Arguments

x

The bnc_dag object. The Bayesian network classifier structure.

dataset

The data frame from which to learn network parameters.

smooth

A numeric. The smoothing value (\alpha) for Bayesian parameter estimation. Nonnegative.

awnb_trees

An integer. The number (M) of bootstrap samples to generate.

awnb_bootstrap

A numeric. The size of the bootstrap subsample, relative to the size of dataset (given in [0,1]).

manb_prior

A numeric. The prior probability for an arc between the class and any feature.

wanbia

A logical. If TRUE, WANBIA feature weighting is performed.

Details

lp learns the parameters of each local distribution \theta_{ijk} = P(X_i = k \mid \mathbf{Pa}(X_i) = j) as

\theta_{ijk} = \frac{N_{ijk} + \alpha}{N_{ij \cdot} + r_i \alpha},

where N_{ijk} is the number of instances in dataset in which X_i = k and \mathbf{Pa}(X_i) = j, N_{ij \cdot} = \sum_{k=1}^{r_i} N_{ijk}, r_i is the cardinality of X_i, and all hyperparameters of the Dirichlet prior are equal to \alpha. \alpha = 0 corresponds to maximum likelihood estimation. A uniform distribution is returned when N_{ij \cdot} + r_i \alpha = 0. With partially observed data, the above amounts to available case analysis.
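
For illustration, the smoothed estimate can be reproduced with base R. A minimal sketch, assuming the car dataset bundled with bnclassify and the feature buying, whose only parent in a naive Bayes structure is the class:

data(car, package = "bnclassify")
alpha <- 0.5                              # smoothing value
N_ijk <- table(car$buying, car$class)     # counts of X_i = k within each class j
r_i <- nrow(N_ijk)                        # cardinality r_i of the feature
# theta_ijk = (N_ijk + alpha) / (N_ij. + r_i * alpha), column by column
theta <- sweep(N_ijk + alpha, 2, colSums(N_ijk) + r_i * alpha, "/")
colSums(theta)                            # each column sums to 1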

WANBIA learns a unique exponent 'weight' per feature. The weights are computed by optimizing conditional log-likelihood and are bounded so that each w_i \in [0, 1]. For WANBIA estimates, set wanbia to TRUE.
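
The weights act as exponents on the feature likelihoods, i.e. each factor P(x_i | c) in the class posterior becomes P(x_i | c)^{w_i}. A minimal sketch of that scoring rule, with hypothetical probabilities and weights rather than values computed by lp:

weighted_log_post <- function(log_prior, log_p_xc, w) {
  # log P(c) + sum_i w_i * log P(x_i | c); w_i = 0 drops a feature,
  # w_i = 1 recovers the standard naive Bayes factor
  log_prior + sum(w * log_p_xc)
}
weighted_log_post(log(0.5), log(c(0.2, 0.7)), w = c(1, 0.3))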

To get the AWNB parameter estimate, provide the awnb_bootstrap and/or the awnb_trees argument. The estimate is:

\theta_{ijk}^{AWNB} = \frac{\theta_{ijk}^{w_i}}{\sum_{k'=1}^{r_i} \theta_{ijk'}^{w_i}},

and the weights w_i are computed as

w_i = \frac{1}{M}\sum_{t=1}^M \sqrt{\frac{1}{d_{ti}}},

where M is the number of bootstrap samples from dataset and d_{ti} is the minimum testing depth of X_i in an unpruned classification tree learned from the t-th subsample (d_{ti} = 0 if X_i is omitted from the t-th tree).
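
As a toy illustration of the weight formula, with made-up depths d_{ti} rather than depths taken from actual bootstrap trees:

d <- matrix(c(1, 2, 0, 3,
              2, 1, 1, 0),
            nrow = 4)                     # M = 4 trees (rows), 2 features (columns)
M <- nrow(d)
# w_i = (1/M) * sum_t sqrt(1 / d_ti), contributing 0 when d_ti = 0
w <- colSums(ifelse(d > 0, sqrt(1 / d), 0)) / M
w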

The MANB parameters correspond to Bayesian model averaging over the naive Bayes models obtained from all 2^n subsets of the n features. To get MANB parameters, provide the manb_prior argument.
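
Because the 2^n structures differ only in which class-to-feature arcs are present, the averaged parameter for each feature is a mixture of its class-conditional and marginal estimates, weighted by the posterior probability of the corresponding arc. A hypothetical sketch, where p_arc stands for that posterior probability (it is derived from manb_prior and the data, and is not a user-facing lp value):

manb_mix <- function(theta_cond, theta_marg, p_arc) {
  # P(arc | data) * P(x = k | c = j) + (1 - P(arc | data)) * P(x = k)
  p_arc * theta_cond + (1 - p_arc) * theta_marg
}
manb_mix(theta_cond = 0.8, theta_marg = 0.4, p_arc = 0.9)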

Value

A bnc_bn object.

References

Hall M (2007). A decision tree-based attribute weighting filter for naive Bayes. Knowledge-Based Systems, 20(2), 120-126.

Dash D and Cooper GF (2002). Exact model averaging with naive Bayesian classifiers. 19th International Conference on Machine Learning (ICML-2002), 91-98.

Pigott TD (2001). A review of methods for missing data. Educational Research and Evaluation, 7(4), 353-383.

Examples

data(car)
nb <- nb('class', car)
# Maximum likelihood estimation
mle <- lp(nb, car, smooth = 0)
# Bayesian estimation
bayes <- lp(nb, car, smooth = 0.5)
# MANB
manb <- lp(nb, car, smooth = 0.5, manb_prior = 0.5)
# AWNB
awnb <- lp(nb, car, smooth = 0.5, awnb_trees = 10)
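# WANBIA
wanbia <- lp(nb, car, smooth = 0.5, wanbia = TRUE)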
