bayesInferSimple {mmb} | R Documentation |
Perform simple (network) Bayesian inferencing and regression.
Description
Uses simple Bayesian inference to determine the probability or relative
likelihood of a given value. This function can also regress to the most
likely value instead. "Simple" means that the data is segmented feature by
feature, in the same way a Bayesian network works. For a finite set of
labels, this function needs to be called once per label to obtain the
probability of each label (or for n-1 labels, or until a label with a
probability greater than 0.5 is found). For a continuous value, this function
is useful for choosing among a finite set of candidate values. The
empirical CDF may be used to obtain an actual probability for a given
continuous value; otherwise, the empirical PDF is estimated and a relative
likelihood is returned. For regression, set doRegress = TRUE to obtain the
most likely value of the target feature instead of its relative likelihood.
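The per-label procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame, the helper label_probability, and the equality-based segmentation of discrete features are all hypothetical.

```r
# Toy data: two discrete features and a discrete label (all hypothetical).
toy <- data.frame(
  weather = c("sun", "sun", "rain", "rain", "sun", "rain", "sun", "sun"),
  windy   = c("no", "yes", "no", "yes", "no", "no", "no", "yes"),
  play    = c("yes", "no", "no", "no", "yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: segment the data by one feature after another, then
# return the share of rows in the final segment that carry the given label.
label_probability <- function(df, conditions, target_col, label) {
  segment <- df
  for (feature in names(conditions)) {
    keep <- segment[[feature]] == conditions[[feature]]
    segment <- segment[keep, , drop = FALSE]
  }
  if (nrow(segment) == 0) {
    return(0)
  }
  mean(segment[[target_col]] == label)
}

conditions <- list(weather = "sun", windy = "no")
labels <- unique(toy$play)

# One call per label; the results sum to 1 over all labels.
probs <- sapply(labels, function(lbl) {
  label_probability(toy, conditions, "play", lbl)
})
print(probs)
```

Because the probabilities sum to 1, the last label could also be derived from the other n-1 calls, which is the shortcut the description mentions.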
Usage
bayesInferSimple(
df,
features,
targetCol,
selectedFeatureNames = c(),
retainMinValues = 1,
doRegress = FALSE,
doEcdf = FALSE,
regressor = NULL
)
Arguments
df |
data.frame |
features |
data.frame with bayes-features. One of the features needs to be the label-column. |
targetCol |
string with the name of the feature that represents the label. |
selectedFeatureNames |
vector of feature names, default c(). |
retainMinValues |
integer, default 1. The minimum number of data points to retain when segmenting the data feature by feature. |
doRegress |
boolean, default FALSE. Indicates whether to do a regression instead of returning the relative likelihood of a continuous feature. If the target feature is discrete and regression is requested, a warning is issued. |
doEcdf |
boolean, default FALSE. Indicates whether to use the empirical CDF to return a probability when inferring a continuous feature. If FALSE, the empirical PDF is used to return the relative likelihood. This parameter has no effect when inferring discrete values or when doing a regression. |
regressor |
Function that receives the collected values for regression and selects the most likely value from them. Defaults to the built-in estimator for the empirical PDF, returning its argmax. Any other function can be used instead, such as min, max, median, or mean. You may also use it to obtain the raw values for further processing. This function is ignored when not doing regression. |
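The difference between a probability from the empirical CDF, a relative likelihood from an estimated PDF, and a regressed value can be sketched with base R's stats functions. This is a minimal sketch under assumptions: the vector values stands in for whatever data remains after segmentation, and how the package itself derives its results from these estimates is not shown here.

```r
# Stand-in for the target feature's values that remain after segmentation.
values <- iris$Petal.Length
x <- mean(values)

# Empirical CDF: evaluates to a cumulative probability P(X <= x) in [0, 1].
cdf_fun <- stats::ecdf(values)
prob <- cdf_fun(x)

# Estimated PDF: evaluates to a density, which is a relative likelihood and
# not a probability (it may exceed 1 for narrowly spread data).
dens <- stats::density(values)
pdf_fun <- stats::approxfun(dens$x, dens$y, rule = 2)
rel_likelihood <- pdf_fun(x)

# Regression as the argmax of the estimated density ...
most_likely <- dens$x[which.max(dens$y)]

# ... or with any other function applied to the collected values.
alt_estimate <- stats::median(values)
```

A custom regressor is in the same spirit as alt_estimate: a function that takes the collected numeric values and returns a single value.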
Value
numeric: the probability (when inferring discrete labels, or a continuous value with doEcdf = TRUE), the relative likelihood (when inferring a continuous value), or the most likely value given the conditional features (when doing regression).
Author(s)
Sebastian Hönel sebastian.honel@lnu.se
References
Scutari M (2010). “Learning Bayesian Networks with the bnlearn R Package.” Journal of Statistical Software, 35(3), 1–22. doi: 10.18637/jss.v035.i03.
Examples
# Create one feature per conditioning variable, plus the label feature.
feat1 <- mmb::createFeatureForBayes(
  name = "Petal.Length", value = mean(iris$Petal.Length))
feat2 <- mmb::createFeatureForBayes(
  name = "Petal.Width", value = mean(iris$Petal.Width))
featT <- mmb::createFeatureForBayes(
  name = "Species", value = iris[1, ]$Species, isLabel = TRUE)

# Infer the probability of featT's label:
feats <- rbind(feat1, feat2, featT)
mmb::bayesInferSimple(df = iris, features = feats, targetCol = featT$name)

# Infer the relative likelihood of feat1's value:
featT$isLabel <- FALSE
feat1$isLabel <- TRUE
# We do not bind featT this time:
feats <- rbind(feat1, feat2)
mmb::bayesInferSimple(df = iris, features = feats, targetCol = feat1$name)