mlNaiveBayes {mlearning}   R Documentation
Supervised classification using naive Bayes
Description
Unified (formula-based) interface version of the naive Bayes algorithm provided by e1071::naiveBayes().
Usage
mlNaiveBayes(train, ...)
ml_naive_bayes(train, ...)
## S3 method for class 'formula'
mlNaiveBayes(formula, data, laplace = 0, ..., subset, na.action)
## Default S3 method:
mlNaiveBayes(train, response, laplace = 0, ...)
## S3 method for class 'mlNaiveBayes'
predict(
object,
newdata,
type = c("class", "membership", "both"),
method = c("direct", "cv"),
na.action = na.exclude,
threshold = 0.001,
eps = 0,
...
)
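A minimal sketch of the two calling conventions above (assuming the mlearning package is attached; the iris data is used purely for illustration):

data("iris", package = "datasets")
# Formula interface: the factor to predict on the left, the predictors on the right
nb_formula <- mlNaiveBayes(Species ~ ., data = iris)
# Default interface: a matrix or data frame of predictors plus a factor response
nb_default <- mlNaiveBayes(train = iris[, 1:4], response = iris$Species)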
Arguments
train: a matrix or data frame with predictors.
...: further arguments passed to the classification method or its predict() method.
formula: a formula with the left term being the factor variable to predict and the right term listing the independent, predictive variables, separated by plus signs. If the data frame provided contains only the dependent and independent variables, one can use the class ~ . short form.
data: a data frame to use as a training set.
laplace: positive number controlling Laplace smoothing for the naive Bayes classifier. The default (0) disables Laplace smoothing.
subset: index vector with the cases to define the training set in use (this argument must be named, if provided).
na.action: function to specify the action to be taken if NAs are found (for instance, na.omit to drop incomplete cases; this argument must be named, if provided).
response: a factor vector with the classes.
object: an mlNaiveBayes object.
newdata: a new dataset with the same conformation as the training set (same variables, except maybe the class for classification or the dependent variable for regression). Usually a test set, or a new dataset to be predicted.
type: the type of prediction to return: "class" (default), "membership" (class membership probabilities), or "both".
method: the prediction method: "direct" (default) or "cv" (cross-validation on the training set, see cvpredict()).
threshold: value replacing cells with probabilities within the eps range.
eps: number specifying an epsilon-range to apply Laplace smoothing (to replace zero or close-zero probabilities by threshold).
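A short, hedged sketch of how the main fitting and prediction arguments above combine (argument values are illustrative only):

data("iris", package = "datasets")
iris_nb <- mlNaiveBayes(Species ~ ., data = iris,
  laplace = 1,          # Laplace smoothing with pseudo-count 1
  na.action = na.omit)  # drop incomplete cases instead of failing
predict(iris_nb, type = "membership")  # class membership probabilities
predict(iris_nb, method = "cv")        # cross-validated predictions on the training set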
Value
ml_naive_bayes()/mlNaiveBayes() creates an mlNaiveBayes object (which also inherits from mlearning), containing the classifier and a lot of additional metadata used by the functions and methods you can apply to it, like predict() or cvpredict(). In case you want to program new functions or extract specific components, inspect the "unclassed" object using unclass().
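For instance, to peek at the stored components and metadata (a sketch, assuming an iris_nb classifier as built in the Examples below):

str(unclass(iris_nb), max.level = 1)  # raw components of the classifier
names(attributes(iris_nb))            # additional metadata kept as attributes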
See Also
mlearning(), cvpredict(), confusion(), and also e1071::naiveBayes() that actually does the classification.
Examples
# Prepare data: split into training set (2/3) and test set (1/3)
data("iris", package = "datasets")
train <- c(1:34, 51:83, 101:133)
iris_train <- iris[train, ]
iris_test <- iris[-train, ]
# One case with missing data in train set, and another case in test set
iris_train[1, 1] <- NA
iris_test[25, 2] <- NA
iris_nb <- ml_naive_bayes(data = iris_train, Species ~ .)
summary(iris_nb)
predict(iris_nb) # Default type is class
predict(iris_nb, type = "membership")
predict(iris_nb, type = "both")
# Self-consistency, do not use for assessing classifier performances!
confusion(iris_nb)
# Use an independent test set instead
confusion(predict(iris_nb, newdata = iris_test), iris_test$Species)
# Another dataset
data("HouseVotes84", package = "mlbench")
house_nb <- ml_naive_bayes(data = HouseVotes84, Class ~ .,
  na.action = na.omit)
summary(house_nb)
confusion(house_nb) # Self-consistency
confusion(cvpredict(house_nb), na.omit(HouseVotes84)$Class)
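# The threshold = and eps = arguments of predict() can also be set explicitly;
# a hedged sketch with purely illustrative, non-default values (they are
# presumably forwarded to the underlying e1071 prediction routine):
predict(iris_nb, newdata = iris_test, type = "both",
  eps = 1e-3, threshold = 0.01)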