nbc {nbc4va}  R Documentation

Train an NBC model

Description

Performs supervised Naive Bayes Classification on verbal autopsy data.

Usage

nbc(train, test, known = TRUE)

Arguments

train

Data frame of verbal autopsy training data (see the Data documentation).

  • Columns (in order): ID, Cause, Symptom-1 to Symptom-n

  • ID (vector of char): unique case identifiers

  • Cause (vector of char): observed causes for each case

  • Symptom-1 to Symptom-n (vectors of 1 or 0): 1 for presence, 0 for absence; other values are treated as unknown

  • Unknown symptoms are imputed randomly from distributions of 1s and 0s per symptom column; if no 1s or 0s exist then the column is removed
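The imputation rule in the last bullet can be sketched in base R. This is only one reading of that rule, not the package's internal code: `impute_symptoms` and its `first_symptom_col` argument are hypothetical names, and the sketch assumes that drawing "from distributions of 1s and 0s" means resampling from the observed 1s and 0s of the same column.

```r
# Hypothetical helper, NOT part of nbc4va: illustrates the imputation rule only.
# Assumes symptom columns start at column 3 (after ID and Cause).
impute_symptoms <- function(df, first_symptom_col = 3) {
  stopifnot(ncol(df) >= first_symptom_col)
  keep <- rep(TRUE, ncol(df))
  for (j in seq(first_symptom_col, ncol(df))) {
    col <- df[[j]]
    known <- !is.na(col) & col %in% c(0, 1)  # anything else counts as unknown
    if (!any(known)) {
      keep[j] <- FALSE                       # no 1s or 0s at all: drop the column
      next
    }
    n_unknown <- sum(!known)
    if (n_unknown > 0) {
      pool <- col[known]                     # observed 1s and 0s for this symptom
      col[!known] <- pool[sample.int(length(pool), n_unknown, replace = TRUE)]
    }
    df[[j]] <- as.numeric(as.character(col))
  }
  df[, keep, drop = FALSE]
}

train <- data.frame(
  ID = c("a1", "b2", "c3"),
  Cause = c("HIV", "Stroke", "HIV"),
  S1 = c(1, NA, 1),  # one unknown value, observed values are all 1
  S2 = c(9, 9, 9),   # no 1s or 0s, so the column is removed
  S3 = c(0, 0, 1),
  stringsAsFactors = FALSE
)
imputed <- impute_symptoms(train)
names(imputed)
```

Resampling from the observed values reproduces each column's empirical proportion of 1s and 0s, which is why the unknown entry in S1 can only become 1 here.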

Example:

ID Cause S1 S2 S3
"a1" "HIV" 1 0 0
"b2" "Stroke" 0 0 1
"c3" "HIV" 1 1 0
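
As a minimal sketch, the example table above could be built in base R as follows. The column names S1 to S3 are taken from the table; using `stringsAsFactors = FALSE` to keep ID and Cause as character vectors is an assumption based on the "vector of char" descriptions.

```r
# Build the three-case example: ID, Cause, then one 0/1 column per symptom.
train <- data.frame(
  ID = c("a1", "b2", "c3"),
  Cause = c("HIV", "Stroke", "HIV"),
  S1 = c(1, 0, 1),
  S2 = c(0, 0, 1),
  S3 = c(0, 1, 0),
  stringsAsFactors = FALSE
)
str(train)
```

Column order matters here, since the arguments are described by position (ID first, Cause second, symptoms after).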

test

Data frame of verbal autopsy test data in the same format as train, except when the causes are not known:

  • The 2nd column (Cause) can be omitted if known is FALSE

known

TRUE to indicate that the test causes are available in the 2nd column and FALSE to indicate that they are not known
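
A short sketch of the known = FALSE case, under stated assumptions: `drop_cause` is a hypothetical helper (not part of nbc4va) that removes the 2nd column, and the `nbc()` call is guarded so that it runs only when the package is installed.

```r
# Hypothetical helper, NOT part of nbc4va: remove the 2nd column (Cause).
drop_cause <- function(test) test[, -2, drop = FALSE]

test <- data.frame(
  ID = c("d4", "e5"),
  Cause = c("HIV", "Stroke"),
  S1 = c(1, 0),
  S2 = c(0, 1),
  S3 = c(0, 0),
  stringsAsFactors = FALSE
)
test_unknown <- drop_cause(test)
names(test_unknown)  # ID followed by the symptom columns only

# With nbc4va installed, unlabeled test data is passed with known = FALSE.
if (requireNamespace("nbc4va", quietly = TRUE)) {
  library(nbc4va)
  data(nbc4vaData)
  train <- nbc4vaData[1:50, ]
  unlabeled <- drop_cause(nbc4vaData[51:100, ])
  results <- nbc(train, unlabeled, known = FALSE)
}
```

When the Cause column is kept in test, known should stay TRUE so the 2nd column is read as observed causes and not as a symptom.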

Value

out The result nbc list object containing:

See Also

Other main functions: plot.nbc(), print.nbc_summary(), summary.nbc()

Examples

library(nbc4va)
data(nbc4vaData)

# Run naive bayes classifier on random train and test data
# Set "known" to indicate whether or not "test" causes are known
train <- nbc4vaData[1:50, ]
test <- nbc4vaData[51:100, ]
results <- nbc(train, test, known=TRUE)

# Obtain the probabilities and predictions
prob <- results$prob.causes
pred <- results$pred.causes


[Package nbc4va version 1.2 Index]