precision {metrica}  R Documentation
Precision | Positive Predictive Value
Description
precision
estimates the precision (a.k.a. positive predictive value, PPV)
for a nominal/categorical predicted-observed dataset.
ppv
estimates the Positive Predictive Value (equivalent to precision)
for a nominal/categorical predicted-observed dataset.
FDR
estimates the false discovery rate, i.e. the complement of precision
(FDR = 1 - precision), for a nominal/categorical predicted-observed dataset.
Usage
precision(
  data = NULL,
  obs,
  pred,
  tidy = FALSE,
  atom = FALSE,
  na.rm = TRUE,
  pos_level = 2
)
ppv(
  data = NULL,
  obs,
  pred,
  tidy = FALSE,
  atom = FALSE,
  na.rm = TRUE,
  pos_level = 2
)
FDR(
  data = NULL,
  obs,
  pred,
  atom = FALSE,
  pos_level = 2,
  tidy = FALSE,
  na.rm = TRUE
)
Arguments
data
(Optional) argument to call an existing data frame containing the data.
obs
Vector with observed values (character | factor).
pred
Vector with predicted values (character | factor).
tidy
Logical operator (TRUE/FALSE) to decide the type of return. TRUE returns a data.frame, FALSE returns a list; Default: FALSE.
atom
Logical operator (TRUE/FALSE) to decide if the estimate is made for each class (atom = TRUE) or at a global level (atom = FALSE); Default: FALSE.
na.rm
Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE.
pos_level
Integer, for binary cases, indicating the order (1|2) of the level corresponding to the positive class. Generally, the positive level is the second (2) since, following an alphanumeric order, the most common pairs are (Negative | Positive), (0 | 1), and (FALSE | TRUE).
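A quick base R check (a minimal sketch, independent of metrica) of why the positive class usually lands in position 2 under the default alphanumeric level ordering:

levels(factor(c("True", "False")))  # "False" "True": "True" is level 2
levels(factor(c(0, 1)))             # "0" "1": "1" is level 2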
Details
The precision is a non-normalized coefficient that represents the ratio of correctly predicted cases (true positives -TP- for binary cases) to the total observations predicted as a given class (total predicted positives -PP- for binary cases), either per class or at an overall level.
For binomial cases, precision = \frac{TP}{PP} = \frac{TP}{TP + FP}
The precision metric is bounded between 0 and 1. The closer to 1 the better. Values towards zero indicate low precision of predictions. It can be estimated for each particular class or at a global level.
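As a minimal base R sketch of the per-class computation from a confusion matrix (an illustration, not the metrica implementation):

obs  <- factor(c("a", "a", "b", "b", "b"))
pred <- factor(c("a", "b", "b", "b", "a"))
cm <- table(pred, obs)    # rows = predicted class, columns = observed class
diag(cm) / rowSums(cm)    # per-class precision: correct predictions / total predicted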
The false detection rate or false discovery rate (FDR) represents the proportion of false positives with respect to the total number of cases predicted as positive.
For binomial cases, FDR = 1 - precision = \frac{FP}{PP} = \frac{FP}{TP + FP}
The FDR is also bounded between 0 and 1, but here the closer to 0 the better: values towards one indicate a high rate of false positives among the predicted positives.
For the formulas and more details, see the online-documentation
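The complement relationship can be checked with a minimal base R sketch (an illustration of the binary definitions above, not the package internals):

obs  <- factor(c("True", "True", "False", "False"), levels = c("False", "True"))
pred <- factor(c("True", "False", "True", "True"), levels = c("False", "True"))
tp <- sum(pred == "True" & obs == "True")   # true positives: 1
fp <- sum(pred == "True" & obs == "False")  # false positives: 2
tp / (tp + fp)  # precision = 1/3
fp / (tp + fp)  # FDR = 2/3 = 1 - precision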
Value
An object of class numeric within a list (if tidy = FALSE) or within a data.frame (if tidy = TRUE).
References
Ting K.M. (2017) Precision and Recall. In: Sammut C., Webb G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. doi:10.1007/978-1-4899-7687-1_659
Examples
set.seed(123)
# Two-class
binomial_case <- data.frame(
  labels = sample(c("True", "False"), 100, replace = TRUE),
  predictions = sample(c("True", "False"), 100, replace = TRUE)
)
# Multi-class
multinomial_case <- data.frame(
  labels = sample(c("Red", "Blue", "Green"), 100, replace = TRUE),
  predictions = sample(c("Red", "Blue", "Green"), 100, replace = TRUE)
)
# Get precision estimate for two-class case
precision(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
# Get FDR estimate for two-class case
FDR(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
# Get precision estimate for each class for the multi-class case
precision(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE, atom = TRUE)
# Get precision estimate for the multi-class case at a global level
precision(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
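# A further sketch (assuming the defaults described above): ppv() is
# equivalent to precision(), and pos_level = 1 treats the first
# alphanumeric level ("False" here) as the positive class
ppv(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
precision(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE, pos_level = 1)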