thresholds {confcons} | R Documentation |
Thresholds needed to create the extended confusion matrix
Description
Calculate the two thresholds distinguishing certain negatives/positives from uncertain predictions. The thresholds are needed to create the extended confusion matrix and are further used for confidence calculation.
Usage
thresholds(observations, predictions = NULL, type = "mean", range = 0.5)
Arguments
observations |
Either an integer or logical vector containing the binary
observations where presences are encoded as |
predictions |
A numeric vector containing the predicted probabilities of
occurrence typically within the |
type |
A character vector of length one containing the value 'mean' (for calculating the mean of the predictions within known presences and absences) or 'information' (for calculating thresholds based on relative information gain). Defaults to 'mean'. |
range |
A numeric vector of length one containing a value from the
|
Value
A named numeric vector of length 2. The first element ('threshold1') is
the mean of the probabilities predicted for the absence locations; it
distinguishes certain negatives (certain absences) from uncertain
predictions. The second element ('threshold2') is the mean of the
probabilities predicted for the presence locations; it distinguishes
certain positives (certain presences) from uncertain predictions. For a
typical model that performs better than random guessing, the first element
is smaller than the second one. The returned value might contain NaN(s) if
the number of observed presences and/or absences is 0.
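The returned value described above can be illustrated with a minimal base R sketch. It is not the package's implementation: mean_thresholds_sketch() is a hypothetical helper that only mirrors the 'mean' description, and the labelling rule at the end (strict comparisons against the two thresholds) is an assumption made for illustration.

```r
# Hypothetical helper (not part of confcons): computes the two means that the
# Value section describes for type = "mean".
mean_thresholds_sketch <- function(observations, predictions) {
  stopifnot(length(observations) == length(predictions))
  presence <- as.logical(observations)
  c(threshold1 = mean(predictions[!presence]),  # mean prediction at absences
    threshold2 = mean(predictions[presence]))   # mean prediction at presences
}

observations <- c(0L, 0L, 0L, 1L, 1L)
predictions <- c(0.1, 0.2, 0.6, 0.5, 0.9)
th <- mean_thresholds_sketch(observations, predictions)
print(th)  # threshold1 is about 0.3, threshold2 is about 0.7

# With no observed presences, the mean of an empty vector is NaN.
no_presence <- mean_thresholds_sketch(c(0L, 0L), c(0.1, 0.2))
print(no_presence)

# Assumed labelling rule: predictions below threshold1 are treated as certain
# negatives, predictions above threshold2 as certain positives, and everything
# in between as uncertain.
labels <- ifelse(predictions < th[["threshold1"]], "certain negative",
                 ifelse(predictions > th[["threshold2"]], "certain positive",
                        "uncertain"))
print(labels)
```

In this toy dataset the two thresholds bracket the predictions 0.6 and 0.5, which are therefore labelled uncertain under the assumed rule.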
Note
thresholds() should be called using the whole dataset containing both
training and evaluation locations.
See Also
confidence for calculating confidence, consistency for calculating
consistency
Examples
set.seed(12345)

# Using logical observations:
observations_1000_logical <- c(rep(x = FALSE, times = 500),
                               rep(x = TRUE, times = 500))
predictions_1000 <- c(runif(n = 500, min = 0, max = 0.7),
                      runif(n = 500, min = 0.3, max = 1))
thresholds(observations = observations_1000_logical,
           predictions = predictions_1000) # 0.370 0.650

# Using integer observations:
observations_4000_integer <- c(rep(x = 0L, times = 3000),
                               rep(x = 1L, times = 1000))
predictions_4000 <- c(runif(n = 3000, min = 0, max = 0.8),
                      runif(n = 1000, min = 0.2, max = 0.9))
thresholds(observations = observations_4000_integer,
           predictions = predictions_4000) # 0.399 0.545

# Wrong parameterization:
try(thresholds(observations = observations_1000_logical,
               predictions = predictions_4000)) # error
set.seed(12345)
observations_4000_numeric <- c(rep(x = 0, times = 3000),
                               rep(x = 1, times = 1000))
predictions_4000_strange <- c(runif(n = 3000, min = -0.3, max = 0.4),
                              runif(n = 1000, min = 0.6, max = 1.5))
try(thresholds(observations = observations_4000_numeric,
               predictions = predictions_4000_strange)) # multiple warnings
mask_of_normal_predictions <- predictions_4000_strange >= 0 &
  predictions_4000_strange <= 1
thresholds(
  observations = as.integer(observations_4000_numeric)[mask_of_normal_predictions],
  predictions = predictions_4000_strange[mask_of_normal_predictions]
) # OK