measures {confcons}    R Documentation
Goodness-of-fit, confidence and consistency measures
Description
Wrapper function for calculating the predictive distribution model's confidence, consistency, and optionally some well-known goodness-of-fit measures as well. The calculated measures are as follows:
- confidence in predictions (CP) and confidence in positive predictions (CPP) within known presences for the training and evaluation subsets
- consistency of predictions (difference of CPs; DCP) and consistency of positive predictions (difference of CPPs; DCPP)
- Area Under the ROC Curve (AUC) - optional (see parameter goodness)
- maximum of the True Skill Statistic (maxTSS) - optional (see parameter goodness)
Usage
measures(
  observations,
  predictions,
  evaluation_mask,
  goodness = FALSE,
  df = FALSE
)
Arguments
observations
    Either an integer or logical vector containing the binary observations, where presences are encoded as 1s/TRUEs and absences as 0s/FALSEs.
predictions
    A numeric vector containing the predicted probabilities of occurrence, typically within the [0, 1] interval. Its length should be the same as that of observations.
evaluation_mask
    A logical vector (mask) of the evaluation subset: TRUE elements mark the evaluation subset, FALSE elements the training subset. Its length should be the same as that of observations.
goodness
    Logical vector of length one, defaults to FALSE. Indicates whether the goodness-of-fit measures (AUC and maxTSS) should also be calculated.
df
    Logical vector of length one, defaults to FALSE. Indicates whether the result should be returned as a one-row data.frame instead of a named numeric vector.
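A minimal sketch of well-formed arguments matching the specification above (the values are made up for illustration and are not from the package):

```r
# Made-up inputs that satisfy the argument specification:
observations <- c(0L, 1L, 1L, 0L, 1L, 1L)       # binary observations (presence = 1)
predictions <- c(0.2, 0.8, 0.7, 0.4, 0.6, 0.9)  # predicted probabilities in [0, 1]
evaluation_mask <- c(FALSE, FALSE, FALSE,       # TRUE marks the evaluation subset,
                     TRUE, TRUE, TRUE)          # FALSE marks the training subset

# All three vectors must have the same length:
stopifnot(length(observations) == length(predictions),
          length(predictions) == length(evaluation_mask))
```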
Value
A named numeric vector (if df is FALSE; the default) or a data.frame of one row (if df is TRUE). The length() of the vector or the ncol() of the data.frame is 6 (if goodness is FALSE; the default) or 8 (if goodness is TRUE). The names of the elements/columns are as follows:
- CP_train: confidence in predictions within known presences (CP) for the training subset
- CP_eval: confidence in predictions within known presences (CP) for the evaluation subset
- DCP: consistency of predictions (difference of CPs)
- CPP_train: confidence in positive predictions within known presences (CPP) for the training subset
- CPP_eval: confidence in positive predictions within known presences (CPP) for the evaluation subset
- DCPP: consistency of positive predictions (difference of CPPs)
- AUC: Area Under the ROC Curve (Hanley and McNeil 1982; calculated by ROCR::performance()). This element/column is available only if parameter 'goodness' is set to TRUE. If package ROCR is not available but 'goodness' is set to TRUE, the value of AUC is NA_real_ and a warning is raised.
- maxTSS: maximum of the True Skill Statistic (Allouche et al. 2006; calculated by ROCR::performance()). This element/column is available only if parameter 'goodness' is set to TRUE. If package ROCR is not available but 'goodness' is set to TRUE, the value of maxTSS is NA_real_ and a warning is raised.
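To make the relationship between the confidence and consistency elements concrete, the sketch below uses made-up confidence values (not real output of measures()) and assumes the sign convention evaluation-subset confidence minus training-subset confidence for the differences:

```r
# Hypothetical confidence values, as if taken from the vector
# returned by measures():
CP_train <- 0.93   # confidence in predictions, training subset
CP_eval <- 0.90    # confidence in predictions, evaluation subset
CPP_train <- 0.88  # confidence in positive predictions, training subset
CPP_eval <- 0.86   # confidence in positive predictions, evaluation subset

# Consistency measures as differences of the corresponding confidences
# (sign convention assumed here: evaluation minus training):
DCP <- CP_eval - CP_train    # consistency of predictions
DCPP <- CPP_eval - CPP_train # consistency of positive predictions

# Values near zero indicate that the model performs similarly on the
# training and evaluation subsets; strongly negative values suggest
# an inconsistent (overfitted) model.
round(c(DCP = DCP, DCPP = DCPP), digits = 2)
```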
Note
Since confcons is a light-weight, stand-alone package, it does not import package ROCR (Sing et al. 2005), i.e., installing confcons does not automatically install ROCR. If you need AUC and maxTSS (i.e., parameter 'goodness' is set to TRUE), you should install ROCR yourself or install confcons along with its dependencies (i.e., devtools::install_github(repo = "bfakos/confcons", dependencies = TRUE)).
References
Allouche O, Tsoar A, Kadmon R (2006): Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). Journal of Applied Ecology 43(6): 1223-1232. doi:10.1111/j.1365-2664.2006.01214.x.
Hanley JA, McNeil BJ (1982): The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1): 29-36. doi:10.1148/radiology.143.1.7063747.
Sing T, Sander O, Beerenwinkel N, Lengauer T. (2005): ROCR: visualizing classifier performance in R. Bioinformatics 21(20): 3940-3941. doi:10.1093/bioinformatics/bti623.
See Also
confidence for calculating confidence, consistency for calculating consistency, and ROCR::performance() for calculating AUC and TSS
Examples
set.seed(12345)
dataset <- data.frame(
  observations = c(rep(x = FALSE, times = 500),
                   rep(x = TRUE, times = 500)),
  predictions_model1 = c(runif(n = 250, min = 0, max = 0.6),
                         runif(n = 250, min = 0.1, max = 0.7),
                         runif(n = 250, min = 0.4, max = 1),
                         runif(n = 250, min = 0.3, max = 0.9)),
  predictions_model2 = c(runif(n = 250, min = 0.1, max = 0.55),
                         runif(n = 250, min = 0.15, max = 0.6),
                         runif(n = 250, min = 0.3, max = 0.9),
                         runif(n = 250, min = 0.25, max = 0.8)),
  evaluation_mask = c(rep(x = FALSE, times = 250),
                      rep(x = TRUE, times = 250),
                      rep(x = FALSE, times = 250),
                      rep(x = TRUE, times = 250))
)

# Default parameterization: return a vector without AUC and maxTSS:
conf_and_cons <- measures(observations = dataset$observations,
                          predictions = dataset$predictions_model1,
                          evaluation_mask = dataset$evaluation_mask)
print(conf_and_cons)
names(conf_and_cons)
conf_and_cons[c("CPP_eval", "DCPP")]

# Calculate AUC and maxTSS as well if package ROCR is installed:
if (requireNamespace(package = "ROCR", quietly = TRUE)) {
  conf_and_cons_and_goodness <- measures(observations = dataset$observations,
                                         predictions = dataset$predictions_model1,
                                         evaluation_mask = dataset$evaluation_mask,
                                         goodness = TRUE)
}

# Calculate the measures for multiple models in a for loop:
model_IDs <- as.character(1:2)
for (model_ID in model_IDs) {
  column_name <- paste0("predictions_model", model_ID)
  conf_and_cons <- measures(observations = dataset$observations,
                            predictions = dataset[, column_name, drop = TRUE],
                            evaluation_mask = dataset$evaluation_mask,
                            df = TRUE)
  if (model_ID == model_IDs[1]) {
    conf_and_cons_df <- conf_and_cons
  } else {
    conf_and_cons_df <- rbind(conf_and_cons_df, conf_and_cons)
  }
}
conf_and_cons_df

# Calculate the measures for multiple models in a lapply():
conf_and_cons_list <- lapply(X = model_IDs,
                             FUN = function(model_ID) {
                               column_name <- paste0("predictions_model", model_ID)
                               measures(observations = dataset$observations,
                                        predictions = dataset[, column_name, drop = TRUE],
                                        evaluation_mask = dataset$evaluation_mask,
                                        df = TRUE)
                             })
conf_and_cons_df <- do.call(what = rbind, args = conf_and_cons_list)
conf_and_cons_df