predictMe {predictMe} | R Documentation |
Documentation of this predictMe package.
Description
This package enables researchers to visualize the prediction performance of an algorithm, either on the individual level or at an approximation of this level. Understanding the visualized result requires only familiarity with the concept of 'difference' (yes or no) and the related concept of 'distance' (if there is a difference, how large is it). The predictMe package can be applied to the output of any algorithm, provided that the measured (and therefore also the predicted) outcome is either continuous or binary.
Importantly, predictMe takes only the two relevant columns: the measured outcome and the predicted outcome. The values in both columns are transformed to range between 0 and 100 (see Details in the documentation of the functions binContinuous or binBinary) and are then returned as bins. The user sets the bin size with the function argument binWidth. The smaller the bins, the more bins are produced, and the more closely the visualized prediction performance approximates the individual level (see function makeTablePlot). Differences between measured and predicted outcome on the individual level can also be visualized (see function makeDiffPlot).
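The binning idea described above can be sketched in base R. This is not the package's implementation: the simulated data, the helper rescale100, and the choice of min-max scaling over the joint range of both columns are assumptions made for illustration only. The Details sections of binContinuous and binBinary describe the actual transformation.

```r
# Hypothetical data: a measured and a predicted continuous outcome.
set.seed(1)
measured  <- rnorm(200, mean = 50, sd = 10)
predicted <- measured + rnorm(200, sd = 5)

# Assumption: min-max rescaling to 0-100 over the joint range of both columns.
rescale100 <- function(v, lo, hi) (v - lo) / (hi - lo) * 100
lo <- min(measured, predicted)
hi <- max(measured, predicted)

# A bin width of 20 yields five bins; a smaller width yields more bins.
binWidth <- 20
breaks   <- seq(0, 100, by = binWidth)

measBin <- cut(rescale100(measured, lo, hi),  breaks = breaks, include.lowest = TRUE)
predBin <- cut(rescale100(predicted, lo, hi), breaks = breaks, include.lowest = TRUE)

# Cross-tabulate measured against predicted bins; counts on the diagonal
# are cases whose measured and predicted outcome fall into the same bin.
tab <- table(measured = measBin, predicted = predBin)
print(tab)
```

Reducing binWidth (for example to 5) produces a finer table, which is the sense in which smaller bins approximate the individual level.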
The predictMe package provides the transformed data (see functions binContinuous or binBinary) and the visualization (see functions makeTablePlot or makeDiffPlot). Nevertheless, the user is free to experiment with visualizing the results, which are returned in different formats (see the vignette of predictMe for a few examples of how the data may be visualized).
The predictMe package depends on two packages: ggplot2 (Wickham, 2016) for providing suggested visualizations, and reshape (Wickham, 2007) for providing the results in a format that is readily compatible with ggplot2 experimentation. The conventional format may also be used, which is compatible with base R plotting functions.
Importantly, the predictMe package was developed with the aim of making both its use and the comprehension of its output extremely easy. This, I hope, may make the package powerful in the sense of actually being used. The first four of the six references (see below) each contain parts of the intended usefulness of this package (see Note below). The actual idea for this package came while trying to achieve something specific with the ggplot2 package (Wickham, 2016).
Note
These are the parts of the first four references below that pertain to the intended usefulness of the predictMe package:
Altman and Royston (2000) provide this introductory quote (by Alvan Feinstein): 'Validation is one of those words ... that is constantly used and seldom defined.' This surely is strange in the context of developing prognostic models, especially in the machine learning age, unless the statement was meant as a joke (which appears not to be the case), or is no longer valid in 2022 (which might be true or false, who knows).
Bickel and Lehmann (2012): If two different people provide the exact same relevant input data to an algorithm that computes the risk percentage of some adverse outcome, say complications due to an operation, they will receive the exact same risk estimate, e.g., 1 percent. However, the two individuals may understand this number very differently, depending on their individual inclinations in general and/or at that moment. Therefore, one of them may simply agree to the operation, while the other may ask for more detailed information. This more detailed information can be computed with the predictMe functions binContinuous or binBinary, and visualized with the predictMe function makeDiffPlot. The differences can be colorized with the function makeDiffPlotColor, which may help in seeing how far an individual's prediction is from being perfect (no difference between measured and predicted outcome). Even though perfect prediction is practically utopian, it still might be relevant to the individual whether his or her predictions are closer to this utopian reference than the predictions of all individuals whose data were used to develop the model that underlies this algorithm's individual predictions.
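The notion of an individual's distance from perfect prediction can be sketched in base R as the difference between the measured and the predicted bin. This sketch rests on assumptions: the data are simulated, the outcomes are taken to be already scaled to 0-100, and the sign convention (measured minus predicted) is chosen arbitrarily. The output of makeDiffPlot may be defined differently.

```r
# Hypothetical outcomes, assumed to be already scaled to the range 0-100.
set.seed(2)
measured100  <- runif(50, min = 0, max = 100)
predicted100 <- pmin(pmax(measured100 + rnorm(50, sd = 15), 0), 100)

binWidth <- 20
breaks   <- seq(0, 100, by = binWidth)

# Convert each value to its bin index (1 = lowest bin, 5 = highest bin).
measBin <- as.integer(cut(measured100,  breaks = breaks, include.lowest = TRUE))
predBin <- as.integer(cut(predicted100, breaks = breaks, include.lowest = TRUE))

# Difference in bins per individual: 0 means measured and predicted outcome
# fall into the same bin; larger absolute values mean larger distances.
binDiff <- measBin - predBin

# How many individuals lie at each distance from perfect prediction.
print(table(binDiff))
```

An individual can then compare his or her own value of binDiff with this distribution across all individuals.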
Assel et al. (2017): In line with Altman and Royston (2000), Assel et al. (2017) recommend clarifying whether a published prediction model is at an early stage of development or whether it approaches an advanced stage, maybe even suggesting implementation in the real world. In the latter case, much stricter performance criteria must be met than in the former case (early stage of model development), because actual individuals in the real world are the supposed beneficiaries of the algorithmic decision support.
Offord and Kraemer (2000): In line with Altman and Royston (2000), Offord and Kraemer (2000) emphasize that a risk factor must in any case demonstrate that it can accurately split a group into individuals with low risk and individuals with high risk. In the real world, this requires much more than meeting statistical significance criteria or other (similarly thin) model fit criteria. Again, if model development is at an early stage (see Assel et al., 2017), such criteria may suffice. However, at later stages, real-world criteria must be met, that is, real-world relevant results must either replace or at least complement the commonly reported results of prediction performance.
Conclusion: The predictMe package offers a way to provide some real-world relevant 'results', if visualized individual prediction performance may be considered as 'results'.
References
Altman DG, Royston P (2000). “What do we mean by validating a prognostic model?” Statistics in Medicine, 19(4), 453–473.
Assel M, Sjoberg DD, Vickers AJ (2017). “The Brier score does not evaluate the clinical utility of diagnostic tests or prediction models.” Diagnostic and Prognostic Research, 1(1), 1–7.
Bickel PJ, Lehmann EL (2012). “Frequentist interpretation of probability.” In Selected Works of EL Lehmann, 1083–1085. Springer.
Offord DR, Kraemer HC (2000). “Risk factors and prevention.” Evidence-Based Mental Health, 3(3), 70–71.
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Wickham H (2007). “Reshaping Data with the reshape Package.” Journal of Statistical Software, 21(12), 1–20. https://www.jstatsoft.org/v21/i12/.