optimal.cutpoints {OptimalCutpoints} | R Documentation |
Computing Optimal Cutpoints in diagnostic tests
Description
optimal.cutpoints calculates optimal cutpoints in diagnostic tests. Several methods or criteria for selecting optimal cutoffs have been implemented, including methods based on cost-benefit analysis and diagnostic test accuracy measures (Sensitivity/Specificity, Predictive Values and Diagnostic Likelihood Ratios) or prevalence (Lopez-Raton et al. 2014).
Usage
optimal.cutpoints(X, ...)
## Default S3 method:
optimal.cutpoints(X, status, tag.healthy, methods, data, direction = c("<", ">"),
categorical.cov = NULL, pop.prev = NULL, control = control.cutpoints(),
ci.fit = FALSE, conf.level = 0.95, trace = FALSE, ...)
## S3 method for class 'formula'
optimal.cutpoints(X, tag.healthy, methods, data, direction = c("<", ">"),
categorical.cov = NULL, pop.prev = NULL, control = control.cutpoints(),
ci.fit = FALSE, conf.level = 0.95, trace = FALSE, ...)
Arguments
X |
either a character string with the name of the diagnostic test variable (then method 'optimal.cutpoints.default' is called), or a formula (then method 'optimal.cutpoints.formula' is called). When 'X' is a formula, it must be an object of class "formula". Right side of ~ must contain the name of the variable that distinguishes healthy from diseased individuals, and left side of ~ must contain the name of the diagnostic test variable. |
status |
a character string with the name of the variable that distinguishes healthy from diseased individuals. Only applies for the method 'optimal.cutpoints.default'). |
tag.healthy |
the value codifying healthy individuals in the |
methods |
a character vector selecting the method(s) to be used: "CB" (cost-benefit method); "MCT" (minimizes Misclassification Cost Term); "MinValueSp" (a minimum value set for Specificity); "MinValueSe" (a minimum value set for Sensitivity); "ValueSe" (a value set for Sensitivity); "MinValueSpSe" (a minimum value set for Specificity and Sensitivity); "MaxSp" (maximizes Specificity); "MaxSe" (maximizes Sensitivity);"MaxSpSe" (maximizes Sensitivity and Specificity simultaneously); "MaxProdSpSe" (maximizes the product of Sensitivity and Specificity or Accuracy Area); "ROC01" (minimizes distance between ROC plot and point (0,1)); "SpEqualSe" (Sensitivity = Specificity); "Youden" (Youden Index); "MaxEfficiency" (maximizes Efficiency or Accuracy, similar to minimize Error Rate); "Minimax" (minimizes the most frequent error); "MaxDOR" (maximizes Diagnostic Odds Ratio); "MaxKappa" (maximizes Kappa Index); "MinValueNPV" (a minimum value set for Negative Predictive Value); "MinValuePPV" (a minimum value set for Positive Predictive Value); "ValueNPV" (a value set for Negative Predictive Value);"ValuePPV" (a value set for Positive Predictive Value);"MinValueNPVPPV" (a minimum value set for Predictive Values); "PROC01" (minimizes distance between PROC plot and point (0,1)); "NPVEqualPPV" (Negative Predictive Value = Positive Predictive Value); "MaxNPVPPV" (maximizes Positive Predictive Value and Negative Predictive Value simultaneously); "MaxSumNPVPPV" (maximizes the sum of the Predictive Values); "MaxProdNPVPPV" (maximizes the product of Predictive Values); "ValueDLR.Negative" (a value set for Negative Diagnostic Likelihood Ratio); "ValueDLR.Positive" (a value set for Positive Diagnostic Likelihood Ratio); "MinPvalue" (minimizes p-value associated with the statistical Chi-squared test which measures the association between the marker and the binary result obtained on using the cutpoint); "ObservedPrev" (The closest value to observed prevalence); "MeanPrev" (The closest value to the mean of the diagnostic test values); or "PrevalenceMatching" (The value for which predicted prevalence is practically equal to observed prevalence). |
data |
a data frame containing all needed variables. |
direction |
character string specifying the direction to compute the ROC curve. By default individuals with a test value lower than the cutoff are classified as healthy (negative test), whereas patients with a test value greater than (or equal to) the cutoff are classified as diseased (positive test). If this is not the case, however, and the high values are related to health, this argument should be established at ">". |
categorical.cov |
a character string with the name of the categorical covariate according to which optimal cutpoints are to be calculated. The default is NULL (no categorical covariate). |
pop.prev |
the value of the disease's prevalence. The default is NULL (prevalence is estimated on the basis of sample prevalence). It can be a vector indicating the prevalence values for each categorical covariate level. |
control |
output of the |
ci.fit |
a logical value. If TRUE, inference is performed on the accuracy measures at the optimal cutpoint. The default is FALSE. |
conf.level |
a numerical value with the confidence level for the construction of the confidence intervals. The default value is 0.95. |
trace |
a logical value. If TRUE, information on progress is shown. The default is FALSE. |
... |
further arguments passed to or from other methods. None are used in this method. |
Details
Continuous biomarkers or diagnostic tests are often used to discriminate between diseased and healthy populations. In clinical practice, it is necessary to select a cutpoint or discrimination value c which defines positive and negative test results. Several methods for selecting optimal cutpoints in diagnostic tests have been proposed in the literature depending on the underlying reason for this choice. In this package, thirty-two criteria are available.
Before describing the methods in detail, mention should be made of the following notation: C_{FP}
, C_{TN}
, C_{FN}
and C_{TP}
are the costs of False Positive, True Negative, False Negative and True Positive decisions, respectively; p
is disease prevalence; Se
is Sensitivity; and Sp
is Specificity.
"CB"
: Criterion based on cost-benefit methodology by means of calculating the slope of the ROC curve at the optimal cutoff as
S=\frac{1-p}{p}CR=\frac{1-p}{p}\frac{C_{FP}-C_{TN}}{C_{FN}-C_{TP}}
(McNeill et al. 1975; Metz et al. 1975; Metz 1978). This method thus weighs the relative costs of the different predictions in the diagnosis. By default, the costs ratio is 1, and this is the costs.ratio
argument in the control.cutpoints
function.
"MCT"
: Criterion based on the minimization of the Misclassification Cost Term (MCT) defined as
MCT(c)=\frac{C_{FN}}{C_{FP}}p(1-Se(c))+(1-p)(1-Sp(c))
(Smith 1991; Greiner 1995,1996). By default, C_{FN}=C_{FP} =
1, and these are the CFN
and CFP
arguments in the control.cutpoints
function.
"MinValueSp"
: Criterion based on setting a minimum value for Specificity and maximizing Sensitivity, subject to this condition (Shaefer 1989; Vermont et al. 1991; Gallop et al. 2003). Hence, in a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Sensitivity are chosen. If several cutpoints still remain, those yielding the greatest Specificity are chosen. By default, the minimum value for Specificity is 0.85, and this is the valueSp
argument in the control.cutpoints
function.
"MinValueSe"
: Criterion based on setting a minimum value for Sensitivity and maximizing Specificity, subject to this condition (Shaefer 1989; Vermont et al. 1991; Gallop et al. 2003). Hence, in a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Specificity are chosen. If several cutpoints still remain, those yielding the greatest Sensitivity are chosen. By default, the minimum value for Sensitivity is 0.85, and this is the valueSe
argument in the control.cutpoints
function.
"ValueSp"
: Criterion based on setting a particular value for Specificity (Rutter and Miglioretti 2003). In a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Sensitivity are chosen.
"ValueSe"
: Criterion based on setting a particular value for Sensitivity (Rutter and Miglioretti 2003). In a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Specificity are chosen.
"MinValueSpSe"
: Criterion based on setting minimum values for Sensitivity and Specificity measures (Shaefer 1989). In a case where there is more than one cutpoint fulfilling these conditions, those which yield maximum Sensitivity or maximum Specificity are chosen. The user can select one of these two options by means of the maxSp
argument in the control.cutpoints
function. If TRUE (the default value), the cutpoint/s yielding maximum Specificity is/are computed. If there are still several cutpoints which maximize the chosen measure, those which also maximize the other measure are chosen.
"MaxSp"
: Criterion based on maximization of Specificity (Bortheiry et al. 1994; Hoffman et al. 2000; Alvarez-Garcia et al. 2003). If there is more than one cutpoint fulfilling this condition, those which yield maximum Sensitivity are chosen.
"MaxSe"
: Criterion based on maximization of Sensitivity (Filella et al. 1995; Hoffman et al. 2000; Alvarez-Garcia et al. 2003). If there is more than one cutpoint fulfilling this condition, those which yield maximum Specificity are chosen.
"MaxSpSe"
: Criterion based on simultaneously maximizing Sensitivity and Specificity (Riddle and Stratford 1999; Gallop et al. 2003).
"MaxProdSpSe"
: Criterion based on maximizing the product of Specificity and Sensitivity (Lewis et al. 2008).
This criterion is the same as the method based on maximization of the Accuracy Area (Greiner 1995, 1996) defined as
AA(c)=frac{TP(c)TN(c)}{(TP(c)+FN(c))(FP(c)+TN(c))}
where TP
, TN
, FN
and FP
are the number of True Positives, True Negatives, False Negatives and False Positives classifications, respectively.
"ROC01"
: Criterion of the point on the ROC curve closest to the point (0,1), i.e, upper left corner of the unit square (Metz 1978; Vermont et al. 1991).
"SpEqualSe"
: Criterion based on the equality of Sensitivity and Specificity (Greiner et al. 1995; Hosmer and Lemeshow 2000; Peng and So 2002). Since Specificity may not be exactly equal to Sensitivity, the absolute value of the difference between them is minimized.
"Youden"
: Criterion based on Youden's Index (Youden 1950; Aoki et al. 1997; Shapiro 1999; Greiner et al. 2000) defined as YI(c)=max_{c}(Se(c)+Sp(c)-1)
. This is identical (from an optimization point of view) to the method that maximizes the sum of Sensitivity and Specificity (Albert and Harris 1987; Zweig and Campbell 1993) and to the criterion that maximizes concordance, wich is a monotone function of the AUC, defined as
AUC(c)=\frac{Se(c)-(1-Sp(c))+1}{2}
(Begg et al. 2000; Gonen and Sima 2008).
Costs of misclassifications can be considered in this criterion and for using the Generalized Youden Index: GYI(c)=max_{c}(Se(c)+rSp(c)-1
(Geisser 1998; Greiner et al. 2000; Schisterman et al. 2005), where
r=\frac{1-p}{p}\frac{C_{FN}}{C_{FP}}
. If the generalized.Youden
argument in the control.cutpoints
function is TRUE, Generalized Youden Index is computed. The default is FALSE. The CFN
and CFP
arguments in the control.cutpoints
function indicate the cost values, and by default, C_{FN}=C_{FP}=
1.
Moreover, the optimal cutpoint based on Youden's Index can be computed by means of cost-benefit methodology (see "CB" method), with the slope of the ROC curve at the optimal cutoff being S=1
for the Youden Index and S=\frac{1-p}{p}\frac{C_{FN}}{C_{FP}}
for the Generalized Youden Index. If the costs.benefits.Youden
argument in the control.cutpoints
function is TRUE, the optimal cutpoint based on cost-benefit methodology is computed. By default, it is FALSE.
"MaxEfficiency"
: Criterion based on maximization of the Efficiency, Accuracy, Validity Index or percentage of cases correctly classified defined as Ef(c)=pSe(c)+(1-p)Sp(c)
(Feinstein 1975; Galen 1986; Greiner 1995, 1996).
This criterion is similar to the criterion based on minimization of the Misclassification Rate which measures the error in cases where diseased and disease-free patients are misdiagnosed (Metz 1978). It is defined as ER(c)= p(1-Se(c))+(1-p)(1-Sp(c))
.
Moreover, the optimal cutpoint based on this method can be computed by means of cost-benefit methodology (see "CB" method), with the slope of the ROC curve at the optimal cutoff being S=\frac{1-p}{p}
. If the costs.benefits.Efficiency
argument in the control.cutpoints
function is TRUE, the optimal cut-point based on cost-benefit methodology is computed. By default, it is FALSE.
"Minimax"
: Criterion based on minimization of the most frequent error (Hand 1987): min_{c}(max(p(1-Se(c)),(1-p)(1-Sp(c))))
. In a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Sensitivity or maximum Specificity are chosen. The user can select one of these two options by means of the maxSp
argument in the control.cutpoints
function. If TRUE (the default value), the cutpoint/s yielding maximum Specificity is/are computed. If there are still several cutpoints which maximize the chosen measure, those which also maximize the other measure are chosen.
"MaxDOR"
: Criterion based on maximizating the Diagnostic Odds Ratio (DOR), defined as
DOR(c)=\frac{Se(c)}{(1-Se(c))}\frac{Sp(c)}{(1-Sp(c))}
(Kraemer 1992; Greiner et al. 2000; Boehning et al. 2011).
"MaxKappa"
: Criterion based on maximization of the Kappa Index (Cohen 1960; Greiner et al. 2000). Kappa makes full use of the information in the confusion matrix to assess the improvement over chance prediction. Costs of misclassifications can be considered in this criterion and for using the Weighted Kappa Index (Kraemer 1992; Kraemer et al. 2002) defined as
PK(c)=\frac{p(1-p)(Se(c)+Sp(c)-1)}{p(p(1-Se(c))+(1-p)Sp(c))r+(1-p)(pSe(c)+(1-p)(1-Sp(c)))(1-r)}
where
r=\frac{C_{FP}}{ C_{FP}+ C_{FN}}
. If the weighted.Kappa
argument in the control.cutpoints
function is TRUE, the Weighted Kappa Index is computed. The default value is FALSE. The CFN
and CFP
arguments in the control.cutpoints
function indicate the cost values, and by default, C_{FP}=C_{FN}
= 1.
"MinValueNPV"
: Criterion based on setting a minimum value for Negative Predictive Value (Vermont et al. 1991). In a case where there is more than one cutpoint fulfilling this condition, those which yield the maximum Positive Predictive Value are chosen. If several cutpoints still remain, those yielding the highest Negative Predictive Value are chosen. By default, the minimum value for Negative Predictive Value is 0.85 and this is the valueNPV
argument in the control.cutpoints()
function.
"MinValuePPV"
: Criterion based on setting a minimum value for Positive Predictive Value (Vermont et al. 1991). In a case where there is more than one cutpoint fulfilling this condition, those which yield the maximum Negative Predictive Value are chosen. If several cutpoints still remain, those yielding the highest Positive Predictive Value are chosen. By default, the minimum value for Positive Predictive Value is 0.85, and this is specified by the valuePPV
argument in the control.cutpoints()
function.
"ValueNPV"
: Criterion based on setting a particular value for Negative Predictive Value. In a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Positive Predictive Value are chosen.
"ValuePPV"
: Criterion based on setting a particular value for Positive Predictive Value. In a case where there is more than one cutpoint fulfilling this condition, those which yield maximum Negative Predictive Value are chosen.
"MinValueNPVPPV"
: Criterion based on setting minimum values for Predictive Values (Vermont et al. 1991). In a case where there is more than one cutpoint fulfilling these conditions, those which yield the maximum Negative or maximum Positive Predictive Value are chosen. The user can select one of these two options by means of the maxNPV
argument in the control.cutpoints
function. If TRUE (the default value), the cutpoint/s yielding maximum Negative Predictive Value is/are computed. If there are still several cutpoints which maximize the chosen measure, those which also maximize the other measure are chosen.
"PROC01"
: Criterion of the point on the PROC curve closest to the point (0,1), i.e., upper left corner of the unit square (Vermont et al. 1991; Gallop et al. 2003).
"NPVEqualPPV"
: Criterion based on the equality of Predictive Values (Vermont et al. 1991). Since the Positive Predictive Value may not be exactly equal to the Negative Predictive Value, the absolute value of the difference between them is minimized.
"MaxNPVPPV"
: Criterion based on simultaneously maximizing Positive Predictive Value and Negative Predictive Value.
"MaxSumNPVPPV"
: Criterion based on maximizing the sum of Positive Predictive Value and Negative Predictive Value.
"MaxProdNPVPPV"
: Criterion based on maximizing the product of Positive Predictive Value and Negative Predictive Value.
"ValueDLR.Negative"
: Criterion based on setting a particular value for the Negative Diagnostic Likelihood Ratio (Boyko 1994; Rutter and Miglioretti 2003). The default value is 0.5, and it is specified by the valueDLR.Negative
argument in the control.cutpoints
function.
"ValueDLR.Positive"
: Criterion based on setting a particular value for the Positive Diagnostic Likelihood Ratio (Boyko 1994; Rutter and Miglioretti 2003). The default value is 2, and it is specified by the valueDLR.Positive
argument in the control.cutpoints
function.
"MinPvalue"
: Criterion based on the minimum p-value associated with the statistical Chi-squared test which measures the association between the marker and the binary result obtained on using the cutpoint (Miller and Siegmund 1982; Lausen and Schumacher 1992; Altman et al. 1994; Mazumdar and Glasman 2000).
"ObservedPrev"
: Criterion based on setting the closest value to observed prevalence, i.e., c/max_{c}{|c-p|}
, with p being prevalence estimated from the sample. This criterion is thus indicated/valid in cases where the diagnostic test takes values in the interval (0,1), and it is a useful method in cases where preserving prevalence is of prime importance (Manel et al. 2001).
"MeanPrev"
: Criterion based on setting the closest value to the mean of the diagnostic test values. This criterion is usually used in cases where the diagnostic test takes values in the interval (0,1), i.e., the mean probability of ocurrence, e.g., based on the results of a statistical model(Manel et al. 2001; Kelly et al. 2008).
"PrevalenceMatching"
: Criterion based on the equality of sample and predicted prevalence: pSe(c)+(1-p)(1-Sp(c))
where p
is the prevalence estimated from the sample (Manel et al. 2001; Kelly et al. 2008). This criterion is usually used in cases where the diagnostic test takes values in the interval (0,1), i.e., the predicted probability, e.g., based on a statistical model.
Value
Returns an object of class "optimal.cutpoints" with the following components:
methods |
a character vector with the value of the |
levels.cat |
a character vector indicating the levels of the categorical covariate if the |
call |
the matched call. |
data |
the data frame with the variables used in the call. |
For each of the methods used in the call, a list with the following components is obtained:
"measures.acc" |
a list with all possible cutoffs, their associated accuracy measures (Sensitivity, Specificity, Predictive Values, Diagnostic Likelihood Ratios and Area under ROC Curve, AUC), the prevalence and the sample size for both healthy and diseased populations. |
"optimal.cutoff" |
a list with the optimal cutoff(s) and its/their associated accuracy measures (Sensitivity, Specificity, Predictive Values, Diagnostic Likelihood Ratios and the number of False Positive and False Negative decisions). |
The following components only appear in some methods:
"criterion" |
the value of the method considered for selecting the optimal cutpoint for each cutoff. |
"optimal.criterion" |
the optimal value of the method considered for selecting the optimal cutpoint, i.e., the value of the criterion at the optimal cutpoint. |
Author(s)
Monica Lopez-Raton and Maria Xose Rodriguez-Alvarez
References
Albert, A. and Harris, E.K. (1987). Multivariate Interpretation of Clinical Laboratory Data. Marcel Dekker, New York, NY.
Altman, D.G., Lausen, B., Sauerbrei, W. and Schumacher, M. (1994). Dangers of using "optimal" cutpoints in the evaluation of prognostic factors. Journal of the National Cancer Institute 86 (11), 829–835.
Alvarez-Garcia, G. et al. (2003). Influence of age and purpose for testing on the cut-off selection of serological methods in bovine neosporosis. Veterinary Research 34, 341–352.
Aoki, K., Misumi, J., Kimura, T., Zhao, W. and Xie, T. (1997). Evaluation of cutoff levels for screening of gastric cancer using serum pepsinogens and distributions of levels of serum pepsinogens I, II and Of PG I/PG II ratios in a gastric cancer case-control study. Journal of Epidemiology 7, 143–151.
Begg, C.B., Cramer, L.D., Venkatraman, E.S. and Rosai, J. (2000). Comparing tumour staging and grading systems: a case study and a review of the issues, using thymoma as a model. Statistics in Medicine 19, 1997–2014.
Boehning, D., Holling, H. and Patilea, V. (2011). A limitation of the diagnostic-odds ratio in determining an optimal cut-off value for a continuous diagnostic test. Statistical Methods in Medical Research, 20(5), 541–550.
Bortheiry, A.L., Malerbi, D.A. and Franco, L.J. (1994). The ROC curve in the evaluation of fasting capillary blood glucose as a screening test for diabetes y IGT. Diabetes Care 17, 1269–1272.
Boyko, E.J. (1994). Ruling out or ruling in disease with the most sensitive or specific diagnostic test: short cut or wrong turn?. Medical Decision Making 14, 175–179.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educ Psychol Meas 20, 37–46.
Feinstein, S.H. (1975). The accuracy of diver sound localization by pointing. Undersea. Biomed.Res 2(3), 173–184.
Filella, X., Alcover, J., Molina, R. et al. (1995). Clinical usefulness of free PSA fraction as an indicator of prostate cancer. Int. J. Cancer 63, 780–784.
Galen, R.S. (1986). Use of predictive value theory in clinical immunology. In: N.R.Rose, H. Friedmann and J.L. Fahey (Eds.), Manual of Clinical Laboratory Immunology. American Society of Microbiology. Washington, DC, pp 966-970.
Gallop, R.J., Crits-Christoph, P., Muenz, L.R. and Tu, X.M. (2003). Determination and Interpretation of the Optimal Operating Point for ROC Curves Derived Through Generalized Linear Models. Understanding Statistics 2(4), 219–242.
Geisser, S. (1998). Comparing two tests used for diagnostic or screening processes. Statistics Probability Letters 40, 113–119.
Gonen, M. and Sima, C. (2008). Optimal cutpoint estimation with censored data. Memorial Sloan-Kettering Cancer Center Department of Epidemiology and Biostatistics Working Paper Series.
Greiner, M. (1995). Two-graph receiver operating characteristic (TG-ROC): a Microsoft-EXCEL template for the selection of cut-off values in diagnostic tests. Journal of Immunological Methods 185(1),145–146.
Greiner, M. (1996). Two-graph receiver operating characteristic (TG-ROC): update version supports optimisation of cut-off values that minimise overall misclassification costs. J. Immunol. Methods 191, 93–94.
Greiner, M., Pfeiffer, D. and Smith, R.D. (2000). Principals and practical application of the receiver operating characteristic analysis for diagnostic tests. Preventive Veterinary Medicine 45, 23–41.
Hand, D. (1987). Screening vs Prevalence Estimation. Applied Statistics 36, 1–7.
Hoffman, R.M., Clanon, D.L., Littenberg, B., Frank, J.J. and Peirce, J.C. (2000). Using the Free-to-total Prostate-specific Antigen Ratio to Detect Prostate Cancer in Men with Nonspecific Elevations of Prostate-specific Antigen Levels. J. Gen. Intern Med 15, 739–748.
Hosmer, D.W. and Lemeshow, S. (2000). Applied Logistic Regression. Wiley-Interscience, New York, USA.
Kelly, M.J., Dunstan, F.D., Lloyd, K. and Fone, D.L. (2008). Evaluating cutpoints for the MHI-5 and MCS using the GHQ-12: a comparison of five different methods. BMC Psychiatry 8, 10.
Kraemer, H.C. (1992). Risk ratios, odds ratio, and the test QROC. In: Evaluating medical tests. Newbury Park, CA: SAGE Publications, Inc.; pp 103–113.
Kraemer, H.C., Periyakoil, V.S. and Noda, A. (2002). Kappa coefficients in medical research. Statistics in Medicine 21, 2109–2129.
Lausen, B. and Schumacher, M. (1992). Maximally selected rank statistics. Biometrics 48, 73–85.
Lewis, J.D., Chuai, S., Nessel, L., Lichtenstein, G.R., Aberra, F.N. and Ellenberg, J.H. (2008). Use of the Noninvasive Components of the Mayo Score to Assess Clinical Response in Ulcerative Colitis. Inflamm Bowel Dis 14(12), 1660–1666.
Lopez-Raton, M., Rodriguez-Alvarez, M.X, Cadarso-Suarez, C. and Gude-Sampedro, F. (2014). OptimalCutpoints: An R Package for Selecting Optimal Cutpoints in Diagnostic Tests. Journal of Statistical Software 61(8), 1–36. doi: 10.18637/jss.v061.i08.
Manel, S., Williams, H. and Ormerod, S. (2001). Evaluating Presence-Absence Models in Ecology: the Need to Account for Prevalence. Journal of Applied Ecology 38, 921–931.
Mazumdar, M. and Glassman, J.R. (2000). Categorizing a prognostic variable: review of methods, code for easy implementation and applications to decision-making about cancer treatments. Statistics in Medicine 19, 113–132.
McNeill, B.J., Keeler, E. and Adelstein, S.J. (1975). Primer on certain elements of medical decision making, with comments on analysis ROC. N. Engl. J Med 293, 211–215.
Metz, C.E., Starr, S.J., Lusted, L.B. and Rossmann, K. (1975). Progress in evaluation of human observer visual detection performance using the ROC curve approach. In: Raynaud C, Todd-Pokropek AE eds. Information processing in scintigraphy. Orsay, France: CEA, 420–436.
Metz, CE. (1978). Basic principles of ROC analysis. Seminars Nucl. Med. 8, 283–298.
Miller, R. and Siegmund, D. (1982). Maximally selected chi square statistics. Biometrics 38, 1011–1016.
Navarro, J.B., Domenech, J.M., de la Osa, N. and Ezpeleta, L. (1998). El analisis de curvas ROC en estudios epidemiologicos de psicopatologia infantil: aplicacion al cuestionario CBCL. Anuario de Psicologia 29 (1), 3–15.
Peng, C.Y.J. and So, T.S.H. (2002). Logistic Regression Analysis and Reporting: A Primer. Understanding Statistics 1(1), 31–70.
Riddle, D.L. and Stratford, P.W. (1999). Interpreting validity indexes for diagnostic tests: an illustration using the Berg Balance Test. Physical Therapy 79, 939–950.
Rutter, C.M. and Miglioretti, D.L. (2003). Estimating the accuracy of psychological scales using longitudinal data. Biostatistics 4(1), 97–107.
Shaefer, H. (1989). Constructing a cut-off point for a quantitative diagnostic test. Statistics in Medicine 8, 1381–1391.
Schisterman, E.F., Perkins, N.J., Liu, A. and Bondell, H. (2005). Optimal cutpoint and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology 16, 73–81.
Shapiro, D.E. (1999). The interpretation of diagnostic tests. Statistical Methods in Medical Research 8, 113–134.
Smith, R.D. (1991). Evaluation of diagnostic tests. In: R.D. Smith (Ed.), Veterinary Clinical Epidemiology. Butter-worth-Heinemann. Stoneham, pp 29–43.
Vermont J, Bosson JL, Francois P, Robert C, Rueff A, Demongeot J. (1991). Strategies for graphical threshold determination. Computer Methods and Programs in Biomedicine 35, 141–150.
Youden, W.J. (1950). Index for rating diagnostic tests. Cancer 3, 32–35.
Zweig, M.H., Campbell, G. (1993). Receiver-operating characteristics (ROC) plots: a fundamental evaluation tool in clinical medicine. Clinical Chemistry 39, 561–577.
See Also
control.cutpoints
, summary.optimal.cutpoints
Examples
library(OptimalCutpoints)
data(elas)
####################
# marker: elas
# status: status
# categorical covariates:
# gender
####################
###########################################################
# Youden Index Method ("Youden"): Covariate gender
###########################################################
# Defaut method
optimal.cutpoint.Youden <- optimal.cutpoints(X = "elas", status = "status", tag.healthy = 0,
methods = "Youden", data = elas, pop.prev = NULL, categorical.cov = "gender",
control = control.cutpoints(), ci.fit = FALSE, conf.level = 0.95, trace = FALSE)
summary(optimal.cutpoint.Youden)
plot(optimal.cutpoint.Youden)
# Formula method
optimal.cutpoint.Youden <- optimal.cutpoints(X = elas ~ status, tag.healthy = 0,
methods = "Youden", data = elas, pop.prev = NULL, categorical.cov = "gender",
control = control.cutpoints(), ci.fit = FALSE, conf.level = 0.95, trace = FALSE)
# Inference on the test accuracy measures
optimal.cutpoint.Youden <- optimal.cutpoints(X = "elas", status = "status", tag.healthy = 0,
methods = "Youden", data = elas, pop.prev = NULL, categorical.cov = "gender",
control = control.cutpoints(), ci.fit = TRUE, conf.level = 0.95, trace = FALSE)
summary(optimal.cutpoint.Youden)
##########################################################################
# Sensitivity equal to Specificity Method ("SpEqualSe"): Covariate gender
##########################################################################
optimal.cutpoint.SpEqualSe <- optimal.cutpoints(X = "elas", status = "status", tag.healthy = 0,
methods = "SpEqualSe", data = elas, pop.prev = NULL, categorical.cov = "gender",
control = control.cutpoints(), ci.fit = TRUE, conf.level = 0.95, trace = FALSE)
summary(optimal.cutpoint.SpEqualSe)
plot(optimal.cutpoint.SpEqualSe)