get2by2 {predictMe}    R Documentation

Return five common performance measures derived from the 2x2 cross-table (a.k.a. confusion matrix).

Description

Upon receiving two binary variables (only 0 and 1 permitted) of equal length, return sensitivity, specificity, positive predictive value, negative predictive value, and the base rate of the outcome.

Usage

get2by2(xr, measColumn = NULL, print2by2 = FALSE)

Arguments

xr

A data.frame with exactly two columns. One column must be the binary measured outcome; the other must be the binary predicted outcome, derived from some algorithm's predictions (see Details).

measColumn

A single integer (1 or 2) that denotes which of the two columns of the function argument 'xr' contains the measured outcome.

print2by2

Logical value, defaults to FALSE. If set to TRUE, two 2x2 matrices are printed, together with explanations of what they display.

Details

The r in the argument name 'xr' stands for response: the predicted probabilities must already have been converted to a binary outcome, usually with the default cutoff of 0.5, although any other cutoff between 0 and 1 may be used.
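As a minimal sketch of that conversion step, the base R code below dichotomizes predicted probabilities at a chosen cutoff and builds a two-column data.frame of the kind 'xr' expects. The data and the helper name dichotomize are hypothetical and not part of predictMe; the rule (probability at or above the cutoff becomes 1) mirrors the ifelse() call in the Examples.

```r
# Hypothetical predicted probabilities and measured outcomes (0/1).
probs    <- c(0.10, 0.35, 0.45, 0.55, 0.80, 0.95)
measured <- c(0, 0, 1, 0, 1, 1)

# Hypothetical helper (not part of predictMe): probabilities at or above
# the cutoff become 1, all others become 0.
dichotomize <- function(p, cutoff = 0.5) {
  stopifnot(cutoff > 0, cutoff < 1)
  as.integer(p >= cutoff)
}

# Default cutoff of 0.5.
xrDefault <- data.frame(measuredOutcome  = measured,
                        predictedOutcome = dichotomize(probs))

# A lower cutoff of 0.4 classifies more cases as positive.
xrLower <- data.frame(measuredOutcome  = measured,
                      predictedOutcome = dichotomize(probs, cutoff = 0.4))
```

Either data.frame could then be passed as 'xr', with measColumn = 1 because the measured outcome sits in the first column.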

To additionally print the two 2x2 matrices and have them included in the returned list, set the argument 'print2by2' to TRUE (default: FALSE).

Value

A list with five elements (seven if the argument 'print2by2' is set to TRUE; see Details):

  1. sens Sensitivity (a.k.a.: Recall, True Positive Rate).

  2. spec Specificity (a.k.a.: True Negative Rate).

  3. ppv Positive Predictive Value (a.k.a.: Precision).

  4. npv Negative Predictive Value.

  5. br Base rate of the outcome (mean outcome occurrence in the sample).

  6. tbl1 2x2 matrix. Test-theoretic perspective: Specificity in top left cell, sensitivity in bottom right cell.

  7. tbl2 2x2 matrix. Test-practical perspective (apply test in the real world): Negative predictive value (npv) in top left cell, positive predictive value (ppv) in bottom right cell.
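For orientation, the quantities listed above can be reproduced by hand in base R. This is only a sketch using the standard definitions, not the package's internal implementation. The toy data are hypothetical, and the table layout (rows = measured, columns = predicted) is an assumption that happens to yield the cell positions described for tbl1 and tbl2.

```r
# Hypothetical toy data: measured and predicted binary outcomes (0/1).
measured  <- c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0)
predicted <- c(1, 1, 0, 0, 0, 1, 0, 1, 0, 0)

# 2x2 cross-table; rows = measured, columns = predicted (assumed layout).
tab <- table(measured  = factor(measured,  levels = c(0, 1)),
             predicted = factor(predicted, levels = c(0, 1)))

tn <- tab["0", "0"]  # true negatives
fp <- tab["0", "1"]  # false positives
fn <- tab["1", "0"]  # false negatives
tp <- tab["1", "1"]  # true positives

res <- list(
  sens = tp / (tp + fn),   # sensitivity
  spec = tn / (tn + fp),   # specificity
  ppv  = tp / (tp + fp),   # positive predictive value
  npv  = tn / (tn + fn),   # negative predictive value
  br   = mean(measured)    # base rate of the outcome
)

# Row proportions: specificity in top left, sensitivity in bottom right.
tblRows <- prop.table(tab, margin = 1)
# Column proportions: npv in top left, ppv in bottom right.
tblCols <- prop.table(tab, margin = 2)
```

Normalizing by rows conditions on the measured outcome (the test-theoretic view), while normalizing by columns conditions on the prediction (the test-practical view).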

Author(s)

Marcel Miché

Examples

# Simulate data set with binary outcome
dfBinary <- quickSim(type="binary")
# Logistic regression, used as algorithm to predict the response variable
# (response = estimated probability of outcome being present).
glmRes <- glm(y~x1+x2,data=dfBinary,family="binomial")
# Extract measured outcome and the predicted probability (fitted values)
# from the logistic regression output, put both in a data.frame.
glmDf <- data.frame(measOutcome=dfBinary$y,
                    fitted=glmRes$fitted.values)
# Convert the fitted probabilities to a binary predicted outcome,
# based on the default probability threshold of 0.5.
get2by2Df <- data.frame(
    measuredOutcome=glmDf$measOutcome,
    predictedOutcome=ifelse(glmDf$fitted<.5, 0, 1))
# Request that both 2x2 matrices be part of the resulting list.
my2x2 <- get2by2(xr=get2by2Df, measColumn=1, print2by2 = TRUE)
# Display both 2x2 matrices
# tbl1: Theoretical perspective, with specificity in top left cell,
# sensitivity in bottom right cell.
my2x2$tbl1
# tbl2: Practical perspective, with negative predictive value (npv)
# in top left cell, positive predictive value (ppv) in bottom right
# cell.
my2x2$tbl2

[Package predictMe version 0.1 Index]