performance {BuyseTest} | R Documentation

## Assess Performance of a Classifier

### Description

Assess the performance, in terms of AUC (area under the ROC curve) and Brier score, of one or several binary classifiers. Currently limited to logistic regression and random forest models.
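For intuition, both metrics can be computed by hand for a fitted logistic model. The sketch below (base R only, an illustration rather than the package's internal code) estimates the AUC via the Mann-Whitney statistic and the Brier score as the mean squared difference between the outcome and the predicted probability:

```
set.seed(1)
n <- 200
X <- rnorm(n)
Y <- rbinom(n, size = 1, prob = plogis(X))           # logistic outcome model
fit <- glm(Y ~ X, family = binomial(link = "logit"))
p <- predict(fit, type = "response")                 # predicted probabilities

## AUC: probability that a random case receives a higher predicted
## probability than a random control (ties counted as 1/2)
case <- p[Y == 1]; control <- p[Y == 0]
auc <- mean(outer(case, control, ">") + 0.5 * outer(case, control, "=="))

## Brier score: mean squared distance between outcome and prediction
brier <- mean((Y - p)^2)
c(AUC = auc, Brier = brier)
```

`performance` additionally provides standard errors, confidence intervals, and cross-validated versions of such estimates.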

### Usage

```
performance(
  object,
  data = NULL,
  newdata = NA,
  individual.fit = FALSE,
  impute = "none",
  name.response = NULL,
  fold.size = 1/10,
  fold.repetition = 0,
  fold.balance = FALSE,
  null = c(brier = NA, AUC = 0.5),
  conf.level = 0.95,
  se = TRUE,
  transformation = TRUE,
  auc.type = "classical",
  simplify = TRUE,
  trace = TRUE,
  seed = NULL
)
```

### Arguments

`object`
a fitted binary classifier (logistic regression or random forest), or a named list of such models.

`data`
[data.frame] the training data.

`newdata`
[data.frame] an external dataset used to assess the performance.

`individual.fit`
[logical] if `TRUE`, the model is refit for each individual, using only the predictors with non-missing values for that individual.

`impute`
[character] in the presence of missing values among the regressors of the training dataset, should a complete-case analysis be performed (`"none"`, the default) or should the missing values be imputed?

`name.response`
[character] the name of the response variable (i.e. the variable containing the categories).

`fold.size`
[double, >0] either the number of observations in the test dataset (when >1) or the fraction of the dataset (when <1) used for testing in the cross-validation.
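The dual convention for `fold.size` (a count when >1, a fraction when <1) can be made concrete with a small helper; the helper name and the exact rounding are assumptions for illustration, not the package's internal code:

```
## hypothetical helper mirroring the documented convention
test.size <- function(fold.size, n) {
  if (fold.size > 1) fold.size else round(fold.size * n)
}
test.size(1/10, n = 100)  # fraction convention: 10 observations per test fold
test.size(25, n = 100)    # count convention: used as-is when > 1
```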

`fold.repetition`
[integer] when strictly positive, the number of folds used in the cross-validation. If 0, no cross-validation is performed.

`fold.balance`
[logical] should the outcome distribution within each fold of the cross-validation match the outcome distribution of the original dataset?
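A balanced split amounts to stratified fold assignment: fold labels are drawn separately within cases and controls, so each fold inherits the event proportion of the full dataset. A minimal sketch of such an assignment (an illustration, not the package's internal code):

```
set.seed(2)
Y <- rbinom(100, size = 1, prob = 0.3)   # imbalanced binary outcome
n.fold <- 10

## assign fold labels separately within each outcome level so every
## fold has (approximately) the same proportion of events
fold <- integer(length(Y))
for (g in unique(Y)) {
  idx <- which(Y == g)
  fold[idx] <- sample(rep(seq_len(n.fold), length.out = length(idx)))
}
table(fold, Y)   # event counts differ by at most 1 across folds
```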

`null`
[numeric vector of length 2] the right-hand side of the null hypothesis for each performance metric (Brier score and AUC).

`conf.level`
[numeric] confidence level for the confidence intervals.

`se`
[logical] should the uncertainty about the AUC/Brier score be computed? When `FALSE`, only point estimates are returned.

`transformation`
[logical] should the confidence intervals be computed on a transformed scale (e.g. logit scale) and then back-transformed? Otherwise they are computed without any transformation.

`auc.type`
[character] should the AUC be computed approximating the predicted probability by a Dirac distribution (`"classical"`, the usual AUC) or by a normal distribution?

`simplify`
[logical] should the number of folds and the fold size used for the cross-validation be removed from the output?

`trace`
[logical] should the execution of the function be traced?

`seed`
[integer, >0] random number generator (RNG) state used when starting the data splitting. If `NULL`, the RNG state is left unchanged.

### Value

An S3 object of class `performance`.

### References

LeDell E, Petersen M, van der Laan M. Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. Electron J Stat. 2015;9(1):1583-1607. doi:10.1214/15-EJS1035

### Examples

```
## Simulate data
set.seed(10)
n <- 100
df.train <- data.frame(Y = rbinom(n, prob = 0.5, size = 1), X1 = rnorm(n), X2 = rnorm(n))
df.test <- data.frame(Y = rbinom(n, prob = 0.5, size = 1), X1 = rnorm(n), X2 = rnorm(n))
## fit logistic model
e.null <- glm(Y~1, data = df.train, family = binomial(link="logit"))
e.logit1 <- glm(Y~X1, data = df.train, family = binomial(link="logit"))
e.logit2 <- glm(Y~X1+X2, data = df.train, family = binomial(link="logit"))
## assess performance on the training set (biased)
## and external dataset
performance(e.logit1, newdata = df.test)
e.perf <- performance(list(null = e.null, p1 = e.logit1, p2 = e.logit2),
                      newdata = df.test)
e.perf
summary(e.perf, order.model = c("null","p2","p1"))
## assess performance using cross validation
## Not run:
set.seed(10)
performance(e.logit1, fold.repetition = 10, se = FALSE)
set.seed(10)
performance(list(null = e.null, prop = e.logit1), fold.repetition = 10)
performance(e.logit1, fold.repetition = c(50,20,10))
## End(Not run)
```

*BuyseTest* version 3.0.4