R: Build a ROC curve for a multivariate marker with dimension p

multiROC {movieROC}

R Documentation

Build a ROC curve for a multivariate marker with dimension `p`

Description

This is one of the main functions of the movieROC package. It builds a multivariate ROC curve by one of the following methods: i) fitting a binary logistic regression model with a particular combination (fixed by the user) of the marker components on the right-hand side; ii) linear combinations with fixed parameters; iii) linear combinations with dynamic parameters; iv) estimating the optimal transformation based on kernel density estimation; or v) quadratic combinations with fixed parameters (only if p = 2). It returns a ‘multiroc’ object, a list of class ‘multiroc’. This object can be printed or plotted. It may also be passed to the plot_buildROC() and movieROC() functions.

Usage

multiROC(X, D, ...)
## Default S3 method:
multiROC(X, D, 
    method = c("lrm", "fixedLinear", "fixedQuadratic", "dynamicEmpirical", 
                "dynamicMeisner", "kernelOptimal"),
    formula.lrm = "D ~ X.1 + I(X.1^2) + X.2 + I(X.2^2) + I(X.1*X.2)",
    stepModel = TRUE, 
    methodLinear = c("coefLinear", "SuLiu", "PepeThompson", "logistic", "minmax"), 
    coefLinear = rep(1,ncol(X)), coefQuadratic = c(1,1,0,1,1), 
    K = 201, alpha = 0.5, approxh = 0.5, multiplier = 2, 
    kernelOptimal.H = c("Hbcv", "Hscv", "Hpi", "Hns", "Hlscv", "Hbcv.diag", 
                        "Hscv.diag", "Hpi.diag", "Hlscv.diag"), 
    eps = sqrt(.Machine$double.eps), verbose = FALSE, ...)

Arguments

`X`	Matrix (dimension `n \times p`) of marker values where `n` is the sample size and `p` is the dimension of the multivariate marker.
`D`	Vector of response values. Two levels; if there are more, only the first two are used.
`method`	Method used to build the classification regions. One of `"lrm"` (fitting a binary logistic regression model with the formula given in `formula.lrm`), `"fixedLinear"` (linear frontiers with fixed parameters given in `coefLinear` or estimated by the method in `methodLinear`), `"fixedQuadratic"` (quadratic frontiers with fixed parameters given in `coefQuadratic`, only available for `p=2`), `"dynamicMeisner"` (linear frontiers with dynamic parameters estimated by the method of Meisner et al. (2021)), `"dynamicEmpirical"` (linear frontiers with dynamic parameters estimated by the empirical method, only available for `p=2`), or `"kernelOptimal"` (estimating the optimal transformation based on kernel density estimation, following Martínez-Camblor et al. (2021), using the `kde()` function in the ks package). Default: `"lrm"`.
`formula.lrm`	If `method = "lrm"`, the formula of the logistic regression model, with the transformation of the marker on the right-hand side (in terms of `X.1`, `X.2`, `\dots`, `X.p`, and `D`). Default: quadratic formula for the first two components `X.1` and `X.2`.
`stepModel`	If TRUE and `method = "lrm"`, a model selection is performed based on the AIC (Akaike information criterion) in a stepwise algorithm (see `step()` function in stats package for more information). Default: TRUE.
`methodLinear`	If `method = "fixedLinear"`, method used to build the classification regions. One of `"coefLinear"` (particular fixed coefficients in `coefLinear`), `"SuLiu"` (Su and Liu, 1993), `"PepeThompson"` (Pepe and Thompson, 2000), `"logistic"` (logistic regression model), `"minmax"` (Liu et al., 2011). Default: `"coefLinear"`.
`coefLinear`	If `method = "fixedLinear"` and `methodLinear = "coefLinear"`, a vector of length `p` with the coefficients `\beta_i` (`i \in \{1, \dots, p\}`) used in the linear combination `\mathcal{L}_{\boldsymbol{\beta}}(\boldsymbol{X}) = \beta_1 X_1 + \dots + \beta_p X_p`. Default: `(1,\dots,1)`.
`coefQuadratic`	If `method = "fixedQuadratic"`, a vector of length 5 with the coefficients `\beta_1, \dots, \beta_5` used in the quadratic combination `\mathcal{Q}_{\boldsymbol{\beta}}(\boldsymbol{X}) = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_2^2`. Default: `(1,1,0,1,1)`.
`alpha`, `approxh`, `multiplier`	If `method = "dynamicMeisner"`, input parameters used in the `maxTPR()` function of the maxTPR package (internally integrated because the package is no longer available on CRAN). Default: `alpha = 0.5`, `approxh = 0.5` and `multiplier = 2`.
`K`	If `method = "dynamicEmpirical"`, the number of equally spaced values of `\alpha \in (-1,1)` considered. Default: 201.
`kernelOptimal.H`	If `method = "kernelOptimal"`, the bandwidth matrix `H` used in the `kde()` function of the ks package. Default: `"Hbcv"` (biased cross-validation (BCV) bandwidth matrix selector for bivariate data) if `p = 2`, `"Hpi"` (plug-in bandwidth selector) if `p > 2`.
`eps`	Epsilon value to consider. Default: `sqrt(.Machine$double.eps)`.
`verbose`	If TRUE, a progress bar is displayed for computationally intensive methods. Default: FALSE.
`...`	Other parameters to be passed. Not used.
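As an illustration of the `coefLinear` and `coefQuadratic` definitions above, the following base-R sketch computes the two fixed combinations on simulated data and summarizes each resulting univariate score by its empirical AUC. The data and the helper `empirical_auc()` are hypothetical and are not part of movieROC; the sketch also assumes that larger scores indicate positive subjects, which may differ from how the package builds its classification regions.

```r
set.seed(1)
n <- 100
# Simulated bivariate marker (hypothetical data, not from the package)
X <- cbind(rnorm(n), rnorm(n))
D <- rep(c(0, 1), each = n / 2)
X[D == 1, ] <- X[D == 1, ] + 1          # shift positive subjects upward

# Linear combination L_beta(X) with the default coefLinear = rep(1, ncol(X))
coefLinear <- rep(1, ncol(X))
Z_lin <- as.vector(X %*% coefLinear)

# Quadratic combination Q_beta(X); coefficient order as in coefQuadratic:
# X1, X2, X1*X2, X1^2, X2^2
coefQuadratic <- c(1, 1, 0, 1, 1)
terms <- cbind(X[, 1], X[, 2], X[, 1] * X[, 2], X[, 1]^2, X[, 2]^2)
Z_quad <- as.vector(terms %*% coefQuadratic)

# Hypothetical helper: empirical AUC as the Mann-Whitney statistic
# (assumes larger scores indicate D == 1; ties count one half)
empirical_auc <- function(Z, D) {
  cases <- Z[D == 1]
  controls <- Z[D == 0]
  mean(outer(cases, controls, ">") + 0.5 * outer(cases, controls, "=="))
}

auc_lin <- empirical_auc(Z_lin, D)
auc_quad <- empirical_auc(Z_quad, D)
```

Note that the third entry of `coefQuadratic` multiplies the interaction `X_1 X_2`, so the default `(1,1,0,1,1)` drops the interaction term.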

Value

A list of class ‘multiroc’ with the following fields:

`controls`, `cases`	Marker values of negative and positive subjects, respectively.
`levels`	Levels of response values.
`t`	Vector of false-positive rates.
`roc`	Vector of values of the ROC curve for `t`.
`auc`	Area under the curve estimate.
`Z`	If `method` `\neq` `"dynamicMeisner"` and `method` `\neq` `"dynamicEmpirical"`, resulting univariate marker values.
`c`	If `method` `\neq` `"dynamicMeisner"` and `method` `\neq` `"dynamicEmpirical"`, vector of final marker thresholds resulting in (`t`, `roc`).
`CoefTable`	If `method = "dynamicMeisner"` or `"dynamicEmpirical"`, a list with one element per entry of the vector `t`. Each element stores the linear coefficients (`coef`), the threshold for that linear combination (`c`), the corresponding point on the ROC curve (`t`, `roc`), the resulting univariate marker values (`Z`) and a matrix of dimension 100 `\times` 100 with the marker values over a grid of (`X_1`, `X_2`) bivariate values (`f`).
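To make the relationship between the fields `c`, `t`, `roc` and `auc` concrete, here is a minimal base-R sketch for a generic univariate score. It is an illustration only, not the package's internal code: the simulated `controls` and `cases`, the threshold grid `c_grid`, and the trapezoidal rule for the area are all assumptions, as is the convention that a subject is classified as positive when the score exceeds the threshold.

```r
set.seed(3)
# Hypothetical univariate scores for negative and positive subjects
controls <- rnorm(50)
cases <- rnorm(50, mean = 1)

# Candidate thresholds, from the strictest (Inf) to the most lenient (-Inf)
c_grid <- sort(unique(c(controls, cases, -Inf, Inf)), decreasing = TRUE)

# False-positive and true-positive rates at each threshold
t <- sapply(c_grid, function(thr) mean(controls > thr))
roc <- sapply(c_grid, function(thr) mean(cases > thr))

# Area under the empirical ROC curve by the trapezoidal rule
auc <- sum(diff(t) * (head(roc, -1) + tail(roc, -1)) / 2)
```

With continuous scores and no ties, this area coincides with the proportion of (case, control) pairs in which the case has the larger score.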

Dependencies

If method = "lrm", the glm() function in the stats package is used.

If method = "kernelOptimal", the kde() function in the ks package is used.
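The `"lrm"` route described above can be sketched with `glm()` and `step()` from the stats package. This is a hedged illustration on simulated data, not the package's implementation: the variable names `dat`, `full`, `sel` and `Z` are hypothetical, and taking the linear predictor as the univariate score is an assumption. The sketch does show why the formula is written in terms of `X.1` and `X.2`: placing an unnamed matrix `X` in a data frame produces exactly those column names.

```r
set.seed(2)
n <- 200
# Simulated bivariate marker and binary response (hypothetical data)
X <- cbind(rnorm(n), rnorm(n))
D <- rbinom(n, 1, plogis(X[, 1] + 0.5 * X[, 2]^2 - 0.5))

# A matrix named X inside data.frame() yields columns X.1 and X.2
dat <- data.frame(X = X, D = D)

# Full quadratic model, matching the default value of formula.lrm
full <- glm(D ~ X.1 + I(X.1^2) + X.2 + I(X.2^2) + I(X.1*X.2),
            data = dat, family = binomial)

# AIC-based stepwise selection, as enabled by stepModel = TRUE
sel <- step(full, trace = 0)

# Assumed univariate score: the linear predictor of the selected model
Z <- predict(sel, type = "link")
```

Because the fitted probability is a monotone function of the linear predictor, either scale gives the same ranking of subjects and therefore the same empirical ROC curve.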

References

J. Q. Su and J. S. Liu (1993) “Linear combinations of multiple diagnostic markers”. Journal of the American Statistical Association, 88(424): 1350–1355. DOI: 10.1080/01621459.1993.10476417.

M. S. Pepe and M. L. Thompson (2000) “Combining diagnostic test results to increase accuracy”. Biostatistics, 1(2): 123–140. DOI: 10.1093/biostatistics/1.2.123.

C. Liu, A. Liu, and S. Halabi (2011) “A min–max combination of biomarkers to improve diagnostic accuracy”. Statistics in Medicine, 30(16): 2005–2014. DOI: 10.1002/sim.4238.

P. Martínez-Camblor, S. Pérez-Fernández, and S. Díaz-Coto (2021) “Optimal classification scores based on multivariate marker transformations”. AStA Advances in Statistical Analysis, 105(4): 581–599. DOI: 10.1007/s10182-020-00388-z.

A. Meisner, M. Carone, M. S. Pepe, and K. F. Kerr (2021) “Combining biomarkers by maximizing the true positive rate for a fixed false positive rate”. Biometrical Journal, 63(6): 1223–1240. DOI: 10.1002/bimj.202000210.

Examples

data(HCC)

# ROC curve for genes 20202438 and 18384097 (p=2) to identify tumor by 4 different methods:
X <- cbind(HCC$cg20202438, HCC$cg18384097); D <- HCC$tumor
## 1. Linear combinations with fixed parameters by Pepe and Thompson (2000)
multiROC(X, D, method = "fixedLinear", methodLinear = "PepeThompson")
## 2. Linear combinations with dynamic parameters by Meisner et al. (2021)
### Time consuming
multiROC(X, D, method = "dynamicMeisner")
## 3. Logistic regression model with quadratic formula by default
multiROC(X, D)
## 4. Optimal transformation with multivariate KDE by Martínez-Camblor et al. (2021)
multiROC(X, D, method = "kernelOptimal")

# ROC curve for genes 20202438, 18384097, and 03515901 (p=3) to identify tumor
# by 4 different methods:
X <- cbind(HCC$cg20202438, HCC$cg18384097, HCC$cg03515901); D <- HCC$tumor
## 1. Linear combinations with fixed parameters by Pepe and Thompson (2000)
multiROC(X, D, method = "fixedLinear", methodLinear = "PepeThompson")
## 2. Linear combinations with dynamic parameters by Meisner et al. (2021)
### Time consuming
multiROC(X, D, method = "dynamicMeisner")
## 3. Logistic regression model with quadratic formula by default
multiROC(X, D)
## 4. Optimal transformation with multivariate KDE by Martínez-Camblor et al. (2021)
multiROC(X, D, method = "kernelOptimal")

[Package movieROC version 0.1.1 Index]