multiROC {movieROC}    R Documentation

Build a ROC curve for a multivariate marker with dimension p

Description

This is one of the main functions of the movieROC package. It builds a multivariate ROC curve using one of the following methods: i) fitting a binary logistic regression model with a user-specified combination of the marker components on the right-hand side, ii) linear combinations with fixed parameters, iii) linear combinations with dynamic parameters, iv) estimating the optimal transformation based on kernel density estimation, or v) quadratic combinations with fixed parameters (only if p = 2). It returns a ‘multiroc’ object, a list of class ‘multiroc’. This object can be printed or plotted. It may also be passed to the plot_buildROC() and movieROC() functions.

Usage

multiROC(X, D, ...)
## Default S3 method:
multiROC(X, D, 
    method = c("lrm", "fixedLinear", "fixedQuadratic", "dynamicEmpirical", 
                "dynamicMeisner", "kernelOptimal"),
    formula.lrm = "D ~ X.1 + I(X.1^2) + X.2 + I(X.2^2) + I(X.1*X.2)",
    stepModel = TRUE, 
    methodLinear = c("coefLinear", "SuLiu", "PepeThompson", "logistic", "minmax"), 
    coefLinear = rep(1,ncol(X)), coefQuadratic = c(1,1,0,1,1), 
    K = 201, alpha = 0.5, approxh = 0.5, multiplier = 2, 
    kernelOptimal.H = c("Hbcv", "Hscv", "Hpi", "Hns", "Hlscv", "Hbcv.diag", 
                        "Hscv.diag", "Hpi.diag", "Hlscv.diag"), 
    eps = sqrt(.Machine$double.eps), verbose = FALSE, ...)

Arguments

X

Matrix (dimension n \times p) of marker values where n is the sample size and p is the dimension of the multivariate marker.

D

Vector of response values with two levels; if there are more than two, only the first two are used.

method

Method used to build the classification regions. One of "lrm" (fitting a binary logistic regression model with the formula given in formula.lrm), "fixedLinear" (linear frontiers with fixed parameters given in coefLinear or estimated by the method in methodLinear), "fixedQuadratic" (quadratic frontiers with fixed parameters given in coefQuadratic, only available for p = 2), "dynamicMeisner" (linear frontiers with dynamic parameters estimated by the method of Meisner et al. (2021)), "dynamicEmpirical" (linear frontiers with dynamic parameters estimated by the empirical method, only available for p = 2), or "kernelOptimal" (estimating the optimal transformation based on multivariate kernel density estimation, following Martínez-Camblor et al. (2021) and using the kde() function in the ks package). Default: "lrm".

formula.lrm

If method = "lrm", the formula of the logistic regression model, with the transformation on the right-hand side written in terms of X.1, X.2, ..., X.p and with D as the response. Default: quadratic formula in the first two components, X.1 and X.2.

stepModel

If TRUE and method = "lrm", a model selection is performed based on the AIC (Akaike information criterion) in a stepwise algorithm (see step() function in stats package for more information). Default: TRUE.
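
The following is a minimal sketch of the idea behind method = "lrm" with stepModel = TRUE, written with base R only. The simulated data and the object names (dat, full, sel) are illustrative assumptions, and using the linear predictor as the univariate score is also an assumption; this is not the package's internal code.

```r
# Simulated bivariate marker (illustrative data, not from the package)
set.seed(1)
n <- 200
D <- rep(c(0, 1), each = n / 2)
X <- cbind(rnorm(n, mean = 0.8 * D), rnorm(n, mean = 0.5 * D))
dat <- data.frame(D = D, X.1 = X[, 1], X.2 = X[, 2])

# Full quadratic model, mirroring the default formula.lrm string
full <- glm(D ~ X.1 + I(X.1^2) + X.2 + I(X.2^2) + I(X.1 * X.2),
            family = binomial, data = dat)

# AIC-based stepwise selection with stats::step(), as described for stepModel
sel <- step(full, trace = 0)

# Univariate score taken here as the linear predictor of the selected model
# (assumption: the package may use a different scale for its score)
Z <- predict(sel, type = "link")
formula(sel)
```

Setting stepModel = FALSE would correspond to keeping the full model instead of sel in this sketch.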

methodLinear

If method = "fixedLinear", method used to build the classification regions. One of "coefLinear" (particular fixed coefficients in coefLinear), "SuLiu" (Su and Liu, 1993), "PepeThompson" (Pepe and Thompson, 2000), "logistic" (logistic regression model), "minmax" (Liu et al., 2011). Default: "coefLinear".
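
As an illustration of what a data-driven choice of fixed linear coefficients can look like, the sketch below computes the Su and Liu (1993) direction, which is proportional to the inverse of the summed group covariance matrices times the difference of group means. The simulated data and object names are assumptions made for this example; it is not the package's implementation of methodLinear = "SuLiu".

```r
# Simulated controls and cases (illustrative data only)
set.seed(2)
controls <- cbind(rnorm(150), rnorm(150))
cases    <- cbind(rnorm(100, mean = 1), rnorm(100, mean = 0.5))

# Su and Liu (1993): beta proportional to (S_controls + S_cases)^{-1} (mu_cases - mu_controls)
beta <- solve(cov(controls) + cov(cases),
              colMeans(cases) - colMeans(controls))

# Univariate scores obtained from the linear combination
z0 <- drop(controls %*% beta)
z1 <- drop(cases %*% beta)

# Empirical AUC of the combined marker (ties counted as 1/2)
auc <- mean(outer(z1, z0, ">") + 0.5 * outer(z1, z0, "=="))
auc
```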

coefLinear

If method = "fixedLinear" and methodLinear = "coefLinear", a vector of length p with the coefficients \beta_i (i \in \{1, \dots, p\}) used to build the linear combination \mathcal{L}_{\boldsymbol{\beta}}(\boldsymbol{X}) = \beta_1 X_1 + \dots + \beta_p X_p. Default: (1,\dots,1).
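
A minimal base R sketch of how a fixed linear combination yields an empirical ROC curve is given below. It assumes that larger combined values indicate a positive subject and uses simulated data; the names c_grid, t, roc and auc only echo the fields described under Value and are not taken from the package's code.

```r
# Simulated bivariate marker (illustrative data only)
set.seed(3)
n0 <- 120; n1 <- 80
X <- rbind(matrix(rnorm(2 * n0), ncol = 2),
           matrix(rnorm(2 * n1, mean = 0.7), ncol = 2))
D <- rep(c(0, 1), c(n0, n1))

beta <- rep(1, ncol(X))   # same value as the documented default of coefLinear
Z <- drop(X %*% beta)     # L_beta(X) = beta_1 X_1 + ... + beta_p X_p

# Classify as positive when Z >= threshold (assumed direction),
# sweeping the threshold over the observed scores
c_grid <- c(Inf, sort(unique(Z), decreasing = TRUE))
t   <- sapply(c_grid, function(thr) mean(Z[D == 0] >= thr))  # false-positive rates
roc <- sapply(c_grid, function(thr) mean(Z[D == 1] >= thr))  # true-positive rates

# Area under the empirical curve by the trapezoidal rule
auc <- sum(diff(t) * (head(roc, -1) + tail(roc, -1)) / 2)
auc
```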

coefQuadratic

If method = "fixedQuadratic", a vector of length 5 with the coefficients \beta_1, \dots, \beta_5 used to build the quadratic combination \mathcal{Q}_{\boldsymbol{\beta}}(\boldsymbol{X}) = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \beta_4 X_1^2 + \beta_5 X_2^2. Default: (1,1,0,1,1).
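
To make the coefficient order concrete, the sketch below evaluates the quadratic combination on a small matrix. The helper quad_score() is hypothetical, written only for this illustration, and is not a function of the package.

```r
# Hypothetical helper: evaluates
# Q_beta(X) = b1*X1 + b2*X2 + b3*X1*X2 + b4*X1^2 + b5*X2^2
quad_score <- function(X, beta) {
  stopifnot(ncol(X) == 2, length(beta) == 5)
  x1 <- X[, 1]
  x2 <- X[, 2]
  beta[1] * x1 + beta[2] * x2 + beta[3] * x1 * x2 +
    beta[4] * x1^2 + beta[5] * x2^2
}

# Three bivariate observations, one per row
X <- rbind(c(1, 2), c(0, -1), c(-2, 3))

# With the documented default coefficients (1, 1, 0, 1, 1)
quad_score(X, c(1, 1, 0, 1, 1))
# returns 8 0 14
```

Note that the interaction term occupies the third position, so the default value 0 there drops X_1 X_2 from the combination.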

alpha, approxh, multiplier

If method = "dynamicMeisner", input parameters used in the maxTPR() function of the maxTPR package (integrated internally because that package is no longer available on CRAN). Default: alpha = 0.5, approxh = 0.5 and multiplier = 2.

K

If method = "dynamicEmpirical", the number of equally spaced values of \alpha \in (-1,1) considered. Default: 201.

kernelOptimal.H

If method = "kernelOptimal", the bandwidth matrix H used in the kde() function of the ks package. Default: "Hbcv" (biased cross-validation (BCV) bandwidth matrix selector for bivariate data) if p = 2, "Hpi" (plug-in bandwidth selector) if p > 2.

eps

Epsilon value to consider. Default: sqrt(.Machine$double.eps).

verbose

If TRUE, a progress bar is displayed for computationally intensive methods. Default: FALSE.

...

Other parameters to be passed. Not used.

Value

A list of class ‘multiroc’ with the following fields:

controls, cases

Marker values of negative and positive subjects, respectively.

levels

Levels of response values.

t

Vector of false-positive rates.

roc

Vector of values of the ROC curve for t.

auc

Area under the curve estimate.

Z

If method \neq "dynamicMeisner" and method \neq "dynamicEmpirical", resulting univariate marker values.

c

If method \neq "dynamicMeisner" and method \neq "dynamicEmpirical", vector of final marker thresholds resulting in (t, roc).

CoefTable

If method = "dynamicMeisner" or "dynamicEmpirical", a list whose length equals the length of the vector t. Each element of the list stores the linear coefficients (coef), the threshold for that linear combination (c), the corresponding point on the ROC curve (t, roc), the resulting univariate marker values (Z), and a matrix of dimension 100 \times 100 with the marker values over a grid of (X_1, X_2) bivariate values (f).

Dependencies

If method = "lrm", the glm() function in the stats package is used.

If method = "kernelOptimal", the kde() function in the ks package is used.

References

J. Q. Su and J. S. Liu (1993) “Linear combinations of multiple diagnostic markers”. Journal of the American Statistical Association, 88(424): 1350–1355. DOI: 10.1080/01621459.1993.10476417.

M. S. Pepe and M. L. Thompson (2000) “Combining diagnostic test results to increase accuracy”. Biostatistics, 1(2): 123–140. DOI: 10.1093/biostatistics/1.2.123.

C. Liu, A. Liu, and S. Halabi (2011) “A min–max combination of biomarkers to improve diagnostic accuracy”. Statistics in Medicine, 30(16): 2005–2014. DOI: 10.1002/sim.4238.

P. Martínez-Camblor, S. Pérez-Fernández, and S. Díaz-Coto (2021) “Optimal classification scores based on multivariate marker transformations”. AStA Advances in Statistical Analysis, 105(4): 581–599. DOI: 10.1007/s10182-020-00388-z.

A. Meisner, M. Carone, M. S. Pepe, and K. F. Kerr (2021) “Combining biomarkers by maximizing the true positive rate for a fixed false positive rate”. Biometrical Journal, 63(6): 1223–1240. DOI: 10.1002/bimj.202000210.

Examples

library(movieROC)
data(HCC)

# ROC curve for genes 20202438 and 18384097 (p=2) to identify tumor by 4 different methods:
X <- cbind(HCC$cg20202438, HCC$cg18384097); D <- HCC$tumor
## 1. Linear combinations with fixed parameters by Pepe and Thompson (2000)
multiROC(X, D, method = "fixedLinear", methodLinear = "PepeThompson")
## 2. Linear combinations with dynamic parameters by Meisner et al. (2021)
### Time consuming
multiROC(X, D, method = "dynamicMeisner")
## 3. Logistic regression model with quadratic formula by default
multiROC(X, D)
## 4. Optimal transformation with multivariate KDE by Martínez-Camblor et al. (2021)
multiROC(X, D, method = "kernelOptimal")

# ROC curve for genes 20202438, 18384097, and 03515901 (p=3) to identify tumor
# by 4 different methods:
X <- cbind(HCC$cg20202438, HCC$cg18384097, HCC$cg03515901); D <- HCC$tumor
## 1. Linear combinations with fixed parameters by Pepe and Thompson (2000)
multiROC(X, D, method = "fixedLinear", methodLinear = "PepeThompson")
## 2. Linear combinations with dynamic parameters by Meisner et al. (2021)
### Time consuming
multiROC(X, D, method = "dynamicMeisner")
## 3. Logistic regression model with quadratic formula by default
multiROC(X, D)
## 4. Optimal transformation with multivariate KDE by Martínez-Camblor et al. (2021)
multiROC(X, D, method = "kernelOptimal")

[Package movieROC version 0.1.1 Index]