R: Build a ROC curve for a transformation of a univariate marker

hROC {movieROC}

R Documentation

Build a ROC curve for a transformation of a univariate marker

Description

This is one of the main functions of the movieROC package. It builds a univariate ROC curve for a transformed marker h(X) and returns a list of class ‘hroc’. This object can be printed or plotted, and it can be used to predict at a particular point. It may also be passed to the plot_funregions() and plot_regions() functions.

Usage

hROC(X, D, ...)
## Default S3 method:
hROC(X, D, type = c("lrm", "overfitting", "kernel", "h.fun"), 
    formula.lrm = "D ~ pol(X,3)", h.fun = function(x) {x},  kernel.h = 1,
    plot.h = FALSE, plot.roc = FALSE, new.window = FALSE, 
    main = NULL, xlab = "x", ylab = "h(x)", xaxis = TRUE, ...)

Arguments

`X`	Vector of marker values.
`D`	Vector of response values. Two levels; if there are more, only the first two are used.
`type`	Type of transformation considered. One of `"lrm"` (a binary logistic regression is fitted with the `lrm()` function in the rms package), `"kernel"` (the transformation included in Martínez-Camblor et al. (2021), estimated by the kernel density approach), `"overfitting"` (the overfitting transformation `\hat{h}_{of}(\cdot)` is taken), or `"h.fun"` (the transformation given in the input parameter `h.fun` is considered). Default: `"lrm"`.
`formula.lrm`	If `type = "lrm"`, the formula of the logistic regression model, written in terms of `X` and `D`; its right-hand side gives the transformation employed. Default: `"D ~ pol(X,3)"`.
`kernel.h`	If `type = "kernel"`, the bandwidth used for the kernel density estimation, which is computed with the `density()` function in the stats package. Default: 1.
`h.fun`	If `type = "h.fun"`, the transformation employed (as a function in R). Default: `function(x){x}`.
`plot.h`	If TRUE, the transformation employed is illustrated.
`plot.roc`	If TRUE, the resulting ROC curve is illustrated.
`new.window`	If TRUE, the two previous graphics are plotted in separate windows.
`main`	A main title for the plot used if `plot.h = TRUE`.
`xlab`, `ylab`	Labels for the x and y axes of the plot used if `plot.h = TRUE`.
`xaxis`	Graphical parameter used if `plot.h = TRUE`. If FALSE, plotting of the axis is suppressed.
`...`	Other parameters to be passed. Not used.

Details

A theoretical and practical discussion about the type of transformation considered and its basis may be found in Martínez-Camblor et al. (2019) and Martínez-Camblor et al. (2021).
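As an informal illustration of the likelihood-ratio idea behind `type = "kernel"`, the sketch below estimates the densities of the two groups with `density()` and combines them into a score. It is not the code used by hROC(): the simulated data, the names `pos`, `neg`, `bw` and `h_kernel`, and the choice of the ratio f_pos / (f_pos + f_neg) (a monotone function of the likelihood ratio, so it yields the same ROC curve) are all assumptions made for the example.

```r
# Illustrative sketch only; not the movieROC implementation.
set.seed(1)
neg <- rnorm(100, mean = 0)    # simulated marker values, negative group
pos <- rnorm(100, mean = 1.5)  # simulated marker values, positive group

bw <- 1                        # bandwidth, playing the role of kernel.h
lo <- min(neg, pos) - 3 * bw   # common evaluation grid for both densities
hi <- max(neg, pos) + 3 * bw

f_neg <- density(neg, bw = bw, from = lo, to = hi, n = 512)
f_pos <- density(pos, bw = bw, from = lo, to = hi, n = 512)

# Score in [0, 1]: large where positive values are relatively more likely.
h_kernel <- approxfun(f_pos$x, f_pos$y / (f_pos$y + f_neg$y))

h_values <- h_kernel(c(-1, 0.75, 3))
```

Because both calls to `density()` share `from`, `to` and `n`, the two estimates are evaluated on the same grid and can be divided pointwise before interpolating with `approxfun()`.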

The overfitting function estimate is defined as follows:

\hat{h}_{of}(x) = \sum_{i=1}^{n_1} I(x = y_i) + \sum_{i=1}^{n_2} \dfrac{\#(\xi = z_i)}{\#(\xi = z_i) + \#(\chi = z_i)} I(x = z_i)

where I(A) denotes the indicator function (which takes the value 1 if A is true and 0 otherwise), \#(B) is the cardinality of the subset B (so that \#(\xi = z_i) and \#(\chi = z_i) count the positive and negative sample values equal to z_i, respectively), \{y_1, \dots, y_{n_1}\} \subseteq \left\{ \xi_1, \dots, \xi_n \right\} are the positive sample values without ties and \{z_1, \dots, z_{n_2}\} \subseteq \left\{ \xi_1, \dots, \xi_n \right\} are the positive sample values tied with at least one negative sample value. Classification based on this transformation is the optimal one in the AUC sense, but the resulting decision rules cannot be extended to any other sample.
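The definition above can be read as "the proportion of positive subjects among the sample values equal to x". A minimal base R sketch of that reading follows; the function name `h_of` and the toy vectors `pos` and `neg` are hypothetical, the y_i and z_i are assumed to be distinct values, and this is not the package's internal code.

```r
# Illustrative sketch of the overfitting transformation; not movieROC code.
# pos, neg: marker values observed in the positive and negative samples.
h_of <- function(x, pos, neg) {
  vapply(x, function(v) {
    n_pos <- sum(pos == v)   # #(xi = v)
    n_neg <- sum(neg == v)   # #(chi = v)
    # 0 if v is not a positive sample value; 1 if it is untied;
    # otherwise the share of positives among the tied observations.
    if (n_pos == 0) 0 else n_pos / (n_pos + n_neg)
  }, numeric(1))
}

pos <- c(1, 2, 2, 5)
neg <- c(2, 3, 4)
h_values <- h_of(c(1, 2, 3, 5, 6), pos, neg)
# 1 and 5 are untied positive values, 2 is tied (two positives, one negative),
# 3 is a negative value only, and 6 does not appear in the sample.
```

The value 6 is mapped to 0 like any point outside the positive sample, which shows why the resulting rule cannot be extended to new data.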

Value

A list of class ‘hroc’ with the following fields:

`levels`	Levels of response values.
`X`, `Y`	Original and transformed marker values, respectively.
`Sp`, `Se`	Vectors of true-negative and true-positive rates, respectively.
`auc`	Area under the curve estimate.
`model`	If `type = "lrm"`, the coefficients of the logistic regression model fitted by `formula.lrm`.
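To make the `Sp`, `Se` and `auc` fields concrete, the following base R sketch computes empirical true-negative and true-positive rates and the area under the curve from transformed marker values. It assumes a response coded 0/1 and the rule "classify as positive when the transformed value exceeds the cut-off"; the helper `empirical_roc` is hypothetical and may differ from what hROC() does internally (for instance, in the grid of cut-offs).

```r
# Illustrative sketch only; not the movieROC implementation.
# Y: transformed marker values h(X); D: response coded 0 (negative) / 1 (positive).
empirical_roc <- function(Y, D) {
  pos <- Y[D == 1]
  neg <- Y[D == 0]
  cutoffs <- c(-Inf, sort(unique(Y)))
  Se <- vapply(cutoffs, function(thr) mean(pos > thr), numeric(1))   # true-positive rate
  Sp <- vapply(cutoffs, function(thr) mean(neg <= thr), numeric(1))  # true-negative rate
  # Mann-Whitney form of the AUC; ties count one half.
  auc <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
  list(Sp = Sp, Se = Se, auc = auc)
}

Y <- c(0.1, 0.4, 0.35, 0.8)
D <- c(0, 0, 1, 1)
roc <- empirical_roc(Y, D)
roc$auc  # three of the four positive-negative pairs are correctly ordered
```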

Dependencies

If type = "lrm", the lrm() function in the rms package is used. The rms package is also loaded so that special transformation functions such as pol() and rcs() are available.

References

P. Martínez-Camblor, S. Pérez-Fernández, and S. Díaz-Coto (2019) “Improving the biomarker diagnostic capacity via functional transformations”. Journal of Applied Statistics, 46(9): 1550–1566. DOI: 10.1080/02664763.2018.1554628.

P. Martínez-Camblor, S. Pérez-Fernández, and S. Díaz-Coto (2021) “Optimal classification scores based on multivariate marker transformations”. AStA Advances in Statistical Analysis, 105(4): 581–599. DOI: 10.1007/s10182-020-00388-z.

Examples

data(HCC)

# ROC curve for gene 18384097 to identify tumor by considering 4 different transformations:
X <- HCC$cg18384097; D <- HCC$tumor
## 1. Ordinary cubic polynomial formula for binary logistic regression
hROC(X, D)
## 2. Linear tail-restricted cubic splines for binary logistic regression
hROC(X, D, formula.lrm = "D ~ rcs(X,8)")
## 3. Overfitting transformation for this particular sample
hROC(X, D, type = "overfitting")
## 4. Optimal transformation in terms of likelihood ratio 
##    by kernel density estimation with bandwidth 3
hROC(X, D, type = "kernel", kernel.h = 3)

[Package movieROC version 0.1.1 Index]