compareROCdep {nsROC} | R Documentation |
Comparison of k paired ROC curves
Description
This function compares k ROC curves from dependent data. Several statistics can be used for the comparison: those included in Martinez-Camblor et al. (2013), based on general distances between functions; the Venkatraman et al. (1996) methodology for comparing the diagnostic accuracy of the k markers based on data from a paired design; and the DeLong et al. (1988) one, based on the comparison of AUCs (areas under the curve). Two methods are available to approximate the distribution function of the statistic: the procedure proposed by Venkatraman et al. (1996) (based on permuted samples) or the one introduced by Martinez-Camblor et al. (2012) (based on bootstrap samples). See References below.
Usage
compareROCdep(X, D, ...)
## Default S3 method:
compareROCdep(X, D, method=c("general.bootstrap","permutation","auc"),
statistic=c("KS","L1","L2","CR","VK","other"),
FUN.dist=function(g){max(abs(g))}, side=c("right","left"),
Ni=1000, B=500, perm=500, seed=123, h.fun=function(H,x){
H*sd(x)*length(x)^{-1/3}}, H=1, plot.roc=TRUE, type='s', lwd=3,
lwd.curves=rep(2,ncol(X)), lty=1, lty.curves=rep(1,ncol(X)),
col='black',col.curves=rainbow(ncol(X)), cex.lab=1.2,
legend=c(sapply(1:ncol(X), function(i){eval(bquote(expression(
hat(R)[.(i)](t))))}), expression(hat(R)(t))),
legend.position='bottomright', legend.inset=0.03,
cex.legend=1, ...)
Arguments
X |
a matrix of k columns in which each column is the vector of (bio)marker values corresponding to each sample. |
D |
the vector of response values. |
method |
the method used to approximate the statistic distribution. One of "general.bootstrap" (Martinez-Camblor et al. (2012)), "permutation" (Venkatraman et al. (1996)) or "auc" (DeLong et al. (1988)). |
statistic |
the statistic used to compare the curves. One of "KS" (Kolmogorov-Smirnov criterion), "L1", "L2", "CR", "VK" or "other" (see Details). |
FUN.dist |
the distance considered as a function of one variable. If statistic="other", the statistic is computed with this function (see Details). |
side |
type of ROC curve. One of "right" or "left". If statistic="VK", only "right" is supported (see Details). |
Ni |
number of subintervals of the unit interval (FPR values) considered to calculate the curve. Default: 1000. |
B |
number of bootstrap samples if method="general.bootstrap". Default: 500. |
perm |
number of permutations if method="permutation". Default: 500. |
seed |
seed considered to generate the permutations (for reproducibility). Default: 123. |
h.fun |
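a function defining the bandwidth calculation used to generate the bootstrap samples if method="general.bootstrap". Its arguments are the bandwidth constant H and the sample x (see Details). |
H |
the value used to compute h.fun, that is, the bandwidth constant (see Details). Default: 1. |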
plot.roc |
if TRUE, a plot including ROC curve estimates for the k samples and the mean of all of them is displayed. |
type |
what type of plot should be drawn. |
lwd |
the line width to be used for mean ROC curve estimate. |
lwd.curves |
a vector with the line widths to be used for ROC curve estimates of each sample. |
lty |
the line type to be used for mean ROC curve estimate. |
lty.curves |
a vector with the line types to be used for ROC curve estimates of each sample. |
col |
the color to be used for mean ROC curve estimate. |
col.curves |
a vector with the colors to be used for ROC curve estimates of each sample. |
cex.lab |
the magnification to be used for x and y labels relative to the current setting of cex. |
legend |
a character or expression vector to appear in the legend. |
legend.position, legend.inset, cex.legend |
the position of the legend, the inset distance from the margins as a fraction of the plot region when the legend is placed, and the character expansion factor relative to the current par("cex"). |
... |
other graphical parameters to be passed. |
Details
First, the input data are checked and subjects with any missing information (marker or response values) are removed. In a paired design every sample must have the same length; if this does not hold, the function stops with an error.
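The missing-data check described above can be sketched in base R as follows. This is only an illustration under assumptions, not the package's internal code: the toy data and the names keep, X.clean and D.clean are hypothetical.

```r
# Toy paired data: 6 subjects, k = 2 markers, with some missing values.
X <- cbind(c(1.2, NA, 0.7, 2.1, 1.5, 0.3),
           c(0.9, 1.1, NA, 1.8, 1.4, 0.2))
D <- c(0, 0, 1, 1, NA, 1)

# A subject is kept only if the response and all k marker values are observed.
keep <- complete.cases(X, D)
X.clean <- X[keep, , drop = FALSE]   # hypothetical name: cleaned marker matrix
D.clean <- D[keep]                   # hypothetical name: cleaned response vector
```

Dropping whole rows keeps the pairing intact, so every marker column still refers to the same subjects.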
If the Venkatraman statistic is chosen to compare left-sided ROC curves, an error is raised: the Venkatraman methodology is implemented only for right-sided ROC curves. Furthermore, for this statistic, method="permutation" is automatically assigned.
The statistic is defined by \sum_{i=1}^k FUN.dist(\sqrt{n_1} \cdot (\hat{R}_i(t) - \hat{R}(t))), where FUN.dist stands for the distance function, n_1 is the number of cases, \hat{R}_i(t) is the ROC curve estimate from the i-th sample and \hat{R}(t) := k^{-1} \sum_{i=1}^k \hat{R}_i(t).
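As a rough illustration of this definition, the following base R sketch computes empirical right-sided ROC curves on a grid of FPR values and plugs them into the sum with the Kolmogorov-Smirnov distance. The helper roc.right, the simulated data and the grid size are assumptions made for the example; the estimator used inside the package may differ in its details.

```r
# Hypothetical helper: empirical right-sided ROC curve evaluated at FPR values t.
roc.right <- function(controls, cases, t) {
  # Threshold given by the empirical (1 - t)-quantile of the controls
  thr <- quantile(controls, probs = 1 - t, type = 1, names = FALSE)
  # Proportion of cases above each threshold
  sapply(thr, function(c0) mean(cases > c0))
}

set.seed(1)
n0 <- 40; n1 <- 50; k <- 3
controls <- matrix(rnorm(n0 * k), n0, k)            # toy controls, one column per marker
cases    <- matrix(rnorm(n1 * k, mean = 1), n1, k)  # toy cases, one column per marker

t.grid <- seq(0, 1, length.out = 101)
R.i   <- sapply(1:k, function(i) roc.right(controls[, i], cases[, i], t.grid))
R.bar <- rowMeans(R.i)                              # mean ROC curve over the k samples

FUN.dist <- function(g) max(abs(g))                 # KS distance
# Sum over the k curves of FUN.dist applied to sqrt(n1) * (R_i - R)
stat <- sum(apply(sqrt(n1) * (R.i - R.bar), 2, FUN.dist))
```

Here R.i is a 101 x k matrix, so subtracting R.bar removes the mean curve from every column before the distance is applied.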
The statistics implemented are defined by the following FUN.dist functions:
statistic="KS": FUN.dist(g) = max(abs(g))
statistic="L1": FUN.dist(g) = mean(abs(g))
statistic="L2": FUN.dist(g) = mean(g^2)
statistic="CR": FUN.dist.CR(g,h) = sum(g[-length(g)]^2*(h[-1]-h[-length(h)]))
The Cramer-von Mises statistic is defined by \sum_{i=1}^k FUN.dist.CR(\sqrt{n_1} \cdot (\hat{R}_i(t) - \hat{R}(t)), \hat{R}(t)).
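To make the four distances concrete, here is a small sketch that evaluates them on a toy vector. The names dist.KS, dist.L1, dist.L2 and dist.CR and the toy vectors g and h are hypothetical; only the function bodies are taken from the definitions above.

```r
dist.KS <- function(g) max(abs(g))
dist.L1 <- function(g) mean(abs(g))
dist.L2 <- function(g) mean(g^2)
# Cramer-von Mises: squared differences weighted by the increments of h
dist.CR <- function(g, h) sum(g[-length(g)]^2 * (h[-1] - h[-length(h)]))

g <- c(-0.5, 0.2, 0.1, -0.3)  # toy scaled difference sqrt(n1) * (R_i - R) on a 4-point grid
h <- c(0, 0.4, 0.7, 1)        # toy mean ROC curve on the same grid

ks <- dist.KS(g)
l1 <- dist.L1(g)
l2 <- dist.L2(g)
cr <- dist.CR(g, h)
```

Unlike the other three, the CR distance needs a second argument: the mean ROC curve acts as the integrator, so flat stretches of that curve contribute nothing to the sum.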
If statistic="VK", the Venkatraman methodology (see References below) is used to compute the statistic. If k>2, the statistic value is the sum of the statistic values over all pairs (i, j) such that i < j.
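The pairwise summation for k>2 can be sketched as below. Note that pair.stat is a hypothetical stand-in (a plain sup-distance between two curves) used only to show the enumeration of pairs; it is not the Venkatraman statistic.

```r
# Hypothetical stand-in for a pairwise statistic (NOT the Venkatraman statistic)
pair.stat <- function(x, y) max(abs(x - y))

# Toy curves for k = 3 samples, one column per sample
curves <- cbind(c(0, 0.5, 1), c(0, 0.7, 1), c(0, 0.4, 1))

pairs <- combn(ncol(curves), 2)  # each column is a pair (i, j) with i < j
total <- sum(apply(pairs, 2, function(p) pair.stat(curves[, p[1]], curves[, p[2]])))
```

With k samples this visits k(k-1)/2 pairs, each exactly once.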
If method="general.bootstrap", a bandwidth is needed to generate the bootstrap samples from the smoothed (Gaussian kernel) multivariate empirical distribution functions of controls and cases. This bandwidth is defined by the h.fun function, whose arguments are a user-defined bandwidth constant, H, and the sample considered (marker values of cases or controls), x.
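A minimal sketch of how such a bandwidth might be used to draw one smoothed bootstrap sample is given below. The default h.fun is taken from the Usage section; everything else (resampling whole rows to preserve the pairing, adding independent Gaussian noise per marker, the variable names) is an assumption for illustration and may not match the package's algorithm.

```r
# Default bandwidth function from the Usage section
h.fun <- function(H, x) H * sd(x) * length(x)^(-1/3)

set.seed(1)
x <- matrix(rnorm(60 * 2), 60, 2)  # toy sample, e.g. cases for k = 2 markers
H <- 1

# One bandwidth per marker (column)
h <- apply(x, 2, function(col) h.fun(H, col))

# Assumed scheme: resample subjects (rows) with replacement, then perturb
# each marker with Gaussian noise scaled by its own bandwidth.
idx    <- sample(nrow(x), replace = TRUE)
noise  <- matrix(rnorm(length(x)), nrow(x), ncol(x))
x.boot <- x[idx, ] + sweep(noise, 2, h, "*")
```

Larger values of H give smoother resamples; because the bandwidth shrinks at rate n^{-1/3}, the perturbation becomes smaller as the sample grows.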
If method="auc", the methodology proposed by DeLong et al. (1988) is implemented. This option is slower because the underlying Mann-Whitney statistic requires (number of cases) \cdot (number of controls) comparisons. In this case, statistic returns the value of the Mann-Whitney statistic estimate and test.statistic the final test statistic estimate (formula (5) in the paper), which follows a chi-square distribution.
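The cost mentioned above comes from comparing every case with every control. The following sketch computes the Mann-Whitney estimate of the AUC for a single marker; auc.mw and the toy data are hypothetical, and the covariance estimation and chi-square test of DeLong et al. (1988) are not reproduced here.

```r
# Hypothetical helper: Mann-Whitney AUC estimate for one marker.
auc.mw <- function(controls, cases) {
  # n1 x n0 matrix of pairwise comparisons; ties count one half
  cmp <- outer(cases, controls, function(y, x) (y > x) + 0.5 * (y == x))
  mean(cmp)
}

controls <- c(0.1, 0.4, 0.35, 0.8)
cases    <- c(0.9, 0.5, 0.35, 0.7)
auc <- auc.mw(controls, cases)
```

The outer() call builds all n1 * n0 comparisons at once, which is where both the time and the memory go for large samples.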
Value
n.controls |
the number of controls. |
n.cases |
the number of cases. |
controls.k |
a matrix whose columns are the controls along the k samples. |
cases.k |
a matrix whose columns are the cases along the k samples. |
statistic |
the value of the test statistic. |
stat.boot |
a vector of statistic values for bootstrap replicates if method="general.bootstrap". |
stat.perm |
a vector of statistic values for permutations if method="permutation". |
test.statistic |
statistic estimate given in formula (5) of DeLong et al. (1988) (see References below) if method="auc". |
p.value |
the p-value for the test. |
References
Venkatraman E.S., Begg C.B., 1996, A distribution-free procedure for comparing receiver operating characteristic curves from a paired experiment, Biometrika, 83(4), 835-848.
Martinez-Camblor P., Corral N., 2012, A general bootstrap algorithm for hypothesis testing, Journal of Statistical Planning and Inference, 142, 589-600.
Martinez-Camblor P., Carleos C., Corral N., 2013, General nonparametric ROC curve comparison, Journal of the Korean Statistical Society, 42(1), 71-81.
DeLong E.R., DeLong D.M., Clarke-Pearson D.L., 1988, Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach, Biometrics, 44, 837-845.
Examples
library(nsROC)
n0 <- 45; n1 <- 60
set.seed(123)
D <- c(rep(0,n0), rep(1,n1))
library(mvtnorm)
rho.12 <- 1/4; rho.13 <- 1/4; rho.23 <- 0.5
sd.controls <- c(1,1,1)
sd.cases <- c(1,1,1)
var.controls <- sd.controls%*%t(sd.controls)
var.cases <- sd.cases%*%t(sd.cases)
sigma.controls <- var.controls*matrix(c(1,rho.12,rho.13,rho.12,1,rho.23,rho.13,rho.23,1),3,3)
sigma.cases <- var.cases*matrix(c(1,rho.12,rho.13,rho.12,1,rho.23,rho.13,rho.23,1),3,3)
controls <- rmvnorm(n0, mean=rep(0,3), sigma=sigma.controls)
cases <- rmvnorm(n1, mean=rep(1.19,3), sigma=sigma.cases)
marker.samples <- rbind(controls,cases)
# Default method: KS statistic proposed in Martinez-Camblor by general bootstrap
output <- compareROCdep(marker.samples, D)
# L1 statistic proposed in Martinez-Camblor by general bootstrap
output1 <- compareROCdep(marker.samples, D, statistic="L1")
# CR statistic proposed in Martinez-Camblor by permutation method
output2 <- compareROCdep(marker.samples, D, method="permutation", statistic="CR")
# Venkatraman statistic
output3 <- compareROCdep(marker.samples, D, statistic="VK")
# DeLong AUC comparison methodology
output4 <- compareROCdep(marker.samples, D, method="auc")