compareROCindep {nsROC} | R Documentation |
Comparison of k independent ROC curves
Description
This function compares k ROC curves from independent data. Several statistics are available for the comparison: the distance-based statistics in Martinez-Camblor et al. (2011), the Venkatraman (2000) methodology for comparing curves for continuous unpaired data, and one based on AUC (area under the curve) comparison. See References below.
Usage
compareROCindep(X, G, D, ...)
## Default S3 method:
compareROCindep(X, G, D, statistic=c("L1","L2","CR","other","VK","AUC"),
FUN.stat.int=function(roc.i, roc){mean(abs(roc.i - roc))},
FUN.stat.cons=function(n.cases, n.controls){sqrt(n.cases)},
side=c("right","left"), Ni=1000, raw=FALSE, perm=500,
seed=123, plot.roc=TRUE, type='s', lwd=3,
lwd.curves=rep(2,length(table(G))), lty=1,
lty.curves=rep(1,length(table(G))), col='black',
col.curves=rainbow(length(table(G))), cex.lab=1.2,
legend=c(sapply(1:length(table(G)),function(i){
eval(bquote(expression(hat(R)[.(i)](t))))}),
expression(hat(R)(t))), legend.position='bottomright',
legend.inset=0.03, cex.legend=1, ...)
Arguments
X |
vector of (bio)marker values. |
G |
vector of group identifier values (it should have as many levels as independent samples to compare). |
D |
the vector of response values. |
statistic |
the statistic used to compare the curves. One of "L1", "L2", "CR", "other", "VK" or "AUC" (see Details). |
FUN.stat.int |
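a function of two variables, roc.i (ROC curve estimate from the i-th sample) and roc (mean ROC curve estimate along the k samples); see Details. |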
FUN.stat.cons |
a function of two variables, n.cases and n.controls (number of cases and controls in the i-th sample); see Details. |
side |
type of ROC curve. One of "right" or "left". If |
Ni |
number of subintervals of the unit interval (FPR values) considered to calculate the curve. Default: 1000. |
raw |
if TRUE, raw data is considered; if FALSE, data is ranked and a method to break ties in the permutations is considered (see Venkatraman (2000) in References). Default: FALSE. |
perm |
number of permutations. Default: 500. |
seed |
seed considered to generate the permutations (for reproducibility). Default: 123. |
plot.roc |
if TRUE, a plot including ROC curve estimates for the k samples and the mean of all of them is displayed. |
type |
what type of plot should be drawn. |
lwd |
the line width to be used for mean ROC curve estimate. |
lwd.curves |
a vector with the line widths to be used for ROC curve estimates of each sample. |
lty |
the line type to be used for mean ROC curve estimate. |
lty.curves |
a vector with the line types to be used for ROC curve estimates of each sample. |
col |
the color to be used for mean ROC curve estimate. |
col.curves |
a vector with the colors to be used for ROC curve estimates of each sample. |
cex.lab |
the magnification to be used for x and y labels relative to the current setting of cex. |
legend |
a character or expression vector to appear in the legend. |
legend.position , legend.inset , cex.legend |
the position of the legend, the inset distance from the margins as a fraction of the plot region when legend is placed, and the character expansion factor relative to current par("cex"). |
... |
other graphical parameters to be passed. |
Details
If the Venkatraman statistic is chosen to compare left-sided ROC curves, an error is raised: the Venkatraman methodology is only implemented for right-sided ROC curves.
If raw=FALSE, the data will be ranked in each sample using the rank function with the ties.method='first' option. Furthermore, possible ties in the permutation samples will be broken using the ties.method='random' option.
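The two tie-breaking rules can be seen in base R on a small vector. This is only an illustration of the rank function; the vector x is made up, and the snippet does not reproduce the package's internal permutation code.

```r
x <- c(2, 1, 2, 3)                            # made-up marker values with one tie

# Ties broken by order of appearance: the first 2 gets the lower rank
r_first <- rank(x, ties.method = "first")     # 2 1 3 4

# Ties broken at random: the two tied values share ranks 2 and 3 in random order
set.seed(1)
r_random <- rank(x, ties.method = "random")
```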
The statistic is defined by \sum_{i=1}^k statistic.cons \cdot statistic.int, where statistic.cons = FUN.stat.cons('number of cases in the i-th sample', 'number of controls in the i-th sample') and statistic.int = FUN.stat.int('ROC curve estimate from the i-th sample', 'mean ROC curve estimate along the k samples'). It is usual to consider the function FUN.stat.int as an integral of a distance between \hat{R}_i(t) and \hat{R}(t), where \hat{R}(t) := k^{-1} \sum_{i=1}^k \hat{R}_i(t).
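As a rough sketch of how the sum is assembled, the following base R code evaluates it for three made-up ROC curves. The curves, the grid of Ni + 1 points and the object names (roc_list, roc_mean, terms, stat) are assumptions chosen for illustration; this is not the package's internal code.

```r
Ni <- 1000
t <- seq(0, 1, length.out = Ni + 1)      # grid of FPR values (assumed layout)

# Made-up ROC curves for k = 3 samples (illustrative only)
roc_list <- list(t^0.5, t^0.3, t^0.2)
n_cases <- c(30, 38, 42)                 # cases per sample (made up)
n_controls <- c(45, 45, 30)              # controls per sample (made up)

# Mean ROC curve: k^{-1} * sum_i R_i(t)
roc_mean <- Reduce(`+`, roc_list) / length(roc_list)

# Default choices of the two functions (those of statistic = "L1")
FUN.stat.int <- function(roc.i, roc) mean(abs(roc.i - roc))
FUN.stat.cons <- function(n.cases, n.controls) sqrt(n.cases)

# One term per sample: FUN.stat.cons(...) * FUN.stat.int(...)
terms <- mapply(
  function(roc.i, nc, nn) FUN.stat.cons(nc, nn) * FUN.stat.int(roc.i, roc_mean),
  roc_list, n_cases, n_controls
)
stat <- sum(terms)
```

Swapping in other definitions of FUN.stat.int and FUN.stat.cons changes only the two function objects; the summation itself stays the same.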
The statistics implemented are defined by the following FUN.stat.cons and FUN.stat.int functions:
statistic="L1":
FUN.stat.int(roc.i, roc) = mean(abs(roc.i - roc))
FUN.stat.cons(n.cases, n.controls) = sqrt(n.cases)
statistic="L2":
FUN.stat.int(roc.i, roc) = mean((roc.i - roc)^2)
FUN.stat.cons(n.cases, n.controls) = n.cases
statistic="CR":
FUN.stat.int(roc.i, roc) = mean((roc.i[seq(2,2*Ni+1,2)] - roc[seq(2,2*Ni+1,2)])^2 * (roc[seq(3,2*Ni+1,2)] - roc[seq(1,2*Ni-1,2)]))
FUN.stat.cons(n.cases, n.controls) = n.cases
In order to use this statistic, the ROC curves are estimated on a grid with 2*Ni subintervals of the unit interval.
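The index sets in the "CR" expression may be easier to read for a small Ni. The sketch below assumes the curves are stored as vectors of length 2*Ni + 1, one value per grid point (an assumption inferred from the indices above); the curves themselves and the names mid, right, left and cr.int are made up for illustration.

```r
Ni <- 3
mid   <- seq(2, 2 * Ni + 1, 2)   # 2 4 6: where the squared difference is evaluated
right <- seq(3, 2 * Ni + 1, 2)   # 3 5 7: grid point after each position in mid
left  <- seq(1, 2 * Ni - 1, 2)   # 1 3 5: grid point before each position in mid

# Made-up curves on the 2*Ni + 1 grid points
t <- seq(0, 1, length.out = 2 * Ni + 1)
roc.i <- sqrt(t)
roc <- t

# Squared differences at the even positions, weighted by increments of the mean curve
cr.int <- mean((roc.i[mid] - roc[mid])^2 * (roc[right] - roc[left]))
```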
The permutation method proposed in Venkatraman (2000) is used to generate the perm samples in all methodologies (i.e., for any statistic).
In case of statistic="VK", the Venkatraman methodology (see References below) is used to calculate the statistic. If k>2, the statistic value is the sum of the statistic values of each pair (i, j) such that i < j.
In case of statistic="AUC", the statistic considered is k^{-1} \sum_{i=1}^k | \widehat{AUC}_i - \widehat{AUC} |, where \widehat{AUC} is the mean of \widehat{AUC}_i along the k samples.
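The AUC-based statistic can be sketched in base R as follows. The helper auc_hat is hypothetical and uses the Mann-Whitney estimate of the AUC for right-sided curves (larger marker values in cases); the package may estimate the AUC differently, for example from the curve on the FPR grid. The simulated data are made up.

```r
# Hypothetical helper: Mann-Whitney estimate of the AUC, ties counted as 1/2
auc_hat <- function(controls, cases) {
  mean(outer(cases, controls, ">") + 0.5 * outer(cases, controls, "=="))
}

set.seed(123)
samples <- list(
  list(controls = rnorm(45), cases = rnorm(30, 2, 1.5)),
  list(controls = rnorm(45), cases = rnorm(38, 3, 1.5)),
  list(controls = rnorm(30), cases = rnorm(42, 3, 1))
)

auc_i <- sapply(samples, function(s) auc_hat(s$controls, s$cases))
auc_mean <- mean(auc_i)                    # mean AUC along the k samples
stat_auc <- mean(abs(auc_i - auc_mean))    # k^{-1} * sum_i |AUC_i - mean AUC|
```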
Value
n.controls |
vector of number of controls in each sample. |
n.cases |
vector of number of cases in each sample. |
controls.k |
a vector of all controls along the k samples, ordered by sample. |
cases.k |
a vector of all cases along the k samples, ordered by sample. |
statistic |
the value of the test statistic. |
stat.perm |
a vector of statistic values for permutations. |
p.value |
the p-value for the test. |
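A permutation p-value is commonly taken as the proportion of permutation statistics at least as large as the observed one. The sketch below assumes that convention; the package may compute p.value differently (for instance with a finite-sample correction), and all numbers are made up.

```r
statistic <- 0.8                          # made-up observed value of the test statistic
set.seed(123)
stat.perm <- runif(500)                   # made-up permutation values (perm = 500)

# Assumed convention: share of permutation values >= the observed statistic
p.value <- mean(stat.perm >= statistic)
```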
References
Venkatraman E.S., 2000, A permutation test to compare receiver operating characteristic curves, Biometrics, 56, 1134-1138.
Martinez-Camblor P., Carleos C., Corral N., 2011, Powerful nonparametric statistics to compare k independent ROC curves, Journal of Applied Statistics, 38(7), 1317-1332.
Examples
set.seed(123)
X1 <- c(rnorm(45), rnorm(30,2,1.5))
D1 <- c(rep(0,45), rep(1,30))
X2 <- c(rnorm(45), rnorm(38,3,1.5))
D2 <- c(rep(0,45), rep(1,38))
X3 <- c(rnorm(30), rnorm(42,3,1))
D3 <- c(rep(0,30), rep(1,42))
X <- c(X1, X2, X3)
D <- c(D1, D2, D3)
G <- c(rep(1,75), rep(2,83), rep(3,72))
# Default method: L1 statistic proposed in Martinez-Camblor et al. (2011)
output <- compareROCindep(X, G, D)
# Venkatraman statistic
output1 <- compareROCindep(X, G, D, statistic="VK")
# AUC-based comparison
output2 <- compareROCindep(X, G, D, statistic="AUC")