pooledROC.emp {ROCnReg} | R Documentation |
Empirical estimation of the pooled ROC curve.
Description
This function estimates the pooled ROC curve using the empirical estimator proposed by Hsieh and Turnbull (1996).
Usage
pooledROC.emp(marker, group, tag.h, data,
p = seq(0, 1, l = 101), B = 1000, ci.level = 0.95,
method = c("ncoutcome", "coutcome"), pauc = pauccontrol(),
parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL)
Arguments
marker |
A character string with the name of the diagnostic test variable. |
group |
A character string with the name of the variable that distinguishes healthy from diseased individuals. |
tag.h |
The value codifying healthy individuals in the variable group. |
data |
Data frame representing the data and containing all needed variables. |
p |
Set of false positive fractions (FPF) at which to estimate the pooled ROC curve. |
B |
An integer value specifying the number of bootstrap resamples for the construction of the confidence intervals. The default is 1000. |
ci.level |
A numeric value (between 0 and 1) specifying the confidence level. The default is 0.95. |
method |
A character string specifying whether bootstrap resampling (for the confidence intervals) should be done with or without regard to the disease status ("coutcome" or "ncoutcome"). In both cases, a naive bootstrap is used. By default, the resampling is done conditionally on the disease status. |
pauc |
A list of control values to replace the default values returned by the function pauccontrol. |
parallel |
A character string with the type of parallel operation: either "no" (default), "multicore" (not available on Windows) or "snow". |
ncpus |
An integer with the number of processes to be used in parallel operation. Defaults to 1. |
cl |
An object inheriting from class |
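The two resampling schemes selected through the method argument can be illustrated with a minimal base R sketch. The helper boot_rows and its conditional flag are hypothetical names introduced here for illustration; they are not part of ROCnReg, and the package's internal implementation may differ.

```r
# Hypothetical helper (not a ROCnReg function): draw one naive bootstrap
# resample of the rows of `data`, with or without regard to `group`.
boot_rows <- function(data, group, conditional = TRUE) {
  n <- nrow(data)
  if (conditional) {
    # Resample within each level of `group`, so the group sizes stay fixed
    idx <- unlist(lapply(split(seq_len(n), data[[group]]),
                         function(i) i[sample.int(length(i), replace = TRUE)]))
  } else {
    # Resample all rows at once, so the group sizes vary between resamples
    idx <- sample.int(n, replace = TRUE)
  }
  data[idx, , drop = FALSE]
}

set.seed(1)
toy <- data.frame(status = rep(c(0, 1), times = c(30, 20)), marker = rnorm(50))
b_cond <- boot_rows(toy, "status", conditional = TRUE)
b_unc  <- boot_rows(toy, "status", conditional = FALSE)
table(b_cond$status)  # group sizes are preserved: 30 healthy, 20 diseased
```

Resampling within groups keeps the numbers of healthy and diseased individuals fixed across resamples, whereas resampling all rows lets them vary.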
Details
Estimates the pooled ROC curve (ROC) defined as
ROC(p) = 1 - F_{D}\{F_{\bar{D}}^{-1}(1-p)\},
where
F_{D}(y) = Pr(Y_{D} \leq y),
F_{\bar{D}}(y) = Pr(Y_{\bar{D}} \leq y).
The method implemented in this function estimates F_{D}(\cdot) and F_{\bar{D}}(\cdot) by means of the empirical distributions. More precisely, letting \{y_{\bar{D}i}\}_{i=1}^{n_{\bar{D}}} and \{y_{Dj}\}_{j=1}^{n_{D}} be two independent random samples from the nondiseased and diseased populations, respectively, the empirical distribution functions in each group take the form
\widehat{F}_{D}(y)=\frac{1}{n_D}\sum_{j=1}^{n_D}I(y_{Dj}\leq y),
\widehat{F}_{\bar{D}}(y)=\frac{1}{n_{\bar{D}}}\sum_{i=1}^{n_{\bar{D}}}I(y_{\bar{D}i}\leq y).
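As a rough illustration of how the two empirical distribution functions combine into the ROC estimate, the following base R sketch plugs them into the definition of ROC(p). The helper emp_roc and the simulated data are assumptions made for this example; this is not the code used inside pooledROC.emp.

```r
# Hypothetical helper (not part of ROCnReg): empirical pooled ROC curve
# evaluated at a grid of false positive fractions p.
emp_roc <- function(y_h, y_d, p = seq(0, 1, length.out = 101)) {
  # Empirical distribution function of the diseased group
  F_d <- ecdf(y_d)
  # Inverse of the empirical distribution function of the healthy group,
  # evaluated at 1 - p (type = 1 is the inverse empirical CDF)
  thr <- quantile(y_h, probs = 1 - p, type = 1, names = FALSE)
  # ROC(p) = 1 - F_D{F_Dbar^{-1}(1 - p)}
  1 - F_d(thr)
}

# Simulated marker values, for illustration only
set.seed(1)
y_h <- rnorm(200, mean = 0)    # healthy (nondiseased) group
y_d <- rnorm(150, mean = 1.5)  # diseased group
roc <- emp_roc(y_h, y_d)
```

The resulting vector is a non-decreasing step function of p taking values in [0, 1].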
The area under the curve is
AUC=\int_{0}^{1}ROC(p)dp
and is estimated empirically by means of the Mann-Whitney U-statistic. With regard to the partial area under the curve, when focus = "FPF" and assuming an upper bound u_1 for the FPF, the quantity computed is
pAUC_{FPF}(u_1)=\int_0^{u_1} ROC(p)dp,
which is again estimated empirically. The returned value is the normalised pAUC, pAUC_{FPF}(u_1)/u_1, so that it ranges from u_1/2 (useless test) to 1 (perfect test). Conversely, when focus = "TPF", and assuming a lower bound u_2 for the TPF, the partial area corresponding to TPFs lying in the interval (u_2,1) is computed as
pAUC_{TPF}(u_2)=\int_{u_2}^{1}ROC_{TNF}(p)dp,
where ROC_{TNF}(p) is a 270^\circ rotation of the ROC curve, which can be expressed as ROC_{TNF}(p) = F_{\bar{D}}\{F_{D}^{-1}(1-p)\}.
Again, ROC_{TNF}(p) is estimated empirically. The returned value is the normalised pAUC, pAUC_{TPF}(u_2)/(1-u_2), so that it ranges from (1-u_2)/2 (useless test) to 1 (perfect test).
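The AUC and the normalised partial areas can be sketched in base R as follows. The helpers emp_auc and emp_pauc are hypothetical and not part of ROCnReg. The pAUC sketch rewrites each integral as a mean over placement values, which matches the integral of the empirical curve only under the assumption that there are no ties between the two groups; the package itself may compute these quantities differently.

```r
# Hypothetical helpers for illustration; none of these are ROCnReg functions.

# AUC as the Mann-Whitney U-statistic scaled to [0, 1]; ties count one half.
emp_auc <- function(y_h, y_d) {
  mean(outer(y_d, y_h, ">") + 0.5 * outer(y_d, y_h, "=="))
}

# Normalised empirical pAUC, assuming no ties between the two groups.
emp_pauc <- function(y_h, y_d, value, focus = c("FPF", "TPF")) {
  focus <- match.arg(focus)
  if (focus == "FPF") {
    # Placement values of the diseased outcomes: 1 - F_Dbar(y_Dj)
    u <- 1 - ecdf(y_h)(y_d)
    # Integral of ROC(p) over (0, value), divided by value
    (value - mean(pmin(value, u))) / value
  } else {
    # Placement values of the healthy outcomes: 1 - F_D(y_Dbar_i)
    v <- 1 - ecdf(y_d)(y_h)
    # Integral of ROC_TNF(p) over (value, 1), divided by 1 - value
    (mean(pmax(value, v)) - value) / (1 - value)
  }
}

# Simulated marker values, for illustration only
set.seed(1)
y_h <- rnorm(200)              # healthy (nondiseased) group
y_d <- rnorm(150, mean = 1.5)  # diseased group

auc      <- emp_auc(y_h, y_d)
pauc_fpf <- emp_pauc(y_h, y_d, value = 0.5, focus = "FPF")
pauc_tpf <- emp_pauc(y_h, y_d, value = 0.8, focus = "TPF")
```

As a consistency check, setting value = 1 with focus = "FPF", or value = 0 with focus = "TPF", recovers the full AUC when there are no ties.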
Value
As a result, the function provides a list with the following components:
call |
The matched call. |
marker |
A list with the diagnostic test outcomes in the healthy (h) and diseased (d) groups. |
missing.ind |
A logical value indicating whether missing values occur. |
p |
Set of false positive fractions (FPF) at which the pooled ROC curve has been estimated. |
ci.level |
The value of the argument ci.level used in the call. |
ROC |
Estimated pooled ROC curve, and corresponding |
AUC |
Estimated pooled AUC, and corresponding |
pAUC |
If computed, estimated partial area under the pooled ROC curve along with its |
References
Hsieh, F., and Turnbull, B.W. (1996). Nonparametric and semiparametric estimation of the receiver operating characteristic curve, The Annals of Statistics, 24, 25–40.
See Also
AROC.bnp, AROC.sp, AROC.kernel, pooledROC.BB, pooledROC.emp, pooledROC.kernel, pooledROC.dpm, cROC.bnp, cROC.sp or cROC.kernel.
Examples
library(ROCnReg)
data(psa)
# Select the last measurement
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE),]
# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)
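# Empirical pooled ROC curve with bootstrap confidence intervals
# (B = 10 keeps the example fast; the default is 1000)
m0_emp <- pooledROC.emp(marker = "l_marker1", group = "status",
                        tag.h = 0, data = newpsa,
                        p = seq(0, 1, length.out = 101), B = 10,
                        method = "coutcome",
                        pauc = pauccontrol(compute = TRUE, value = 0.5, focus = "FPF"))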
summary(m0_emp)
plot(m0_emp)