| pooledROC.emp {ROCnReg} | R Documentation |
Empirical estimation of the pooled ROC curve.
Description
This function estimates the pooled ROC curve using the empirical estimator proposed by Hsieh and Turnbull (1996).
Usage
pooledROC.emp(marker, group, tag.h, data,
p = seq(0, 1, l = 101), B = 1000, ci.level = 0.95,
method = c("ncoutcome", "coutcome"), pauc = pauccontrol(),
parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL)
Arguments
marker |
A character string with the name of the diagnostic test variable. |
group |
A character string with the name of the variable that distinguishes healthy from diseased individuals. |
tag.h |
The value codifying healthy individuals in the variable group. |
data |
Data frame representing the data and containing all needed variables. |
p |
Set of false positive fractions (FPF) at which to estimate the pooled ROC curve. |
B |
An integer value specifying the number of bootstrap resamples for the construction of the confidence intervals. The default is 1000. |
ci.level |
A numeric value (between 0 and 1) specifying the confidence level. The default is 0.95. |
method |
A character string specifying whether bootstrap resampling (for the confidence intervals) should be done with or without regard to the disease status (“coutcome” or “ncoutcome”). In both cases, a naive bootstrap is used. By default, the resampling is done conditionally on the disease status. |
pauc |
A list of control values to replace the default values returned by the function pauccontrol. |
parallel |
A character string with the type of parallel operation: either "no" (default), "multicore" (not available on Windows) or "snow". |
ncpus |
An integer with the number of processes to be used in parallel operation. Defaults to 1. |
cl |
An object inheriting from class |
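The two resampling schemes described for the argument method can be illustrated with a minimal base R sketch. It assumes that resampling "with regard to the disease status" means drawing bootstrap samples within each group separately, and it uses percentile intervals for the AUC, which is an assumption rather than something this page states. The helper boot_auc_ci and its stratified flag are hypothetical and are not part of ROCnReg.

```r
# Hypothetical helper (not part of ROCnReg): naive bootstrap percentile
# interval for the empirical AUC, with or without stratification by group.
boot_auc_ci <- function(y_h, y_d, B = 200, ci.level = 0.95, stratified = TRUE) {
  # Empirical AUC: proportion of (diseased, healthy) pairs with y_d > y_h,
  # ties counted as 1/2 (assumed convention)
  auc <- function(h, d) mean(outer(d, h, ">") + 0.5 * outer(d, h, "=="))
  y <- c(y_h, y_d)
  is_d <- rep(c(FALSE, TRUE), c(length(y_h), length(y_d)))
  stat <- replicate(B, {
    if (stratified) {
      # Resample within each group, so group sizes stay fixed
      h <- sample(y_h, replace = TRUE)
      d <- sample(y_d, replace = TRUE)
    } else {
      # Resample individuals from the pooled sample, so group sizes vary
      idx <- sample(length(y), replace = TRUE)
      h <- y[idx][!is_d[idx]]
      d <- y[idx][is_d[idx]]
    }
    auc(h, d)
  })
  alpha <- 1 - ci.level
  # na.rm guards against the unlikely case of an empty group in a resample
  quantile(stat, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE, na.rm = TRUE)
}

set.seed(123)
y_h <- rnorm(200, mean = 0)    # simulated healthy marker values (assumption)
y_d <- rnorm(150, mean = 1.5)  # simulated diseased marker values (assumption)
ci_strat <- boot_auc_ci(y_h, y_d, stratified = TRUE)
ci_pool  <- boot_auc_ci(y_h, y_d, stratified = FALSE)
```

With group sizes of this order the two schemes typically give very similar intervals; they differ mainly in whether the group sizes are treated as fixed or random.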
Details
Estimates the pooled ROC curve (ROC) defined as
ROC(p) = 1 - F_{D}\{F_{\bar{D}}^{-1}(1-p)\},
where
F_{D}(y) = Pr(Y_{D} \leq y),
F_{\bar{D}}(y) = Pr(Y_{\bar{D}} \leq y).
The method implemented in this function estimates F_{D}(\cdot) and F_{\bar{D}}(\cdot) by means of the empirical distributions. More precisely, letting \{y_{\bar{D}i}\}_{i=1}^{n_{\bar{D}}} and \{y_{Dj}\}_{j=1}^{n_{D}} be two independent random samples from the nondiseased and diseased populations, respectively, the distribution functions in each group take the form
\widehat{F}_{D}(y)=\frac{1}{n_D}\sum_{j=1}^{n_D}I(y_{Dj}\leq y),
\widehat{F}_{\bar{D}}(y)=\frac{1}{n_{\bar{D}}}\sum_{i=1}^{n_{\bar{D}}}I(y_{\bar{D}i}\leq y).
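The plug-in estimator obtained from these two empirical distribution functions can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's internal code: the helper emp_roc is hypothetical, the data are simulated, and the inverse of the empirical distribution function is taken to be quantile() with type = 1.

```r
# Hypothetical helper (not part of ROCnReg): empirical pooled ROC on a grid p.
emp_roc <- function(y_h, y_d, p = seq(0, 1, length.out = 101)) {
  F_d <- ecdf(y_d)  # empirical distribution function in the diseased group
  # Empirical quantile function of the healthy group evaluated at 1 - p
  # (type = 1 is the inverse of the empirical distribution function)
  thr <- quantile(y_h, probs = 1 - p, type = 1, names = FALSE)
  1 - F_d(thr)      # ROC(p) = 1 - F_D{F_Dbar^{-1}(1 - p)}
}

set.seed(1)
y_h <- rnorm(200, mean = 0)    # simulated healthy marker values (assumption)
y_d <- rnorm(150, mean = 1.5)  # simulated diseased marker values (assumption)
roc <- emp_roc(y_h, y_d)

# Small hand-checkable case
roc_small <- emp_roc(c(1, 2, 3, 4), c(2.5, 3.5, 4.5, 5), p = c(0, 0.25, 0.5, 1))
```

The resulting curve is a non-decreasing step function of p with values in [0, 1].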
The area under the curve is
AUC=\int_{0}^{1}ROC(p)dp
and is estimated empirically by means of the Mann-Whitney U-statistic. With regard to the partial area under the curve, when focus = "FPF" and assuming an upper bound u_1 for the FPF, the quantity computed is
pAUC_{FPF}(u_1)=\int_0^{u_1} ROC(p)dp,
which is again estimated empirically. The returned value is the normalised pAUC, pAUC_{FPF}(u_1)/u_1, so that it ranges from u_1/2 (useless test) to 1 (perfect test). Conversely, when focus = "TPF", and assuming a lower bound u_2 for the TPF, the partial area corresponding to TPFs lying in the interval (u_2,1) is computed as
pAUC_{TPF}(u_2)=\int_{u_2}^{1}ROC_{TNF}(p)dp,
where ROC_{TNF}(p) is a 270^\circ rotation of the ROC curve, and it can be expressed as ROC_{TNF}(p) = F_{\bar{D}}\{F_{D}^{-1}(1-p)\}. Again, ROC_{TNF}(p) is estimated empirically. The returned value is the normalised pAUC, pAUC_{TPF}(u_2)/(1-u_2), so that it ranges from (1-u_2)/2 (useless test) to 1 (perfect test).
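The summary measures above can be approximated with base R as follows. This is a rough sketch, not the package's implementation: the helpers emp_auc, emp_pauc_fpf and emp_pauc_tpf are hypothetical, ties are counted as 1/2 in the Mann-Whitney statistic (an assumed convention), and the partial areas are obtained by a midpoint rule on a fine grid, which only approximates the integral of the step function.

```r
# Hypothetical helpers (not part of ROCnReg).

# AUC as the Mann-Whitney statistic: proportion of (diseased, healthy)
# pairs with y_d > y_h, ties counted as 1/2 (assumed convention)
emp_auc <- function(y_h, y_d) {
  mean(outer(y_d, y_h, ">") + 0.5 * outer(y_d, y_h, "=="))
}

# Normalised pAUC for FPF in (0, u1): midpoint-rule approximation of
# (1 / u1) * integral_0^{u1} ROC(p) dp
emp_pauc_fpf <- function(y_h, y_d, u1, n_grid = 10000) {
  p_mid <- (seq_len(n_grid) - 0.5) * u1 / n_grid
  thr <- quantile(y_h, probs = 1 - p_mid, type = 1, names = FALSE)
  mean(1 - ecdf(y_d)(thr))
}

# Normalised pAUC for TPF in (u2, 1): midpoint-rule approximation of
# (1 / (1 - u2)) * integral_{u2}^{1} ROC_TNF(p) dp,
# with ROC_TNF(p) = F_Dbar{F_D^{-1}(1 - p)}
emp_pauc_tpf <- function(y_h, y_d, u2, n_grid = 10000) {
  p_mid <- u2 + (seq_len(n_grid) - 0.5) * (1 - u2) / n_grid
  thr <- quantile(y_d, probs = 1 - p_mid, type = 1, names = FALSE)
  mean(ecdf(y_h)(thr))
}

set.seed(1)
y_h <- rnorm(200, mean = 0)    # simulated healthy marker values (assumption)
y_d <- rnorm(150, mean = 1.5)  # simulated diseased marker values (assumption)

auc      <- emp_auc(y_h, y_d)
pauc_fpf <- emp_pauc_fpf(y_h, y_d, u1 = 0.5)
pauc_tpf <- emp_pauc_tpf(y_h, y_d, u2 = 0.8)
```

As a sanity check, taking u1 = 1 or u2 = 0 should recover the AUC up to the grid approximation error, since both partial areas then cover the whole curve.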
Value
The function returns a list with the following components:
call |
The matched call. |
marker |
A list with the diagnostic test outcomes in the healthy (h) and diseased (d) groups. |
missing.ind |
A logical value indicating whether missing values occur. |
p |
Set of false positive fractions (FPF) at which the pooled ROC curve has been estimated. |
ci.level |
The value of the argument ci.level. |
ROC |
Estimated pooled ROC curve, and corresponding |
AUC |
Estimated pooled AUC, and corresponding |
pAUC |
If computed, estimated partial area under the pooled ROC curve along with its |
References
Hsieh, F., and Turnbull, B.W. (1996). Nonparametric and semiparametric estimation of the receiver operating characteristic curve, The Annals of Statistics, 24, 25–40.
See Also
AROC.bnp, AROC.sp, AROC.kernel, pooledROC.BB, pooledROC.emp, pooledROC.kernel, pooledROC.dpm, cROC.bnp, cROC.sp or cROC.kernel.
Examples
library(ROCnReg)
data(psa)
# Select the last measurement
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE),]
# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)
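# Empirical pooled ROC curve with B = 10 bootstrap resamples and the
# normalised pAUC for FPFs up to 0.5
m0_emp <- pooledROC.emp(marker = "l_marker1", group = "status",
                        tag.h = 0, data = newpsa, p = seq(0, 1, l = 101), B = 10,
                        method = "coutcome",
                        pauc = pauccontrol(compute = TRUE, value = 0.5, focus = "FPF"))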
summary(m0_emp)
plot(m0_emp)