AROC.sp {ROCnReg} | R Documentation |
Semiparametric frequentist inference for the covariate-adjusted ROC curve (AROC).
Description
This function estimates the covariate-adjusted ROC curve (AROC) using the semiparametric approach proposed by Janes and Pepe (2009).
Usage
AROC.sp(formula.h, group, tag.h, data,
        est.cdf.h = c("normal", "empirical"), pauc = pauccontrol(),
        p = seq(0, 1, l = 101), B = 1000, ci.level = 0.95,
        parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL)
Arguments
formula.h |
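A formula object specifying the location regression model to be fitted in the healthy population (see Details). |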
group |
A character string with the name of the variable that distinguishes healthy from diseased individuals. |
tag.h |
The value codifying healthy individuals in the variable group. |
data |
A data frame representing the data and containing all needed variables. |
est.cdf.h |
A character string indicating how the conditional distribution function of the diagnostic test in the healthy population is estimated. Options are "normal" and "empirical" (see Details). |
pauc |
A list of control values to replace the default values returned by the function pauccontrol. |
p |
Set of false positive fractions (FPF) at which to estimate the covariate-adjusted ROC curve. This set is also used to compute the area under the covariate-adjusted ROC curve (AAUC) using Simpson's rule. Thus, the length of the set should be an odd number, and it should be rich enough for an accurate estimation. |
B |
An integer value specifying the number of bootstrap resamples for the construction of the confidence intervals. The default is 1000. |
ci.level |
A numeric value (between 0 and 1) specifying the confidence level. The default is 0.95. |
parallel |
A character string with the type of parallel operation: either "no" (default), "multicore" (not available on Windows) or "snow". |
ncpus |
An integer with the number of processes to be used in parallel operation. Defaults to 1. |
cl |
An object inheriting from class |
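The requirement that p have odd length follows from how composite Simpson's rule works: it pairs up subintervals, so it needs an even number of intervals and hence an odd number of grid points. A minimal sketch, assuming an equally spaced grid, is given below. The helper simpson() and the toy curve are hypothetical illustrations and are not part of ROCnReg.

```r
# Composite Simpson's rule on an equally spaced grid (hypothetical helper,
# not a ROCnReg function).
simpson <- function(y, h) {
  n <- length(y)
  if (n < 3 || n %% 2 == 0) {
    stop("Simpson's rule needs an odd number of points (at least 3)")
  }
  interior <- seq_len(n)[-c(1, n)]        # positions 2, ..., n - 1
  w4 <- interior[interior %% 2 == 0]      # positions 2, 4, ... get weight 4
  w2 <- interior[interior %% 2 == 1]      # positions 3, 5, ... get weight 2
  h / 3 * (y[1] + y[n] + 4 * sum(y[w4]) + 2 * sum(y[w2]))
}

# Same grid as the default value of the argument p
p <- seq(0, 1, length.out = 101)

# Toy concave curve standing in for an estimated AROC (NOT real output)
roc_toy <- 1 - (1 - p)^2

# Area under the toy curve; the exact value of the integral is 2/3
auc_toy <- simpson(roc_toy, h = p[2] - p[1])
```

Because the toy curve is a quadratic, Simpson's rule integrates it exactly up to floating-point error. For a step-shaped empirical curve the approximation error depends on how fine the grid is, which is why p should be rich enough.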
Details
Estimates the covariate-adjusted ROC curve (AROC) defined as
AROC\left(p\right) = Pr\{1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}) \leq p\},
where
F_{\bar{D}}(y|\mathbf{x}) = Pr\{Y_{\bar{D}} \leq y | \mathbf{X}_{\bar{D}} = \mathbf{x}\}.
The method implemented in this function estimates the outer probability empirically (see Janes and Pepe, 2009), and F_{\bar{D}}(\cdot|\mathbf{x}) is estimated assuming a semiparametric location regression model for Y_{\bar{D}}, i.e.,
Y_{\bar{D}} = \mathbf{X}_{\bar{D}}^{T}\mathbf{\beta}_{\bar{D}} + \sigma_{\bar{D}}\varepsilon_{\bar{D}},
where \varepsilon_{\bar{D}} has zero mean, variance one, and distribution function G_{\bar{D}}. As a consequence, we have
F_{\bar{D}}(y | \mathbf{x}) = G_{\bar{D}}\left(\frac{y-\mathbf{x}^{T}\mathbf{\beta}_{\bar{D}}}{\sigma_{\bar{D}}}\right).
In line with the assumptions made about the distribution of \varepsilon_{\bar{D}}, estimators will be referred to as: (a) "normal", where a standard Gaussian error is assumed, i.e., G_{\bar{D}}(y) = \Phi(y); and (b) "empirical", where no assumption is made about the distribution (in this case, G_{\bar{D}} is empirically estimated on the basis of standardised residuals).
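The estimation steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used inside AROC.sp: the data are simulated, all object names are hypothetical, and \sigma_{\bar{D}} is taken to be the residual standard error reported by lm(), which may differ from the estimator used by the package.

```r
set.seed(1)

# Simulated data, purely illustrative: covariate x and test outcome y
n0 <- 300   # healthy individuals
n1 <- 200   # diseased individuals
healthy  <- data.frame(x = runif(n0, 40, 80))
diseased <- data.frame(x = runif(n1, 40, 80))
healthy$y  <- 0.5 + 0.02 * healthy$x  + rnorm(n0, sd = 0.5)
diseased$y <- 1.5 + 0.02 * diseased$x + rnorm(n1, sd = 0.5)

# Step 1: location regression model fitted in the healthy population only
fit_h   <- lm(y ~ x, data = healthy)
sigma_h <- summary(fit_h)$sigma          # assumed estimator of sigma
res_h   <- residuals(fit_h) / sigma_h    # standardised residuals

# Step 2: standardise the diseased outcomes using the healthy-population fit
z_d <- (diseased$y - predict(fit_h, newdata = diseased)) / sigma_h

# Step 3: 1 - F(y_D | x_D) under each choice of G
pv_normal    <- 1 - pnorm(z_d)           # "normal": G is the standard Gaussian cdf
pv_empirical <- 1 - ecdf(res_h)(z_d)     # "empirical": G is the ecdf of res_h

# Step 4: empirical estimate of the outer probability on a grid of FPFs
p <- seq(0, 1, length.out = 101)
aroc_normal    <- vapply(p, function(u) mean(pv_normal <= u), numeric(1))
aroc_empirical <- vapply(p, function(u) mean(pv_empirical <= u), numeric(1))
```

Both curves are step functions in p, non-decreasing and equal to 1 at p = 1. The two versions differ only in Step 3, that is, in how G_{\bar{D}} is evaluated.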
The area under the AROC curve is
AAUC=\int_0^1 AROC(p)dp,
and there exists a closed-form estimator. With regard to the partial area under the AROC curve, when focus = "FPF" and assuming an upper bound u_1 for the FPF, what is computed is
pAAUC_{FPF}(u_1)=\int_0^{u_1} AROC(p)dp,
where again there exists a closed-form estimator. The returned value is the normalised pAAUC, pAAUC_{FPF}(u_1)/u_1, so that it ranges from u_1/2 (useless test) to 1 (perfect test). Conversely, when focus = "TPF", and assuming a lower bound for the TPF of u_2, the partial area corresponding to TPFs lying in the interval (u_2,1) is computed as
pAAUC_{TPF}(u_2)=\int_{AROC^{-1}(u_2)}^{1}AROC(p)dp-\{1-AROC^{-1}(u_2)\}\times u_2.
Here, the computation of the integral is done numerically. The returned value is the normalised pAAUC, pAAUC_{TPF}(u_2)/(1-u_2), so that it ranges from (1-u_2)/2 (useless test) to 1 (perfect test).
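To see why closed forms exist, write U = 1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}). When the outer probability is estimated empirically, AROC(p) is the proportion of estimated U values not exceeding p, and integrating the indicator 1(U \leq p) over (0, u_1) gives max(u_1 - U, 0). The sketch below illustrates this identity and the TPF-focused area in base R. It is an assumption that AROC.sp uses exactly these expressions internally; the values in pv are simulated stand-ins, and all object names are hypothetical.

```r
set.seed(2)

# Stand-in values of U = 1 - F(Y_D | X_D), simulated for illustration only
pv <- rbeta(500, shape1 = 1, shape2 = 3)

# Empirical AROC: proportion of pv values not exceeding p
aroc <- ecdf(pv)

# AAUC: integrating 1(U <= p) over (0, 1) gives 1 - U
aauc <- 1 - mean(pv)

# pAAUC with focus = "FPF": integrating over (0, u1) gives max(u1 - U, 0)
u1 <- 0.5
paauc_fpf      <- mean(pmax(u1 - pv, 0))
paauc_fpf_norm <- paauc_fpf / u1          # normalised version

# Trapezoidal rule on an equally spaced grid (hypothetical helper)
trap <- function(y, h) h * (sum(y) - (y[1] + y[length(y)]) / 2)

# Numerical checks of the two closed forms
grid     <- seq(0, 1, length.out = 10001)
aauc_num <- trap(aroc(grid), grid[2] - grid[1])
grid_u1  <- seq(0, u1, length.out = 5001)
fpf_num  <- trap(aroc(grid_u1), grid_u1[2] - grid_u1[1])

# pAAUC with focus = "TPF": since AROC is non-decreasing, the area above u2
# equals the integral of max(AROC(p) - u2, 0), computed numerically here
u2 <- 0.8
paauc_tpf      <- trap(pmax(aroc(grid) - u2, 0), grid[2] - grid[1])
paauc_tpf_norm <- paauc_tpf / (1 - u2)    # normalised version
```

The numerical values aauc_num and fpf_num agree with the closed forms up to the discretisation error of the grid, which is the reason a fine grid is used in the check.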
Value
As a result, the function provides a list with the following components:
call |
The matched call. |
data |
The original supplied data argument. |
missing.ind |
A logical vector indicating, for each observation (test outcome and covariates), whether missing values occur. |
marker |
The name of the diagnostic test variable in the dataframe. |
group |
The value of the argument group. |
tag.h |
The value of the argument tag.h. |
formula |
The value of the argument formula.h. |
est.cdf.h |
The value of the argument est.cdf.h. |
p |
Set of false positive fractions (FPF) at which the covariate-adjusted ROC (AROC) curve has been estimated. |
ci.level |
The value of the argument ci.level. |
ROC |
Estimated covariate-adjusted ROC curve (AROC), and |
AUC |
Estimated area under the covariate-adjusted ROC curve (AAUC), and |
pAUC |
If computed, estimated partial area under the covariate-adjusted ROC curve (pAAUC) and |
fit |
Object of class |
coeff |
Estimated regression coefficients (and |
References
Janes, H., and Pepe, M.S. (2009). Adjusting for covariate effects on classification accuracy using the covariate-adjusted receiver operating characteristic curve. Biometrika, 96(2), 371-382.
See Also
AROC.bnp, AROC.sp, AROC.kernel, pooledROC.BB, pooledROC.emp, pooledROC.kernel, pooledROC.dpm, cROC.bnp, cROC.sp or cROC.kernel.
Examples
library(ROCnReg)
data(psa)

# Select the last measurement of each subject
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE), ]

# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)

# Fit the semiparametric AROC with Gaussian errors, adjusting for age;
# also compute the partial area for FPFs up to 0.5
m3 <- AROC.sp(formula.h = l_marker1 ~ age,
              group = "status",
              tag.h = 0,
              data = newpsa,
              est.cdf.h = "normal",
              pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.5),
              p = seq(0, 1, length.out = 101),
              B = 500)

summary(m3)
plot(m3)