AROC.kernel {ROCnReg} | R Documentation |
Nonparametric kernel-based estimation of the covariate-adjusted ROC curve (AROC).
Description
This function estimates the covariate-adjusted ROC curve (AROC) using the nonparametric kernel-based method proposed by Rodriguez-Alvarez et al. (2011). The method, as it stands now, can only deal with one continuous covariate.
Usage
AROC.kernel(marker, covariate, group, tag.h,
            bw = c("LS", "AIC"),
            regtype = c("LC", "LL"),
            pauc = pauccontrol(),
            data, p = seq(0, 1, l = 101), B = 1000, ci.level = 0.95,
            parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL)
Arguments
marker |
A character string with the name of the diagnostic test variable. |
covariate |
A character string with the name of the continuous covariate. |
group |
A character string with the name of the variable that distinguishes healthy from diseased individuals. |
tag.h |
The value codifying healthy individuals in the variable |
bw |
A character string specifying which method to use to select the bandwidths. AIC specifies expected Kullback-Leibler cross-validation, and LS specifies least-squares cross-validation. Defaults to LS. For details see |
regtype |
A character string specifying which type of kernel estimator to use for the regression function (see Details). LC specifies a local-constant estimator (Nadaraya-Watson) and LL specifies a local-linear estimator. Defaults to LC. For details see |
pauc |
A list of control values to replace the default values returned by the function |
data |
A data frame representing the data and containing all needed variables. |
p |
Set of false positive fractions (FPF) at which to estimate the covariate-adjusted ROC curve. This set is also used to compute the area under the covariate-adjusted ROC curve (AAUC) using Simpson's rule. Thus, the length of the set should be an odd number and it should be rich enough for an accurate estimation. |
B |
An integer value specifying the number of bootstrap resamples for the construction of the confidence intervals. The default is 1000. |
ci.level |
A numeric value (between 0 and 1) specifying the confidence level. The default is 0.95. |
parallel |
A character string with the type of parallel operation: either "no" (default), "multicore" (not available on Windows) or "snow". |
ncpus |
An integer with the number of processes to be used in parallel operation. Defaults to 1. |
cl |
An object inheriting from class |
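The description of p states that the AAUC is computed over this grid with Simpson's rule, which is why an odd number of points is required. The following is a minimal sketch of composite Simpson's rule on an equally spaced grid. The helper simpson() and the toy curve are hypothetical illustrations and are not part of ROCnReg.

```r
# Composite Simpson's rule on an equally spaced grid.
# Hypothetical helper, not a ROCnReg function.
simpson <- function(y, x) {
  n <- length(x)
  if (n < 3 || n %% 2 == 0) {
    stop("Simpson's rule needs an odd number (>= 3) of grid points")
  }
  h <- (x[n] - x[1]) / (n - 1)
  # Weights 1, 4, 2, 4, ..., 2, 4, 1
  w <- c(1, rep(c(4, 2), length.out = n - 2), 1)
  h / 3 * sum(w * y)
}

p <- seq(0, 1, length.out = 101)  # same grid as the default value of p
roc_toy <- sqrt(p)                # toy concave curve, exact area 2/3
simpson(roc_toy, p)               # close to 2/3
```

A grid with an even number of points makes the helper stop with an error, which mirrors the recommendation that the length of p be odd.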
Details
Estimates the covariate-adjusted ROC curve (AROC) defined as
AROC\left(p\right) = Pr\{1 - F_{\bar{D}}(Y_D | X_{D}) \leq p\},
where F_{\bar{D}}(y|x) = Pr\{Y_{\bar{D}} \leq y | X_{\bar{D}} = x\}. In particular, the method implemented in this function estimates the outer probability empirically (see Janes and Pepe, 2009), while F_{\bar{D}}(y|x) is estimated assuming a nonparametric location-scale regression model for Y_{\bar{D}}, i.e.,
Y_{\bar{D}} = \mu_{\bar{D}}(X_{\bar{D}}) + \sigma_{\bar{D}}(X_{\bar{D}})\varepsilon_{\bar{D}},
where \mu_{\bar{D}}(x) = E(Y_{\bar{D}} | X_{\bar{D}} = x) is the regression function, \sigma^2_{\bar{D}}(x) = Var(Y_{\bar{D}} | X_{\bar{D}} = x) is the variance function, and \varepsilon_{\bar{D}} has zero mean, variance one, and distribution function G_{\bar{D}}. As a consequence,
F_{\bar{D}}(y | x) = G_{\bar{D}}\left(\frac{y - \mu_{\bar{D}}(x)}{\sigma_{\bar{D}}(x)}\right).
By default, both the regression and variance functions are estimated using the Nadaraya-Watson estimator (LC), and the bandwidths are selected using least-squares cross-validation (LS). The implementation relies on the R package np. No assumption is made about G_{\bar{D}}, which is estimated empirically from the standardised residuals.
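The estimation steps described above can be sketched in base R as follows. This is an illustration only, under stated assumptions: the data are simulated, the smoother nw() is a hand-written Nadaraya-Watson estimator with a Gaussian kernel, and the bandwidth h is fixed by hand, whereas AROC.kernel selects bandwidths by cross-validation through the np package. None of the names below belong to ROCnReg.

```r
set.seed(1)

# Nadaraya-Watson (local-constant) smoother with a Gaussian kernel and a
# FIXED bandwidth h (assumption: not the np-based, cross-validated fit).
nw <- function(x0, x, y, h) {
  sapply(x0, function(u) {
    w <- dnorm((x - u) / h)
    sum(w * y) / sum(w)
  })
}

# Simulated data (hypothetical): marker y, covariate x, status d (0 = healthy)
n <- 300
x <- runif(2 * n, 40, 80)
d <- rep(c(0, 1), each = n)
y <- 0.05 * x + 1.5 * d + rnorm(2 * n, sd = 1)

xh <- x[d == 0]; yh <- y[d == 0]  # healthy group
xd <- x[d == 1]; yd <- y[d == 1]  # diseased group
h  <- 5                           # bandwidth chosen by hand

# Step 1: regression and variance functions in the healthy group
mu_h  <- nw(xh, xh, yh, h)
res2  <- (yh - mu_h)^2
sig_h <- sqrt(nw(xh, xh, res2, h))

# Step 2: empirical distribution G of the standardised residuals
G <- ecdf((yh - mu_h) / sig_h)

# Step 3: placement values 1 - F(Y_D | X_D) for the diseased group
mu_d  <- nw(xd, xh, yh, h)
sig_d <- sqrt(nw(xd, xh, res2, h))
pv    <- 1 - G((yd - mu_d) / sig_d)

# Step 4: AROC(p) as the empirical distribution of the placement values
p    <- seq(0, 1, length.out = 101)
aroc <- sapply(p, function(u) mean(pv <= u))

# Area under this step-function estimate, in closed form
aauc <- 1 - mean(pv)
```

Because the outer probability is estimated by an empirical distribution function, integrating the resulting step function over (0, 1) reduces to one minus the mean of the placement values, which is the kind of closed-form expression the sketch uses for the area.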
The area under the AROC curve is
AAUC=\int_0^1 AROC(p)dp,
and there exists a closed-form estimator. With regard to the partial area under the curve, when focus = "FPF" and an upper bound u_1 for the FPF is given, what is computed is
pAAUC_{FPF}(u_1)=\int_0^{u_1} AROC(p)dp,
where again there exists a closed-form estimator. The returned value is the normalised pAAUC, pAAUC_{FPF}(u_1)/u_1, so that it ranges from u_1/2 (useless test) to 1 (perfect test). Conversely, when focus = "TPF" and a lower bound u_2 for the TPF is given, the partial area corresponding to TPFs lying in the interval (u_2,1) is computed as
pAAUC_{TPF}(u_2)=\int_{AROC^{-1}(u_2)}^{1}AROC(p)dp-\{1-AROC^{-1}(u_2)\}\times u_2.
Here, the integral is computed numerically. The returned value is the normalised pAAUC, pAAUC_{TPF}(u_2)/(1-u_2), so that it ranges from (1-u_2)/2 (useless test) to 1 (perfect test).
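The two normalised partial areas and their lower bounds can be checked numerically with a short sketch. The helpers trap(), paauc_fpf() and paauc_tpf() are hypothetical and use a plain trapezoidal rule on a known curve; they are not the estimators used internally by AROC.kernel.

```r
# Trapezoidal integral of a function f over [a, b] (hypothetical helper)
trap <- function(f, a, b, n = 2001) {
  x <- seq(a, b, length.out = n)
  y <- f(x)
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Normalised pAAUC for focus = "FPF": integral over (0, u1) divided by u1
paauc_fpf <- function(roc, u1) trap(roc, 0, u1) / u1

# Normalised pAAUC for focus = "TPF": area lying above the line TPF = u2,
# divided by (1 - u2); roc_inv(u2) is the FPF at which the curve reaches u2
paauc_tpf <- function(roc, roc_inv, u2) {
  lower <- roc_inv(u2)
  (trap(roc, lower, 1) - (1 - lower) * u2) / (1 - u2)
}

useless <- function(p) p                 # chance diagonal, its own inverse
perfect <- function(p) rep(1, length(p)) # TPF equal to 1 at every FPF

paauc_fpf(useless, 0.5)                  # u1 / 2 = 0.25
paauc_tpf(useless, useless, 0.8)         # (1 - u2) / 2 = 0.1
paauc_fpf(perfect, 0.5)                  # 1
```

For the chance diagonal the first integral is u_1^2/2 and the second expression simplifies to (1-u_2)^2/2, so dividing by u_1 and by (1-u_2) respectively gives the stated lower bounds.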
Value
As a result, the function provides a list with the following components:
call |
The matched call. |
data |
The original supplied data argument. |
missing.ind |
A logical value indicating, for each pair of observations (test outcome and covariate), whether missing values occur. |
marker |
The name of the diagnostic test variable in the dataframe. |
covariate |
The value of the argument |
group |
The value of the argument |
tag.h |
The value of the argument |
p |
Set of false positive fractions (FPF) at which the covariate-adjusted ROC curve has been estimated. |
ci.level |
The value of the argument |
ROC |
Estimated covariate-adjusted ROC curve (AROC), and |
AUC |
Estimated area under the covariate-adjusted ROC curve (AAUC), and |
pAUC |
If computed, estimated partial area under the covariate-adjusted ROC curve (pAAUC) and |
fit |
List with the following components: (1) |
References
Hayfield, T., and Racine, J. S. (2008). Nonparametric Econometrics: The np Package. Journal of Statistical Software, 27(5). URL http://www.jstatsoft.org/v27/i05/.
Inacio de Carvalho, V., and Rodriguez-Alvarez, M. X. (2022). The Covariate-Adjusted ROC Curve: The Concept and Its Importance, Review of Inferential Methods, and a New Bayesian Estimator. Statistical Science, 37, 541–561.
Janes, H., and Pepe, M.S. (2009). Adjusting for covariate effects on classification accuracy using the covariate-adjusted receiver operating characteristic curve. Biometrika, 96, 371–382.
Rodriguez-Alvarez, M. X., Roca-Pardinas, J., and Cadarso-Suarez, C. (2011). ROC curve and covariates: extending induced methodology to the non-parametric framework. Statistics and Computing, 21, 483–499.
See Also
AROC.bnp, AROC.sp, AROC.kernel, pooledROC.BB, pooledROC.emp, pooledROC.kernel, pooledROC.dpm, cROC.bnp, cROC.sp or cROC.kernel.
Examples
library(ROCnReg)
data(psa)
# Select the last measurement
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE),]
# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)
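# Fit the kernel-based AROC, adjusting for age, with the normalised
# partial area computed for FPFs up to 0.5
m2 <- AROC.kernel(marker = "l_marker1",
                  covariate = "age",
                  group = "status",
                  tag.h = 0,
                  data = newpsa,
                  bw = "LS",
                  regtype = "LC",
                  pauc = pauccontrol(compute = TRUE, focus = "FPF", value = 0.5),
                  B = 500)

# Numerical summary and plot of the fitted curve
summary(m2)
plot(m2)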