AROC.sp {AROC}    R Documentation

Semiparametric frequentist inference of the covariate-adjusted ROC curve (AROC).

Description

Estimates the covariate-adjusted ROC curve (AROC) using the semiparametric approach proposed by Janes and Pepe (2009).

Usage

AROC.sp(formula.healthy, group, tag.healthy, data, 
	est.surv.h = c("normal", "empirical"), p = seq(0, 1, l = 101), B = 1000)

Arguments

formula.healthy

A formula object specifying the location regression model to be fitted in the healthy population (see Details).

group

A character string with the name of the variable that distinguishes healthy from diseased individuals.

tag.healthy

The value that codes healthy individuals in the variable group.

data

A data frame containing all the variables needed.

est.surv.h

A character string indicating how the conditional distribution function of the diagnostic test in the healthy population is estimated. Options are "normal" and "empirical" (see Details). The default is "normal".

p

Set of false positive fractions (FPF) at which to estimate the covariate-adjusted ROC curve. The default is seq(0, 1, l = 101).

B

An integer value specifying the number of bootstrap resamples used to construct the confidence intervals. The default is 1000.

Details

Estimates the covariate-adjusted ROC curve (AROC) defined as

AROC\left(t\right) = Pr\{1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}) \leq t\},

where F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}}) denotes the distribution function of Y_{\bar{D}} conditional on the vector of covariates \mathbf{X}_{\bar{D}}. In particular, the method implemented in this function estimates the outer probability empirically (see Janes and Pepe, 2009), and F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}}) is estimated assuming a semiparametric location regression model for Y_{\bar{D}}, i.e.,

Y_{\bar{D}} = \mathbf{X}_{\bar{D}}^{T}\mathbf{\beta}_{\bar{D}} + \sigma_{\bar{D}}\varepsilon_{\bar{D}},

such that, for a random sample \{(\mathbf{x}_{\bar{D}i}, y_{\bar{D}i})\}_{i=1}^{n_{\bar{D}}} from the healthy population, we have

F_{\bar{D}}(y | \mathbf{X}_{\bar{D}}=\mathbf{x}_{\bar{D}i}) = F_{\bar{D}}\left(\frac{y-\mathbf{x}_{\bar{D}i}^{T}\mathbf{\beta}_{\bar{D}}}{\sigma_{\bar{D}}}\right),

where F_{\bar{D}} is the distribution function of \varepsilon_{\bar{D}}. In line with the assumptions made about the distribution of \varepsilon_{\bar{D}}, estimators will be referred to as: (a) "normal", where Gaussian error is assumed, i.e., F_{\bar{D}}(y) = \Phi(y); and, (b) "empirical", where no assumption is made about the distribution (in this case, the distribution function F_{\bar{D}} is empirically estimated on the basis of standardised residuals).
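The steps above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the simulated data, all variable names, and the use of the residual standard error from lm as the estimate of \sigma_{\bar{D}} are assumptions made purely for illustration.

```r
# Illustrative sketch only; data and names below are hypothetical.
set.seed(1)

# Simulated data: status 0 = healthy, 1 = diseased
n_h <- 200; n_d <- 150
dat <- data.frame(
  status = rep(c(0, 1), c(n_h, n_d)),
  age    = runif(n_h + n_d, 40, 80)
)
dat$y <- 0.02 * dat$age + 1.5 * dat$status + rnorm(n_h + n_d)

healthy  <- dat[dat$status == 0, ]
diseased <- dat[dat$status == 1, ]

# Step 1: location regression model fitted in the healthy population
fit_h   <- lm(y ~ age, data = healthy)
sigma_h <- summary(fit_h)$sigma           # assumed estimate of sigma
res_h   <- residuals(fit_h) / sigma_h     # standardised residuals

# Step 2: standardise the diseased outcomes using the healthy fit
z_d <- (diseased$y - predict(fit_h, newdata = diseased)) / sigma_h

# Step 3: 1 - F(Y_D | X_D) under each assumption about the error
pv_normal    <- 1 - pnorm(z_d)            # "normal": F = Phi
pv_empirical <- 1 - ecdf(res_h)(z_d)      # "empirical": ECDF of residuals

# Step 4: outer probability estimated empirically on a grid of FPFs
p <- seq(0, 1, length.out = 101)
aroc_normal    <- ecdf(pv_normal)(p)
aroc_empirical <- ecdf(pv_empirical)(p)
```

In this sketch the two options differ only in Step 3, so comparing aroc_normal with aroc_empirical shows how much the Gaussian assumption matters for a given data set.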

Value

The function returns a list with the following components:

call

The matched call.

p

Set of false positive fractions (FPF) at which the covariate-adjusted ROC curve has been estimated.

ROC

Estimated covariate-adjusted ROC curve (AROC), and 95% pointwise confidence intervals (if required).

AUC

Estimated area under the covariate-adjusted ROC curve (AAUC), and 95% confidence interval (if required).

fit.h

Object of class lm with the fitted regression model in the healthy population.

est.surv.h

The value of the argument est.surv.h used in the call.
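The AUC component summarises the AROC as the area under the curve. When the outer probability is estimated empirically, as described in Details, that area equals one minus the mean of the values 1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}) computed for the diseased individuals, since the integral over t in [0, 1] of the indicator of u \leq t is 1 - u. The sketch below checks this numerically. The vector pv is made up for illustration and is not produced by AROC.sp, and the trapezoidal rule is only a comparison device, not necessarily how the package computes the area.

```r
# Hypothetical values of 1 - F(Y_D | X_D), one per diseased individual
pv <- c(0.02, 0.10, 0.15, 0.30, 0.55, 0.05, 0.01, 0.40)

# Empirical AROC on a fine grid of false positive fractions
p    <- seq(0, 1, length.out = 1001)
aroc <- ecdf(pv)(p)

# Closed form for the area under the empirical AROC
auc_exact <- 1 - mean(pv)

# Trapezoidal approximation on the grid, for comparison
auc_trap <- sum(diff(p) * (head(aroc, -1) + tail(aroc, -1)) / 2)
```

The two numbers agree up to the resolution of the grid p; a coarser grid gives a rougher approximation.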

References

Janes, H., and Pepe, M.S. (2009). Adjusting for covariate effects on classification accuracy using the covariate-adjusted receiver operating characteristic curve. Biometrika, 96(2), 371-382.

See Also

AROC.bnp, AROC.bsp, AROC.kernel, pooledROC.BB or pooledROC.emp.

Examples

library(AROC)
data(psa)
# Select the last measurement
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE),]

# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)

m3 <- AROC.sp(formula.healthy = l_marker1 ~ age,
              group = "status", tag.healthy = 0, data = newpsa,
              p = seq(0, 1, l = 101), B = 500)

summary(m3)

plot(m3)



[Package AROC version 1.0-4 Index]