AROC.bnp {AROC} | R Documentation |

## Nonparametric Bayesian inference of the covariate-adjusted ROC curve (AROC).

### Description

Estimates the covariate-adjusted ROC curve (AROC) using the nonparametric Bayesian approach proposed by Inacio de Carvalho and Rodriguez-Alvarez (2018).

### Usage

```
AROC.bnp(formula.healthy, group, tag.healthy, data, scale = TRUE,
p = seq(0, 1, l = 101), paauc = paauccontrol(),
compute.lpml = FALSE, compute.WAIC = FALSE,
m0, S0, nu, Psi, alpha = 1, a = 2, b = 0.5, L = 10, nsim = 10000, nburn = 2000)
```

### Arguments

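`formula.healthy` |
A `formula` object specifying the model for the diagnostic test outcome in the healthy population. Flexible (B-spline) terms can be included through `f()` (see Note). |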

`group` |
A character string with the name of the variable that distinguishes healthy from diseased individuals. |

`tag.healthy` |
The value codifying the healthy individuals in the variable `group`. |

`data` |
Data frame representing the data and containing all needed variables. |

`scale` |
A logical value. If TRUE the test outcomes are scaled, i.e., are divided by the standard deviation. The default is TRUE. |

`p` |
Set of false positive fractions (FPF) at which to estimate the covariate-adjusted ROC curve. |

`paauc` |
A list of control values to replace the default values returned by the function `paauccontrol`. |

`compute.lpml` |
A logical value. If TRUE, the log pseudo marginal likelihood (LPML, Geisser and Eddy, 1979) and the conditional predictive ordinates (CPO) are computed. |

`compute.WAIC` |
A logical value. If TRUE, the widely applicable information criterion (WAIC, Gelman et al., 2014; Watanabe, 2010) is computed. |

`m0` |
A numeric vector. Hyperparameter; mean vector of the (multivariate) normal prior distribution for the mean of the normal component of the centering distribution. If missing, it is set to a vector of zeros of length `Q` (see Details). |

`S0` |
A numeric matrix. Hyperparameter; covariance matrix of the (multivariate) normal prior distribution for the mean of the normal component of the centering distribution. If missing, it is set to a diagonal matrix of dimension `Q \times Q` (see Details). |

`nu` |
A numeric value. Hyperparameter; degrees of freedom of the Wishart prior distribution for the precision matrix of the normal component of the centering distribution. If missing, it is set to |

`Psi` |
A numeric matrix. Hyperparameter; scale matrix of the Wishart distribution for the precision matrix of the normal component of the centering distribution. If missing, it is set to an identity matrix of dimension `Q \times Q` (see Details). |

`alpha` |
A numeric value. Precision parameter of the Dirichlet Process. The default is 1 (see Details). |

`a` |
A numeric value. Hyperparameter; shape parameter of the gamma prior distribution for the precision (inverse variance). The default is 2 (scaled data) (see Details). |

`b` |
A numeric value. Hyperparameter; rate parameter of the gamma prior distribution for the precision (inverse variance). The default is 0.5 (scaled data) (see Details). |

`L` |
A numeric value. Maximum number of mixture components for the B-splines dependent Dirichlet process mixture model. The default is 10 (see Details). |

`nsim` |
A numeric value. Total number of Gibbs sampler iterates (including the burn-in). The default is 10000. |

`nburn` |
A numeric value. Number of burn-in iterations. The default is 2000. |
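
To get a feel for the default gamma prior on the precision (`a = 2`, `b = 0.5`), the following sketch computes its implied mean and variance and checks the mean by simulation. It uses only base R; the variable names are illustrative and the snippet is not part of the package.

```r
# Default hyperparameters, as listed in the Usage section
a <- 2    # shape of the gamma prior on the precision
b <- 0.5  # rate of the gamma prior on the precision

# A Gamma(shape = a, rate = b) variable has mean a / b and variance a / b^2
prior_mean <- a / b
prior_var  <- a / b^2

# Monte Carlo check of the prior mean
set.seed(1)
draws <- rgamma(1e5, shape = a, rate = b)
c(prior_mean = prior_mean, mc_mean = mean(draws))

# Default grid of false positive fractions; seq(0, 1, l = 101) in the
# Usage section is the same call, relying on partial matching of length.out
p <- seq(0, 1, length.out = 101)
length(p)
```

With these defaults the prior is centred on a precision of 4, that is, on a component variance well below one on the scaled outcome.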

### Details

Estimates the covariate-adjusted ROC curve (AROC) defined as

`AROC\left(t\right) = Pr\{1 - F_{\bar{D}}(Y_D | \mathbf{X}_{D}) \leq t\},`

where `F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}})` denotes the distribution function of `Y_{\bar{D}}` conditional on the vector of covariates `\mathbf{X}_{\bar{D}}`. In particular, the method implemented in this function combines a B-splines dependent Dirichlet process mixture model to estimate `F_{\bar{D}}(\cdot|\mathbf{X}_{\bar{D}})` and the Bayesian bootstrap (Rubin, 1981) to estimate the outside probability. More precisely, letting `\{(\mathbf{x}_{\bar{D}i},y_{\bar{D}i})\}_{i=1}^{n_{\bar{D}}}` be a random sample from the nondiseased population, the model assumes

`F_{\bar{D}}(y_{\bar{D}i}|\mathbf{X}_{\bar{D}}=\mathbf{x}_{\bar{D}i}) = \sum_{l=1}^{L}\omega_l\Phi(y_{\bar{D}i}\mid\mu_{l}(\mathbf{x}_{\bar{D}i}),\sigma_l^2),`

where `\mu_{l}(\mathbf{x}_{\bar{D}i}) = \mathbf{z}_{\bar{D}i}^{T}\mathbf{\beta}_l` and `L` is pre-specified (maximum number of mixture components). The weights `\omega_l` result from a truncated version of the stick-breaking construction (`\omega_1=v_1`; `\omega_l=v_l\prod_{r<l}(1-v_r)`, `l=2,\ldots,L`; `v_1,\ldots,v_{L-1}\sim Beta(1,\alpha)`; `v_L=1`), `\mathbf{\beta}_l\sim N_{Q}(\mathbf{m},\mathbf{S})`, and `\sigma_l^{-2}\sim\Gamma(a,b)`. It is assumed that `\mathbf{m} \sim N_{Q}(\mathbf{m}_0,\mathbf{S}_0)` and `\mathbf{S}^{-1}\sim W(\nu,(\nu\Psi)^{-1})`. Here `W(\nu,(\nu\Psi)^{-1})` denotes a Wishart distribution with `\nu` degrees of freedom and expectation `\Psi^{-1}`, and `Q` denotes the dimension of the vector `\mathbf{z}_{\bar{D}i}`. For a detailed description, we refer to Inacio de Carvalho and Rodriguez-Alvarez (2018).
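
The two ingredients described above can be sketched in base R for a single set of parameter values. This is only an illustration of the formulas, not the Gibbs sampler used by `AROC.bnp`: the component parameters are drawn at random rather than from a posterior, the toy data are made up, and all object names (`stick_breaking`, `cdf_healthy`, `Z_d`, `y_d`, and so on) are hypothetical.

```r
set.seed(123)

# Truncated stick-breaking weights:
# v_1, ..., v_{L-1} ~ Beta(1, alpha), v_L = 1,
# omega_1 = v_1, omega_l = v_l * prod_{r < l} (1 - v_r)
stick_breaking <- function(L, alpha) {
  v <- c(rbeta(L - 1, 1, alpha), 1)
  v * c(1, cumprod(1 - v[-L]))
}

L <- 10
alpha <- 1
omega <- stick_breaking(L, alpha)

# Made-up component parameters (stand-ins for one posterior draw)
Q <- 2                                          # dimension of the design vector z
beta  <- matrix(rnorm(L * Q), nrow = L)         # row l holds beta_l
sigma <- 1 / sqrt(rgamma(L, shape = 2, rate = 0.5))  # sigma_l from the precisions

# Mixture CDF in the healthy population:
# F(y | z) = sum_l omega_l * Phi(y | z' beta_l, sigma_l^2)
cdf_healthy <- function(y, z) {
  mu <- as.vector(beta %*% z)
  sum(omega * pnorm(y, mean = mu, sd = sigma))
}

# Toy diseased sample: design vectors (intercept + one covariate) and outcomes
n_d <- 50
Z_d <- cbind(1, rnorm(n_d))
y_d <- rnorm(n_d, mean = 1)

# Values 1 - F(y_D | x_D) whose distribution function is the AROC
u <- 1 - vapply(seq_len(n_d),
                function(i) cdf_healthy(y_d[i], Z_d[i, ]),
                numeric(1))

# Bayesian bootstrap: Dirichlet(1, ..., 1) weights via normalised exponentials
q <- rexp(n_d)
q <- q / sum(q)

# AROC(t) = Pr(u <= t), estimated as a weighted empirical distribution function
p <- seq(0, 1, length.out = 101)
aroc <- vapply(p, function(t) sum(q * (u <= t)), numeric(1))
```

In a full analysis these steps would be repeated for every retained posterior draw, and the resulting curves summarised pointwise by their mean and quantiles.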

### Value

As a result, the function provides a list with the following components:

`call` |
The matched call. |

`p` |
Set of false positive fractions (FPF) at which the covariate-adjusted ROC curve has been estimated. |

`ROC` |
Estimated covariate-adjusted ROC curve (AROC) (posterior mean), and 95% pointwise posterior credible band. |

`AUC` |
Estimated area under the covariate-adjusted ROC curve (AAUC) (posterior mean), and 95% posterior credible interval. |

`pAUC` |
If required, estimated partial area under the covariate-adjusted ROC curve (pAAUC) (posterior mean), and 95% posterior credible interval. |

`lpml` |
If required, list with two components: the log pseudo marginal likelihood (LPML) and the conditional predictive ordinates (CPO). |

`WAIC` |
If required, widely applicable information criterion (WAIC). |

`fit` |
Results of the fitting process. It is a list with the following components: (1) |

`data_model` |
List with the data used in the fit: observed diagnostic test outcome and B-spline design matrices, separately for the healthy and diseased groups. |

### Note

The input argument `formula.healthy` is similar to that used for the `glm` function, except that flexible specifications can be added by means of the function `f()`. For instance, the specification `y ~ x1 + f(x2, K = 3)` would assume a linear effect of `x1`, while the effect of `x2` would be modeled using B-spline basis functions. The argument `K = 3` indicates that `3` internal knots will be used, with the quantiles of `x2` used for their location. Categorical variables (factors) can also be incorporated, as well as factor-by-curve interaction terms. For example, to include the interaction between `age` and `gender` we need to specify `y ~ gender + f(age, by = gender, K = 3)`.
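
As a rough illustration of what such a specification involves, the sketch below inspects the two example formulas and builds a cubic B-spline basis with 3 internal knots placed at quantiles of the covariate. It relies on `splines::bs` rather than on the package's internal code, and the equally spaced quantile levels (0.25, 0.5, 0.75) and the cubic degree are assumptions; the simulated `x2` is made up.

```r
library(splines)

# Formulas are not evaluated when created, so f() need not exist here
fml1 <- y ~ x1 + f(x2, K = 3)
fml2 <- y ~ gender + f(age, by = gender, K = 3)

attr(terms(fml1), "term.labels")  # the linear term and the f() term
all.vars(fml2)                    # variables that must be present in `data`

# Quantile-based internal knots for a made-up covariate
set.seed(1)
x2 <- runif(200, min = 20, max = 80)
K <- 3
probs <- seq(0, 1, length.out = K + 2)[-c(1, K + 2)]  # assumed quantile levels
knots <- quantile(x2, probs = probs)

# Cubic B-spline basis without intercept: K + 3 columns
B <- bs(x2, knots = knots, degree = 3)
dim(B)
```

Each column of `B` is one basis function evaluated at the observed `x2` values; a larger `K` adds columns and so allows a more flexible effect.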

### References

Inacio de Carvalho, V., and Rodriguez-Alvarez, M. X. (2018). Bayesian nonparametric inference for the covariate-adjusted ROC curve. arXiv preprint arXiv:1806.00473.

Rubin, D. B. (1981). The Bayesian bootstrap. The Annals of Statistics, 9(1), 130-134.

### See Also

`AROC.bnp`, `AROC.bsp`, `AROC.sp`, `AROC.kernel`, `pooledROC.BB` or `pooledROC.emp`.

### Examples

```
library(AROC)
data(psa)

# Keep only the last measurement of each subject
newpsa <- psa[!duplicated(psa$id, fromLast = TRUE), ]

# Log-transform the biomarker
newpsa$l_marker1 <- log(newpsa$marker1)

# Fit the AROC with a B-spline term for age (K = 0: no internal knots)
m0 <- AROC.bnp(formula.healthy = l_marker1 ~ f(age, K = 0),
               group = "status", tag.healthy = 0, data = newpsa, scale = TRUE,
               p = seq(0, 1, length.out = 101),
               compute.lpml = TRUE, compute.WAIC = TRUE,
               a = 2, b = 0.5, L = 10, nsim = 5000, nburn = 1000)

summary(m0)
plot(m0)
```

Package *AROC* version 1.0-4