checkboxcox {rocbc}R Documentation

Tests whether Box-Cox is appropriate for the given dataset (one-marker version)

Description

This function tests whether the Box-Cox transformation is able to achieve approximate normality for your data. That is, it will allow the user to know whether it is appropriate to use all the methods discussed later on in this package.

Usage

checkboxcox(marker, D, plots, printShapiro = FALSE)

Arguments

marker

A vector of length n that contains the biomarker scores of all individuals.

D

A vector of length n that contains the true disease status of an individual. It is a binary vector containing 0 for the healthy/control individuals and 1 for the diseased individuals.

plots

Valid inputs are "on" and "off". When set to "on", the user gets the histograms of the biomarker for both the healthy and the diseased group before and after the Box-Cox transformation. In addition, all four corresponding qq-plots are provided.

printShapiro

Boolean. When set to TRUE, the results of the Shapiro-Wilk test will be printed to the console. When set to FALSE, the results are suppressed. Default value is FALSE.

Value

res_shapiro

A results table that contains the results of four Shapiro-Wilk tests for normality testing. Two of these refer to normality testing of the healthy and the diseased groups before the Box-Cox transformation, and the remaining two refer to the Box-Cox transformed biomarkers scores for the healthy and the diseased groups. Thus, this testing process produces four p-values. In addition, if the plots are set to 'on', then the output provides (1) the histograms of the biomarker for both the healthy and the diseased groups before and after the Box-Cox transformation, (2) all four corresponding qq-plots, and (3) a plot with the empirical ROC curve overplotted with the Box-Cox based ROC curve for visual comparison purposes.

transformation.parameter

The single transformation parameter, lambda, that is applied for both groups simultaneously.

transx

The Box-Cox transformed scores for the healthy.

transy

The Box-Cox transformed scores for the diseased.

pval_x

The p-value of the Shapiro Wilk test of normality for the healthy group (before the Box-Cox transformation).

pval_y

The p-value of the Shapiro Wilk test of normality for the diseased group (before the Box-Cox transformation).

pval_transx

The p-value of the Shapiro Wilk test of normality for the healthy group (after the Box-Cox transformation).

pval_transy

The p-value of the Shapiro Wilk test of normality for the diseased group (after the Box-Cox transformation).

roc

A function of the estimated Box-Cox ROC curve. You can use this to simply request TPR values for given FPR values.

Author(s)

Leonidas Bantis

References

Bantis LE, Nakas CT, Reiser B. (2021). Statistical inference for the difference between two maximized Youden indices obtained from correlated biomarkers. Biometrical Journal, 63(6):1241-1253. https://doi.org/10.1002/bimj.202000128

Bantis LE, Nakas CT, Reiser B. (2018). Construction of confidence intervals for the maximum of the Youden index and the corresponding cutoff point of a continuous biomarker. Biometrical Journal, 61(1):138-156. https://doi.org/10.1002/bimj.201700107

Bantis LE, Feng Z. (2016). Comparison of two correlated ROC curves at a given specificity or sensitivity level. Statistics in Medicine, 35(24):4352-4367. https://doi.org/10.1002/sim.7008

Bantis LE, Nakas CT, Reiser B. (2014). Construction of confidence regions in the ROC space after the estimation of the optimal Youden index-based cut-off point. Biometrics, 70(1):212-223. https://doi.org/10.1111/biom.12107

Box GEP, Cox DR. (1964). An Analysis of Transformations. Journal of the Royal Statistical Society. 26(2):211-252. https://www.jstor.org/stable/2984418

Examples

set.seed(123)
x <- rgamma(100, shape=2, rate = 8) # generates biomarker data from a gamma
                                 # distribution for the healthy group.
y <- rgamma(100, shape=2, rate = 4) # generates biomarker data from a gamma
                                 # distribution for the diseased group.
scores <- c(x,y)
D=c(pracma::zeros(1,100), pracma::ones(1,100))
out=checkboxcox(marker=scores, D, plots="on")
summary(out)

[Package rocbc version 3.1.0 Index]