R: Calculate AUC when gold standard is continuous with large...

cgAUC {cgAUC}

R Documentation

Calculate AUC when gold standard is continuous with large variables.

Description

The cgAUC can calculate the AUC-type measure of Obuchowski(2006) when gold standard is continuous, and find the optimal linear combination of variables with respect to this measure.

Usage

cgAUC(x, z, h, delta = 1, auto = FALSE, tau = 1, scale = 1)

Arguments

`x`	The potential variables. It is a matrix with column of values of a variables. It should be standardized in this application.
`z`	The gold standard variable. It should be standardized.
`h`	The parameter controls the window width of smoothing function.
`delta`	The parameter be used in TGDM. The default value is one.
`auto`	Find the optimal delta in TGDN using cross-validation. If the auto is TRUE. The default is FALSE.
`tau`	The parameter used in TGDM. The default value is one.
`scale`	Scaling data when scale = 1, no scaling data when scale = 0. The default value is 1.

Details

In this package, we use the TGDM to find the optimal linear combination of variables in order to maximize the AUC-type measure. Before using this function, all of variables, including gold standard variable, should be standardized first. Below are parameters used in the algorithm:

Value

`Rev`	When Rev = 0 means l * 1; otherwise, l * -1.
`l`	The estimate of coefficients for the optimal linear combination of variables.
`theta.sh.h.p`	The estimate of the theta of Chang(2012) for the optimal linear combination of variables.
`theta.sh.h.p.var`	The estimate of variance for the theta of Chang(2012).
`cntin.ri`	The estimate of the theta of Chang(2012) for each single vaiable.
`theta.h.p`	The estimate of the theta of Obuchowski(2006) for the optimal linear combination of variables.
`theta.h.p.var`	The estimate of variance for the theta of Obuchowski(2006).
`dscrt.ri`	The estimate of the theta of Obuchowski(2006) for each single vaiable.
`delta`	The value of delta.

Author(s)

Yu-chia Chang

References

Chang, YCI. Maximizing an ROC type measure via linear combination of markers when the gold reference is continuous. Statistics in Medicine 2012.
Obuchowski NA. An ROC-type measure of diagnostic accuracy when the gold standard is continuous-scale. Statistics in Medicine 2006; 25:481–493.
Obuchowski N. Estimating and comparing diagnostic tests accuracy when the gold standard is not binary. Statistics in Medicine 2005; 20:3261–3278.
Friedman JH, Popescu BE. Gradient directed regularization for linear regression and classification. Technical Report, Department of Statistics, Stanford University, 2004.

Examples

##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

# n = 100; p = 5;
# r.x = matrix(rnorm(n * p), , p) # raw data
# r.z = r.x[ ,1] + rnorm(n) # gold standard
# x = scale(r.x) # standardized of raw data
# z = scale(r.z) # standardized of gold standard
# h = n^(-1 / 2)
# t1 = cgAUC(r.x, r.z, h, delta = 1, auto = FALSE, tau = 1, scale = 1) # the delta be constant
# t1
# t2 = cgAUC(r.x, r.z, h, delta = 1, auto = TRUE, tau = 1, scale = 1) # the delta be variable
# t2

## The function is currently defined as
function (x, z, h, delta = 1, auto = FALSE, tau = 1)
{
	x = scale(x)
	z = scale(z)
	conv = FALSE
	n = dim(x)[1]
	p = dim(x)[2]
	cntin.ri = dscrt.ri = rep(0, p)
	id = diag(p)
	for (i in 1:p) {
		dscrt.ri[i] = dscrt(x, z, id[i, ])$theta.h.p
		cntin.ri[i] = cntin(x, z, id[i, ], h)$theta.sh.h.p
	}
	beta.i = ifelse(cntin.ri > 0.5, 1, -1)
	dscrt.ri = ifelse(dscrt.ri > 0.5, dscrt.ri, (1 - dscrt.ri))
	cntin.ri = ifelse(cntin.ri > 0.5, cntin.ri, (1 - cntin.ri))
	y = x * matrix(beta.i, n, p, byrow = TRUE)
	max.x = which(cntin.ri == max(cntin.ri))
	theta.sh.h.p = 0
	l = id[max.x, ]
	while (conv == FALSE) {
		d.l = d.theta.sh.h.p(y, z, l, h)
		max.d.l = max(d.l)
		ind.d.l = ifelse(d.l >= (tau * max.d.l), 1, 0) * d.l
		if (auto == TRUE) {
			delta = optimal.delta(y, z, l, h, ind.d.l)
		}
		l = l + delta * ind.d.l
		l = l/max(l)
		theta.temp = cntin(y, z, l, h)$theta.sh.h.p
		ifelse(abs(theta.temp - theta.sh.h.p) < 1e-04, conv <- TRUE, conv <- FALSE)
		theta.sh.h.p = theta.temp
	}
	optimal.dscrt = dscrt(y, z, l)
	theta.sh.h.p.var = cntin(y, z, l, h)$var
	l = l * beta.i
	return(list(l = l, theta.sh.h.p = theta.sh.h.p, theta.sh.h.p.var = theta.sh.h.p.var,
				cntin.ri = cntin.ri, theta.h.p = optimal.dscrt$theta.h.p,
				theta.h.p.var = optimal.dscrt$var, dscrt.ri = dscrt.ri,
				delta = delta))
}
## The function is currently defined as
function (x, z, h, delta = 1, auto = FALSE, tau = 1) 
{
    x = scale(x)
    z = scale(z)
    conv = FALSE
    n = dim(x)[1]
    p = dim(x)[2]
    cntin.ri = dscrt.ri = rep(0, p)
    id = diag(p)
    for (i in 1:p) {
        dscrt.ri[i] = dscrt(x, z, id[i, ])$theta.h.p
        cntin.ri[i] = cntin(x, z, id[i, ], h)$theta.sh.h.p
    }
    beta.i = ifelse(cntin.ri > 0.5, 1, -1)
    dscrt.ri = ifelse(dscrt.ri > 0.5, dscrt.ri, (1 - dscrt.ri))
    cntin.ri = ifelse(cntin.ri > 0.5, cntin.ri, (1 - cntin.ri))
    y = x * matrix(beta.i, n, p, byrow = TRUE)
    max.x = which(cntin.ri == max(cntin.ri))
    theta.sh.h.p = 0
    l = id[max.x, ]
    while (conv == FALSE) {
        d.l = d.theta.sh.h.p(y, z, l, h)
        max.d.l = max(d.l)
        ind.d.l = ifelse(d.l >= (tau * max.d.l), 1, 0) * d.l
        if (auto == TRUE) {
            delta = optimal.delta(y, z, l, h, ind.d.l)
        }
        l = l + delta * ind.d.l
        l = l/max(l)
        theta.temp = cntin(y, z, l, h)$theta.sh.h.p
        ifelse(abs(theta.temp - theta.sh.h.p) < 1e-04, conv <- TRUE, 
            conv <- FALSE)
        theta.sh.h.p = theta.temp
    }
    optimal.dscrt = dscrt(y, z, l)
    theta.sh.h.p.var = cntin(y, z, l, h)$var
    l = l * beta.i
    return(list(l = l, theta.sh.h.p = theta.sh.h.p, theta.sh.h.p.var = theta.sh.h.p.var, 
        cntin.ri = cntin.ri, theta.h.p = optimal.dscrt$theta.h.p, 
        theta.h.p.var = optimal.dscrt$var, dscrt.ri = dscrt.ri, 
        delta = delta))
  }

[Package cgAUC version 1.2.1 Index]