R: Average Run Length for Bernoulli CUSUM

bernoulli_ARL {success}

R Documentation

Average Run Length for Bernoulli CUSUM

Description

This function allows to estimate the Average Run Length (ARL) of the risk-adjusted Bernoulli CUSUM (see bernoulli_cusum()) through a Markov Chain Approach (Brook & Evans(1972) & Steiner et al. (2000)) or exploiting the relationship with the Sequential Probability Ratio Test (Kemp (1971)). The function requires the specification of one of the following combinations of parameters as arguments to the function:

glmmod & theta
p0 & theta
p0 & p1

Average run length of lower-sided Bernoulli CUSUM charts can be determined by specifying theta < 0.

Usage

bernoulli_ARL(h, n_grid, glmmod, theta, theta_true, p0, p1, method = c("MC",
  "SPRT"), smooth_prob = FALSE)

Arguments

`h`	Control limit for the Bernoulli CUSUM
`n_grid`	Number of state spaces used to discretize the outcome space (when `method = "MC"`) or number of grid points used for trapezoidal integration (when `method = "SPRT"`). Increasing this number improves accuracy, but can also significantly increase computation time.
`glmmod`	Generalized linear regression model used for risk-adjustment as produced by the function `glm()`. Suggested: `glm(as.formula("(survtime <= followup) & (censorid == 1) ~ covariates"), data = data)`. Alternatively, a list containing the following elements: `formula`: a `formula()` in the form `~ covariates`; `coefficients`: a named vector specifying risk adjustment coefficients for covariates. Names must be the same as in `formula` and colnames of `data`.
`theta`	The `\theta` value used to specify the odds ratio `e^\theta` under the alternative hypothesis. If `\theta >= 0`, the average run length for the upper one-sided Bernoulli CUSUM will be determined. If `\theta < 0`, the average run length for the lower one-sided CUSUM will be determined. Note that `p_1 = \frac{p_0 e^\theta}{1-p_0 +p_0 e^\theta}.`
`theta_true`	The true log odds ratio `\theta`, describing the true increase in failure rate from the null-hypothesis. Default = log(1), indicating no increase in failure rate.
`p0`	The baseline failure probability at `entrytime + followup` for individuals.
`p1`	The alternative hypothesis failure probability at `entrytime + followup` for individuals.
`method`	The method used to obtain the average run length. Either "MC" for Markov Chain or "SPRT" for SPRT methodology. Default = "MC".
`smooth_prob`	Should the probability distribution of failure under the null distribution be smoothed? Useful for small samples. Can only be TRUE when `glmmod` is supplied. Default = FALSE.

Details

The average run length of a CUSUM chart S_n is given by E[\tau_n], where \tau_n is defined as:

\tau_n = \inf\{n \geq 0: S_n \geq h\}.

When method = "MC", the average run length will be determined by the Markov Chain approach described in Brook & Evans (1972), using the risk-adjustment correction proposed in Steiner et al. (2000). The idea is to discretize the domain (0, h) into n_{grid} -1 state spaces, with E_0 of width w/2 and E_1, \ldots, E_{n_{grid}-1} of width w, such that E_{n_{grid}} is an absorbing state. This is done using the following steps:

w is determined using the relationship \frac{2h}{2t-1}.
Transition probabilities between the states are determined and 'transition matrix' R is constructed.
The equation (\bm{I}-\bm{R}) \bm{ARL} = \bm{1} is solved to find the ARL starting from each of the states.

When method = "SPRT", the average run length will be determined by the relationship between the SPRT and CUSUM described in Kemp (1971), using the risk-adjustment correction proposed in Steiner et al. (2000). If N is the run length of a SPRT, P(0) the probability of a SPRT terminating on the lower boundary of zero and R the run length of a CUSUM, then:

E[R] = \frac{E[N]}{1 - P(0)}.

E[N] and P(0) are completely determined by

G_n(z) = \int_0^h F(z-w) dG_{n-1}(w)

with F(x) the cdf of the singletons W_n. The integral can be approximated using the generalized trapezoidal quadrature rule:

G_n(z) = \sum_{i=0}^{n_{grid}-1} \frac{F(z-x_{i+1}) + F(z-x_{i})}{2} \left(G_{n-1}(x_{i+1}) - G_{n-1}(x_{i}) \right)

Value

A list containing:

ARL_0: A numeric value indicating the average run length in number of outcomes when starting from state E_0.
ARL: A data.frame containing the average run length (#outcomes) depending on the state in which the process starts (E_0, E_1, \ldots, E_{n_{grid}-1})

start_val:
Starting value of the CUSUM, corresponding to the discretized state spaces E_{i};

#outcomes:
ARL for the CUSUM with initial value start_val;
R: A transition probability matrix containing the transition probabilities between states E_0, \ldots, E_{t-1}. R_{i,j} is the transition probability from state i to state j.
h: Value of the control limit.

The value of ARL_0 will be printed to the console.

Author(s)

Daniel Gomon

References

Brook, D., & Evans, D. A. (1972). An Approach to the Probability Distribution of CUSUM Run Length. Biometrika, 59(3), 539-549. doi:10.2307/2334805

Steiner, S. H., Cook, R. J., Farewell, V. T., & Treasure, T. (2000). Monitoring surgical performance using risk-adjusted cumulative sum charts. Biostatistics, 1(4), 441-452. doi:10.1093/biostatistics/1.4.441

Kemp, K. W. (1971). Formal Expressions which Can Be Applied to Cusum Charts. Journal of the Royal Statistical Society. Series B (Methodological), 33(3), 331-360. doi:10.1111/j.2517-6161.1971.tb01521.x

Examples

#Determine a risk-adjustment model using a generalized linear model.
#Outcome (failure within 100 days) is regressed on the available covariates:
glmmodber <- glm((survtime <= 100) & (censorid == 1)~ age + sex + BMI,
                  data = surgerydat, family = binomial(link = "logit"))
#Determine the Average Run Length in number of outcomes for
#control limit h = 2.5 with (0, h) divided into n_grid = 200 segments
ARL <- bernoulli_ARL(h = 2.5, n_grid = 200, glmmod = glmmodber, theta = log(2))
#Calculate ARL, but now exploiting connection between SPRT and CUSUM:
#n_grid now decides the accuracy of the Trapezoidal rule for integral approximation
ARLSPRT <- bernoulli_ARL(h = 2.5, n_grid = 200, glmmod = glmmodber,
theta = log(2), method = "SPRT")

[Package success version 1.1.0 Index]