OTC2 {binGroup2}

R Documentation

Find the optimal testing configuration for group testing algorithms that use a multiplex assay for two diseases

Description

Find the optimal testing configuration (OTC) using non-informative and informative hierarchical and array-based group testing algorithms. Multiplex assays for two diseases are used at each stage of the algorithms.

Usage

OTC2(
  algorithm,
  p.vec = NULL,
  probabilities = NULL,
  alpha = NULL,
  Se,
  Sp,
  ordering = matrix(data = c(0, 1, 0, 1, 0, 0, 1, 1), nrow = 4, ncol = 2),
  group.sz,
  trace = TRUE,
  print.time = TRUE,
  ...
)

Arguments

`algorithm`	character string defining the group testing algorithm to be used. Non-informative testing options include two-stage hierarchical ("`D2`"), three-stage hierarchical ("`D3`"), square array testing without master pooling ("`A2`"), and square array testing with master pooling ("`A2M`"). Informative testing options include two-stage hierarchical ("`ID2`") and three-stage hierarchical ("`ID3`") testing.
`p.vec`	vector of overall joint probabilities. The joint probabilities are assumed to be equal for all individuals in the algorithm (non-informative testing only). There are four joint probabilities to consider: `p_{00}`, the probability that an individual tests negative for both diseases; `p_{10}`, the probability that an individual tests positive only for the first disease; `p_{01}`, the probability that an individual tests positive only for the second disease; and `p_{11}`, the probability that an individual tests positive for both diseases. The joint probabilities must sum to 1. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`probabilities`	matrix of joint probabilities for each individual, where rows correspond to the four joint probabilities and columns correspond to each individual in the algorithm. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`alpha`	vector containing positive shape parameters of the Dirichlet distribution (for informative testing only). The vector will be used to generate a heterogeneous matrix of joint probabilities for each individual. The vector must have length 4. Further details are given under 'Details'. Only one of `p.vec`, `probabilities`, or `alpha` should be specified.
`Se`	matrix of sensitivity values, where one value is given for each disease (or infection) at each stage of testing. The rows of the matrix correspond to each disease `k=1,2`, and the columns of the matrix correspond to each stage of testing `s=1,...,S`. If a vector of 2 values is provided, the sensitivity values associated with disease `k` are assumed to be equal to the `k`th value in the vector for all stages of testing. Further details are given under 'Details'.
`Sp`	matrix of specificity values, where one value is given for each disease (or infection) at each stage of testing. The rows of the matrix correspond to each disease `k=1,2`, and the columns of the matrix correspond to each stage of testing `s=1,...,S`. If a vector of 2 values is provided, the specificity values associated with disease `k` are assumed to be equal to the `k`th value in the vector for all stages of testing. Further details are given under 'Details'.
`ordering`	matrix detailing the ordering for the binary responses of the diseases. The columns of the matrix correspond to each disease and the rows of the matrix correspond to each of the 4 sets of binary responses for two diseases. This ordering is used with the joint probabilities. The default ordering is (`p_{00}`, `p_{10}`, `p_{01}`, `p_{11}`).
`group.sz`	single group size or range of group sizes for which to calculate operating characteristics and/or find the OTC. The details of group size specification are given under 'Details'.
`trace`	a logical value indicating whether the progress of calculations should be printed for each initial group size provided by the user. The default is `TRUE`.
`print.time`	a logical value indicating whether the length of time for calculations should be printed. The default is `TRUE`.
`...`	additional arguments to be passed to functions for hierarchical testing with multiplex assays for two diseases.
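
As a minimal sketch of how `p.vec` lines up with the default `ordering` matrix, the base R snippet below labels each joint probability by its pair of binary responses and checks that the vector sums to 1. The names `labels` and `marginal` are illustrative only and are not part of binGroup2; the probabilities are the ones used under 'Examples'.

```r
# Default ordering: rows are (0,0), (1,0), (0,1), (1,1);
# columns are disease 1 and disease 2.
ordering <- matrix(data = c(0, 1, 0, 1, 0, 0, 1, 1), nrow = 4, ncol = 2)

# Illustrative labels built from each row of the ordering matrix.
labels <- paste0("p_", ordering[, 1], ordering[, 2])

# Joint probabilities, given in the same order as the rows of `ordering`.
p.vec <- c(0.90, 0.04, 0.04, 0.02)
names(p.vec) <- labels

# The joint probabilities must sum to 1 (compare with a tolerance, not `==`).
stopifnot(isTRUE(all.equal(sum(p.vec), 1)))

# Marginal prevalence of each disease implied by the joint probabilities.
marginal <- c(disease1 = sum(p.vec[ordering[, 1] == 1]),
              disease2 = sum(p.vec[ordering[, 2] == 1]))
print(p.vec)
print(marginal)
```

If a non-default `ordering` is supplied, the elements of `p.vec` would need to be permuted to match its rows.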

Details

This function finds the OTC for standard group testing algorithms with a multiplex assay that tests for two diseases and computes the associated operating characteristics. Calculations for hierarchical group testing algorithms are performed as described in Bilder et al. (2019) and calculations for array-based group testing algorithms are performed as described in Hou et al. (2020).

Available algorithms include two- and three-stage hierarchical testing and array testing with and without master pooling. Both non-informative and informative group testing settings are allowed for hierarchical algorithms. Only non-informative group testing settings are allowed for array testing algorithms. Operating characteristics calculated are expected number of tests, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for each individual.

For informative algorithms where the alpha argument is specified, a heterogeneous matrix of joint probabilities for each individual is generated using the Dirichlet distribution. This is done using rBeta2009::rdirichlet and requires the user to set a seed to reproduce results. See Bilder et al. (2019) for additional details on the use of the Dirichlet distribution for this purpose.
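
The following is a minimal sketch of what a heterogeneous matrix of joint probabilities looks like. To stay dependency-free it uses a hypothetical helper, `rdirichlet_base`, that normalizes independent gamma draws (a standard way to sample from the Dirichlet distribution) instead of calling rBeta2009::rdirichlet, so the draws will not match those generated through the alpha argument for the same seed. The ordering step mirrors the one shown under 'Examples'.

```r
# Hypothetical helper: draw n Dirichlet vectors by normalizing independent
# gamma variates, using base R only.
rdirichlet_base <- function(n, shape) {
  k <- length(shape)
  # Column j holds n gamma draws with shape parameter shape[j].
  g <- matrix(rgamma(n * k, shape = rep(shape, each = n)), nrow = n, ncol = k)
  # Divide each row by its sum so that every row is a probability vector.
  g / rowSums(g)
}

set.seed(8791)  # a seed is needed to reproduce the draws
alpha <- c(18.25, 0.75, 0.75, 0.25)

# Transpose so that rows are the four joint probabilities and
# columns are the individuals.
p.unordered <- t(rdirichlet_base(n = 8, shape = alpha))

# Order individuals by increasing probability of having at least one disease.
p.ordered <- p.unordered[, order(1 - p.unordered[1, ])]
print(round(p.ordered, 4))
```

A matrix built this way has the layout expected by the probabilities argument: four rows and one column per individual, with each column summing to 1.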

The sensitivity/specificity values are allowed to vary across stages of testing. For hierarchical testing, a different sensitivity/specificity value may be used for each stage of testing. For array testing, a different sensitivity/specificity value may be used for master pool testing (if included), row/column testing, and individual testing. The values must be specified in the order of the testing performed. For example, values are specified as (stage 1, stage 2, stage 3) for three-stage hierarchical testing or (master pool testing, row/column testing, individual testing) for array testing with master pooling. A vector of 2 sensitivity/specificity values may be specified, and sensitivity/specificity values for all stages of testing are assumed to be equal. The first value in the vector will be used at each stage of testing for the first disease, and the second value in the vector will be used at each stage of testing for the second disease.
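
The vector shorthand described above can be pictured with a short base R sketch. The helper `expand_accuracy` is hypothetical and is not the code used inside OTC2; it only illustrates how a vector of 2 values corresponds to a matrix with one row per disease and one column per stage.

```r
# Hypothetical helper: expand a length-2 vector (one value per disease)
# into a 2 x n.stages matrix with the same value at every stage.
expand_accuracy <- function(x, n.stages) {
  if (is.matrix(x)) {
    # A matrix is assumed to already be in (disease x stage) form.
    stopifnot(nrow(x) == 2, ncol(x) == n.stages)
    return(x)
  }
  stopifnot(length(x) == 2)
  matrix(data = rep(x, times = n.stages), nrow = 2, ncol = n.stages,
         dimnames = list(Infection = 1:2, Stage = 1:n.stages))
}

# Three-stage hierarchical testing: columns are (stage 1, stage 2, stage 3).
Se3 <- expand_accuracy(c(0.95, 0.90), n.stages = 3)
print(Se3)
```

Row 1 holds the value for the first disease at every stage and row 2 holds the value for the second disease, which matches the matrix layout used under 'Examples'.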

The value(s) specified by group.sz represent the initial (stage 1) group size for hierarchical testing and the row/column size for array testing. If a single value is provided for group.sz with two-stage hierarchical or array testing, operating characteristics will be calculated and no optimization will be performed. If a single value is provided for group.sz with three-stage hierarchical testing, the OTC will be found over all possible configurations with this initial group size. If a range of group sizes is specified, the OTC will be found over all group sizes.
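
Because group.sz is a row/column size for array testing, the number of individuals in each candidate array grows with the square of the value supplied. A small base R sketch of that relationship follows; the data frame `array.sizes` is illustrative only, with column names borrowed from the `Array.dim` and `Array.sz` elements described under 'Value'.

```r
# Candidate row/column sizes, as would be passed via group.sz for "A2" or "A2M".
group.sz <- 2:10

# Each square array contains group.sz^2 individuals.
array.sizes <- data.frame(Array.dim = group.sz, Array.sz = group.sz^2)
print(array.sizes)
```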

In addition to the OTC, operating characteristics for some of the other configurations corresponding to each initial group size provided by the user are displayed. For algorithms where there is only one configuration for each initial group size (non-informative two-stage hierarchical and all array testing algorithms), results for each initial group size are provided. For algorithms where there is more than one possible configuration for each initial group size (informative two-stage hierarchical and all three-stage hierarchical algorithms), two sets of configurations are provided: 1) the best configuration for each initial group size, and 2) the top 10 configurations for each initial group size provided by the user. If a single value is provided for group.sz with array testing or non-informative two-stage hierarchical testing, operating characteristics will not be provided for configurations other than that specified by the user. Results are sorted by the value of the objective function per individual, value.

The displayed overall pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value are weighted averages of the corresponding individual accuracy measures for all individuals within the initial group (or block) for a hierarchical algorithm, or within the entire array for an array-based algorithm. Expressions for these averages are provided in the Supplementary Material for Hitt et al. (2019). These expressions are based on accuracy definitions given by Altman and Bland (1994a, 1994b). Individual accuracy measures can be calculated using the operatingCharacteristics2 (opChar2) function.

Value

A list containing:

`algorithm`	the group testing algorithm used for calculations.
`prob.vec`	the vector of joint probabilities provided by the user, if applicable (for non-informative algorithms only).
`joint.p`	the matrix of joint probabilities for each individual provided by the user, if applicable.
`alpha.vec`	the alpha vector provided by the user, if applicable (for informative algorithms only).
`Se`	the matrix of sensitivity values for each disease at each stage of testing.
`Sp`	the matrix of specificity values for each disease at each stage of testing.
`opt.ET`	a list containing:
	`OTC`	a list specifying elements of the optimal testing configuration, which may include:
		`Stage1`	group size for the first stage of hierarchical testing, if applicable.
		`Stage2`	group sizes for the second stage of hierarchical testing, if applicable.
		`Block.sz`	the block size/initial group size for informative Dorfman testing, which is not tested.
		`pool.szs`	group sizes for the first stage of testing for informative Dorfman testing.
		`Array.dim`	the row/column size for array testing.
		`Array.sz`	the overall array size for array testing (the square of the row/column size).
	`p.mat`	the matrix of joint probabilities for each individual in the algorithm. Each row corresponds to one of the four joint probabilities. Each column corresponds to an individual in the testing algorithm.
	`ET`	the expected testing expenditure for the OTC.
	`value`	the value of the expected number of tests per individual.
	`Accuracy`	the matrix of overall accuracy measures for the algorithm. The rows correspond to each disease. The columns correspond to the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for the overall algorithm. Further details are given under 'Details'.
`Configs`	a data frame containing results for the best configuration for each initial group size provided by the user. The columns correspond to the initial group size, configuration (if applicable), overall array size (if applicable), expected number of tests, value of the objective function per individual, and accuracy measures for each disease. Accuracy measures include the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed if a single `group.sz` is provided. Further details are given under 'Details'.
`Top.Configs`	a data frame containing results for some of the top configurations for each initial group size provided by the user. The columns correspond to the initial group size, configuration, expected number of tests, value of the objective function per individual, and accuracy measures for each disease. Accuracy measures include the pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed for non-informative two-stage hierarchical testing or for array testing algorithms. Further details are given under 'Details'.
`group.sz`	initial group (or block) sizes examined to find the OTC.
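
A hedged sketch of how the returned list might be inspected is given below. It assumes the binGroup2 package is installed (the call is skipped otherwise) and relies only on the element names documented above; the object name `res` is arbitrary, and the inputs repeat the first call under 'Examples' with progress printing turned off.

```r
# Run only if binGroup2 is available, so the snippet is safe to source anywhere.
if (requireNamespace("binGroup2", quietly = TRUE)) {
  Se <- matrix(data = c(0.95, 0.95, 0.99, 0.99), nrow = 2, ncol = 2,
               dimnames = list(Infection = 1:2, Stage = 1:2))
  Sp <- matrix(data = c(0.96, 0.96, 0.98, 0.98), nrow = 2, ncol = 2,
               dimnames = list(Infection = 1:2, Stage = 1:2))
  res <- binGroup2::OTC2(algorithm = "D2", p.vec = c(0.90, 0.04, 0.04, 0.02),
                         Se = Se, Sp = Sp, group.sz = 2:10,
                         trace = FALSE, print.time = FALSE)

  str(res$opt.ET$OTC)        # elements of the optimal testing configuration
  print(res$opt.ET$value)    # expected number of tests per individual
  print(res$opt.ET$Accuracy) # overall accuracy measures, one row per disease
  print(res$Configs)         # best configuration for each initial group size
}
```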

Note

This function returns the pooling positive and negative predictive values for all individuals even though these measures are diagnostic-specific; e.g., the pooling positive predictive value should only be considered for those individuals who have tested positive.

Additionally, only stage-dependent sensitivity and specificity values are allowed within the program (group-within-stage-dependent values are not allowed). See Bilder et al. (2019) for additional information.

Author(s)

This function was written by Brianna D. Hitt. It calls ET.all.stages.new and PSePSpAllStages, which were originally written by Christopher Bilder for Bilder et al. (2019), and ARRAY, which was originally written by Peijie Hou for Hou et al. (2020). The functions ET.all.stages.new, PSePSpAllStages, and ARRAY were obtained from http://chrisbilder.com/grouptesting/. Minor modifications were made to the functions for inclusion in the binGroup2 package.

References

Altman, D., Bland, J. (1994a). “Diagnostic tests 1: Sensitivity and specificity.” BMJ, 308, 1552.

Altman, D., Bland, J. (1994b). “Diagnostic tests 2: Predictive values.” BMJ, 309, 102.

Bilder, C., Tebbs, J., McMahan, C. (2019). “Informative group testing for multiplex assays.” Biometrics, 75, 278–288.

Hitt, B., Bilder, C., Tebbs, J., McMahan, C. (2019). “The objective function controversy for group testing: Much ado about nothing?” Statistics in Medicine, 38, 4912–4923.

Hou, P., Tebbs, J., Wang, D., McMahan, C., Bilder, C. (2020). “Array testing with multiplex assays.” Biostatistics, 21, 417–431.

McMahan, C., Tebbs, J., Bilder, C. (2012). “Informative Dorfman Screening.” Biometrics, 68, 287–296.

Examples


# Find the OTC for non-informative two-stage
#   hierarchical (Dorfman) testing
Se <- matrix(data = c(0.95, 0.95, 0.99, 0.99), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = c(0.96, 0.96, 0.98, 0.98), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
OTC2(algorithm = "D2", p.vec = c(0.90, 0.04, 0.04, 0.02),
     Se = Se, Sp = Sp, group.sz = 2:10)

# Find the OTC over all possible testing configurations
#   for informative two-stage hierarchical (Dorfman)
#   testing over a range of initial group (block) sizes.
# A matrix of joint probabilities for each individual is
#   generated using the Dirichlet distribution.
Se <- matrix(data = rep(0.95, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = rep(0.99, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
set.seed(1002)
OTC2(algorithm = "ID2", alpha = c(18.25, 0.75, 0.75, 0.25),
     Se = Se, Sp = Sp, group.sz = 18:22)

# Find the OTC for non-informative three-stage
#   hierarchical testing.
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
OTC2(algorithm = "D3", p.vec = c(0.91, 0.04, 0.04, 0.01),
     Se = Se, Sp = Sp, group.sz = 3:12)

# Find the OTC over all possible configurations
#   for informative three-stage hierarchical
#   testing with a specified group size
#   and a heterogeneous matrix of joint
#   probabilities for each individual.
set.seed(8791)
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
p.unordered <- t(rBeta2009::rdirichlet(n = 8,
                            shape = c(18.25, 0.75, 0.75, 0.25)))
p.ordered <- p.unordered[, order(1 - p.unordered[1,])]
OTC2(algorithm = "ID3", probabilities = p.ordered,
     Se = Se, Sp = Sp, group.sz = 8,
     trace = FALSE, print.time = FALSE)

# Find the OTC for non-informative array testing
#   without master pooling.
Se <- matrix(data = rep(0.95, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
Sp <- matrix(data = rep(0.99, 4), nrow = 2, ncol = 2,
             dimnames = list(Infection = 1:2, Stage = 1:2))
OTC2(algorithm = "A2", p.vec = c(0.90, 0.04, 0.04, 0.02),
     Se = Se, Sp = Sp, group.sz = 2:10)

# Find the OTC for non-informative array testing
#   with master pooling.
Se <- matrix(data = rep(0.95, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
Sp <- matrix(data = rep(0.99, 6), nrow = 2, ncol = 3,
             dimnames = list(Infection = 1:2, Stage = 1:3))
OTC2(algorithm = "A2M", p.vec = c(0.90, 0.04, 0.04, 0.02),
     Se = Se, Sp = Sp, group.sz = 10,
     trace = FALSE, print.time = FALSE)

[Package binGroup2 version 1.3.1 Index]