OTC1 {binGroup2}  R Documentation 
Find the optimal testing configuration (OTC) using noninformative and informative hierarchical and array-based group testing algorithms. Single-disease assays are used at each stage of the algorithms.
OTC1( algorithm, p = NULL, probabilities = NULL, Se = 0.99, Sp = 0.99, group.sz, obj.fn = "ET", weights = NULL, alpha = 2, trace = TRUE, print.time = TRUE, ... )
algorithm 
character string defining the group testing algorithm to be used. Noninformative testing options include two-stage hierarchical ("D2"), three-stage hierarchical ("D3"), square array testing without master pooling ("A2"), and square array testing with master pooling ("A2M"). Informative testing options include two-stage hierarchical ("ID2"), three-stage hierarchical ("ID3"), and square array testing without master pooling ("IA2"). 
p 
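overall probability of disease that will be used to generate a vector/matrix of individual probabilities. For noninformative algorithms, a homogeneous set of probabilities will be used. For informative algorithms, the expectOrderBeta function will be used to generate a heterogeneous set of probabilities. Further details are given under 'Details'. 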
probabilities 
a vector of individual probabilities, which is homogeneous for noninformative testing algorithms and heterogeneous for informative testing algorithms. Either p or probabilities should be specified, but not both. 
Se 
a vector of sensitivity values, where one value is given for each stage of testing (in order). If a single value is provided, sensitivity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'. 
Sp 
a vector of specificity values, where one value is given for each stage of testing (in order). If a single value is provided, specificity values are assumed to be equal to this value for all stages of testing. Further details are given under 'Details'. 
group.sz 
a single group size or range of group sizes for which to calculate operating characteristics and/or find the OTC. The details of group size specification are given under 'Details'. 
obj.fn 
a list of objective functions which are minimized to find the OTC. The expected number of tests per individual, "ET", will always be calculated. Additional options include "MAR" (the expected number of tests divided by the expected number of correct classifications, described in Malinovsky et al. (2016)), and "GR" (a linear combination of the expected number of tests, the number of misclassified negatives, and the number of misclassified positives, described in Graff & Roeloffs (1972)). See Hitt et al. (2019) for additional details. The first objective function specified in this list will be used to determine the results for the top configurations. Further details are given under 'Details'. 
weights 
a matrix of up to six sets of weights for the GR function. Each set of weights is specified by a row of the matrix. 
alpha 
a shape parameter for the beta distribution that specifies the degree of heterogeneity for the generated probability vector (for informative testing only). 
trace 
a logical value indicating whether the progress of calculations should be printed for each initial group size provided by the user. The default is TRUE. 
print.time 
a logical value indicating whether the length of time for calculations should be printed. The default is TRUE. 
... 
arguments to be passed to the expectOrderBeta function (namely num.sim). Further details are given under 'Details'. 
This function finds the OTC for group testing algorithms with an assay that tests for one disease and computes the associated operating characteristics, as described in Hitt et al. (2019).
Available algorithms include two- and three-stage hierarchical testing and array testing with and without master pooling. Both noninformative and informative group testing settings are allowed for each algorithm, except informative array testing with master pooling is unavailable because this method has not appeared in the group testing literature. Operating characteristics calculated are expected number of tests, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value for each individual.
For informative algorithms where the p argument is specified, the expected values of order statistics from a beta distribution are found. These values are used to represent disease risk probabilities for each individual to be tested. The beta distribution has two parameters: a mean parameter p (overall disease prevalence) and a shape parameter alpha (heterogeneity level). Depending on the specified p, alpha, and overall group size, simulation may be necessary to generate the vector of individual probabilities. This is done using expectOrderBeta and requires the user to set a seed to reproduce results.
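The simulation route described above can be sketched in base R. This is an illustration only, not the expectOrderBeta implementation. The helper name expected_order_beta_sim is hypothetical. The second beta shape parameter, alpha * (1 - p) / p, is an assumption derived from requiring the distribution's mean to equal p.

```r
# Hypothetical helper: approximate the expected order statistics of a
# beta distribution by simulation (not the binGroup2 implementation).
expected_order_beta_sim <- function(p, alpha, size, num.sim = 10000) {
  # Assumed parameterization: Beta(alpha, beta) with mean alpha / (alpha + beta) = p
  beta <- alpha * (1 - p) / p
  # One simulated group of `size` individuals per row
  draws <- matrix(rbeta(size * num.sim, shape1 = alpha, shape2 = beta),
                  nrow = num.sim, ncol = size)
  # Sort within each simulated group, then average each rank across groups
  sorted <- t(apply(draws, 1, sort))
  colMeans(sorted)
}

set.seed(52613)  # a seed is needed to reproduce simulated values
probs <- expected_order_beta_sim(p = 0.01, alpha = 0.5, size = 50, num.sim = 2000)
```

The resulting vector is ordered from lowest to highest risk, and its average is close to the overall prevalence p.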
Informative two-stage hierarchical (Dorfman) testing is implemented via the pool-specific optimal Dorfman (PSOD) method described in McMahan et al. (2012a), where the greedy algorithm proposed for PSOD is replaced by considering all possible testing configurations. Informative array testing is implemented via the gradient method (the most efficient array design), where higher-risk individuals are grouped in the leftmost columns of the array. For additional details on the gradient arrangement method for informative array testing, see McMahan et al. (2012b).
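As a rough illustration of placing higher-risk individuals in the leftmost columns, the sketch below sorts a probability vector in decreasing order and fills a square matrix column by column. The helper name gradient_arrange is hypothetical, and the exact within-column ordering used by the package is not confirmed here; see McMahan et al. (2012b) for the method itself.

```r
# Hypothetical helper: arrange individual risk probabilities in a square
# array so that the highest risks occupy the leftmost columns.
gradient_arrange <- function(probabilities, size) {
  stopifnot(length(probabilities) == size^2)
  # matrix() fills column by column, so sorted values flow left to right
  matrix(sort(probabilities, decreasing = TRUE), nrow = size, ncol = size)
}

set.seed(1002)
risk <- runif(9, min = 0.01, max = 0.10)  # made-up risks for a 3 x 3 array
arr <- gradient_arrange(risk, size = 3)
```

Every probability in a given column is at least as large as every probability in the columns to its right.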
The sensitivity/specificity values are allowed to vary across stages of testing. For hierarchical testing, a different sensitivity/specificity value may be used for each stage of testing. For array testing, a different sensitivity/specificity value may be used for master pool testing (if included), row/column testing, and individual testing. The values must be specified in order of the testing performed. For example, values are specified as (stage 1, stage 2, stage 3) for three-stage hierarchical testing or (master pool testing, row/column testing, individual testing) for array testing with master pooling. A single sensitivity/specificity value may be specified instead. In this situation, sensitivity/specificity values for all stages are assumed to be equal.
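The ordering and recycling rule for Se and Sp can be pictured with a small base R sketch. The helper expand_accuracy and the stage labels are hypothetical, introduced only for illustration; OTC1 handles this internally.

```r
# Hypothetical helper: expand a single accuracy value to one value per
# stage, or check that a vector has exactly one value per stage.
expand_accuracy <- function(x, stage.names) {
  if (length(x) == 1) x <- rep(x, length(stage.names))
  stopifnot(length(x) == length(stage.names))
  setNames(x, stage.names)
}

# Stages for array testing with master pooling, in order of testing
a2m_stages <- c("master pool", "row/column", "individual")

Se_single <- expand_accuracy(0.95, a2m_stages)                 # same value at every stage
Se_vector <- expand_accuracy(c(0.99, 0.95, 0.90), a2m_stages)  # stage-specific values
```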
The value(s) specified by group.sz represent the initial (stage 1) group size for hierarchical testing and the row/column size for array testing. For informative two-stage hierarchical testing, the group.sz specified represents the block size used in the pool-specific optimal Dorfman (PSOD) method, where the initial group (block) is not tested. For more details on informative two-stage hierarchical testing implemented via the PSOD method, see Hitt et al. (2019) and McMahan et al. (2012a).
If a single value is provided for group.sz with array testing or noninformative two-stage hierarchical testing, operating characteristics will be calculated and no optimization will be performed. If a single value is provided for group.sz with three-stage hierarchical or informative two-stage hierarchical, the OTC will be found over all possible configurations. If a range of group sizes is specified, the OTC will be found over all group sizes.
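To make the search over a range of group sizes concrete, the following base R sketch evaluates the standard expected number of tests per individual for noninformative two-stage (Dorfman) testing and picks the minimizing group size. The function dorfman_et is a hypothetical helper, not part of binGroup2, and it assumes test outcomes are independent given true status.

```r
# Hypothetical helper: expected tests per individual for noninformative
# two-stage (Dorfman) testing with an imperfect stage 1 assay.
dorfman_et <- function(p, group.sz, Se = 0.99, Sp = 0.99) {
  prob_all_negative <- (1 - p)^group.sz
  # The group tests positive either correctly (Se) or falsely (1 - Sp)
  prob_group_positive <- Se * (1 - prob_all_negative) +
    (1 - Sp) * prob_all_negative
  # One group test shared by the group, plus retests if the group is positive
  1 / group.sz + prob_group_positive
}

sizes <- 2:100
et <- dorfman_et(p = 0.05, group.sz = sizes)
best <- sizes[which.min(et)]  # group size minimizing expected tests per individual
```

Only the stage 1 sensitivity and specificity enter this quantity, because the number of retests depends solely on the outcome of the group test.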
In addition to the OTC, operating characteristics for some of the other configurations corresponding to each initial group size provided by the user will be displayed. These additional configurations are only determined for whichever objective function ("ET", "MAR", or "GR") is specified first in the function call. If "GR" is the objective function listed first, the first set of corresponding weights will be used. For algorithms where there is only one configuration for each initial group size (noninformative two-stage hierarchical and all array testing algorithms), results for each initial group size are provided. For algorithms where there is more than one possible configuration for each initial group size (informative two-stage hierarchical and all three-stage hierarchical algorithms), two sets of configurations are provided: 1) the best configuration for each initial group size, and 2) the top 10 configurations for each initial group size provided by the user. If a single value is provided for group.sz with array testing or noninformative two-stage hierarchical testing, operating characteristics will not be provided for configurations other than that specified by the user. Results are sorted by the value of the objective function per individual, value.
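The three objective functions can be compared on a single configuration with a short base R sketch for noninformative two-stage testing. The helper dorfman_objectives is hypothetical and is not the package's code. It assumes independent test outcomes given true status. It also reads "misclassified negatives" as truly negative individuals classified positive (weighted by D1) and "misclassified positives" as truly positive individuals classified negative (weighted by D2); this reading is an assumption.

```r
# Hypothetical helper: ET, MAR, and GR per individual for one
# noninformative two-stage (Dorfman) group of size I.
dorfman_objectives <- function(p, I, Se = 0.99, Sp = 0.99, D1 = 1, D2 = 1) {
  Se <- rep_len(Se, 2)  # (stage 1, stage 2)
  Sp <- rep_len(Sp, 2)
  # Expected number of tests for the whole group
  prob_group_pos <- Se[1] * (1 - (1 - p)^I) + (1 - Sp[1]) * (1 - p)^I
  ET <- 1 + I * prob_group_pos
  # Pooling sensitivity: both the group test and the retest must be positive
  PSe <- Se[1] * Se[2]
  # Probability the group tests positive given one member is truly negative
  prob_pos_given_neg <- Se[1] * (1 - (1 - p)^(I - 1)) +
    (1 - Sp[1]) * (1 - p)^(I - 1)
  PSp <- 1 - prob_pos_given_neg * (1 - Sp[2])
  correct   <- I * (PSe * p + PSp * (1 - p))  # expected correct classifications
  wrong_neg <- I * (1 - PSp) * (1 - p)        # negatives classified positive
  wrong_pos <- I * (1 - PSe) * p              # positives classified negative
  c(ET  = ET / I,
    MAR = ET / correct,
    GR  = (ET + D1 * wrong_neg + D2 * wrong_pos) / I)
}

res_perfect <- dorfman_objectives(p = 0.05, I = 5, Se = 1, Sp = 1)
res_weighted <- dorfman_objectives(p = 0.05, I = 5, Se = 0.95, Sp = 0.95,
                                   D1 = 10, D2 = 10)
```

With a perfect assay there are no misclassifications, so all three values coincide; with an imperfect assay, larger weights D1 and D2 raise GR relative to ET.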
The displayed overall pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value are weighted averages of the corresponding individual accuracy measures for all individuals within the initial group (or block) for a hierarchical algorithm, or within the entire array for an array-based algorithm. Expressions for these averages are provided in the Supplementary Material for Hitt et al. (2019). These expressions are based on accuracy definitions given by Altman and Bland (1994a, 1994b). Individual accuracy measures can be calculated using the operatingCharacteristics1 (opChar1) function.
The OTC1 function accepts additional arguments, namely num.sim, to be passed to the expectOrderBeta function, which generates a vector of probabilities for informative group testing algorithms. The num.sim argument specifies the number of simulations from the beta distribution when simulation is used. By default, 10,000 simulations are used.
A list containing:
algorithm 
the group testing algorithm used for calculations. 
prob 
the probability of disease or the vector of individual probabilities, as specified by the user. 
alpha 
level of heterogeneity for the generated probability vector (for informative testing only). 
Se 
the vector of sensitivity values for each stage of testing. 
Sp 
the vector of specificity values for each stage of testing. 
opt.ET, opt.MAR, opt.GR 
a list of results for each objective function specified by the user, containing:

Configs 
a data frame containing results for the best configuration for each initial group size provided by the user. The columns correspond to the initial group size, configuration (if applicable), overall array size (if applicable), expected number of tests, value of the objective function per individual, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed if a single group.sz is provided. Further details are given under 'Details'. 
Top.Configs 
a data frame containing results for some of the top configurations for each initial group size provided by the user. The columns correspond to the initial group size, configuration, expected number of tests, value of the objective function per individual, pooling sensitivity, pooling specificity, pooling positive predictive value, and pooling negative predictive value. No results are displayed for noninformative two-stage hierarchical testing or for array testing algorithms. Further details are given under 'Details'. 
This function returns the pooling positive and negative predictive values for all individuals even though these measures are diagnostic specific; e.g., the pooling positive predictive value should only be considered for those individuals who have tested positive.
Additionally, only stage dependent sensitivity and specificity values are allowed within the program (no group within stage dependent values are allowed). See Bilder et al. (2019) for additional information.
Brianna D. Hitt
Altman, D., Bland, J. (1994a). “Diagnostic tests 1: Sensitivity and specificity.” BMJ, 308, 1552.
Altman, D., Bland, J. (1994b). “Diagnostic tests 2: Predictive values.” BMJ, 309, 102.
Bilder, C., Tebbs, J., McMahan, C. (2019). “Informative group testing for multiplex assays.” Biometrics, 75, 278–288. doi: 10.1111/biom.12988, https://doi.org/10.1111/biom.12988.
Graff, L., Roeloffs, R. (1972). “Group testing in the presence of test error; an extension of the Dorfman procedure.” Technometrics, 14, 113–122. doi: 10.1080/00401706.1972.10488888, https://doi.org/10.1080/00401706.1972.10488888.
Hitt, B., Bilder, C., Tebbs, J., McMahan, C. (2019). “The objective function controversy for group testing: Much ado about nothing?” Statistics in Medicine, 38, 4912–4923. doi: 10.1002/sim.8341, https://doi.org/10.1002/sim.8341.
Malinovsky, Y., Albert, P., Roy, A. (2016). “Reader reaction: A note on the evaluation of group testing algorithms in the presence of misclassification.” Biometrics, 72, 299–302. doi: 10.1111/biom.12385, https://doi.org/10.1111/biom.12385.
McMahan, C., Tebbs, J., Bilder, C. (2012a). “Informative Dorfman Screening.” Biometrics, 68, 287–296. doi: 10.1111/j.1541-0420.2011.01644.x, https://doi.org/10.1111/j.1541-0420.2011.01644.x.
McMahan, C., Tebbs, J., Bilder, C. (2012b). “Two-Dimensional Informative Array Testing.” Biometrics, 68, 793–804. doi: 10.1111/j.1541-0420.2011.01726.x, https://doi.org/10.1111/j.1541-0420.2011.01726.x.
Other OTC functions:
OTC2()
# Estimated running time for all examples was calculated
# using a computer with 16 GB of RAM and one core of
# an Intel i7-6500U processor. Please take this into
# account when interpreting the run times given.

# Find the OTC for noninformative
# two-stage hierarchical (Dorfman) testing.
OTC1(algorithm = "D2", p = 0.05, Se = 0.99, Sp = 0.99,
     group.sz = 2:100, obj.fn = "ET",
     trace = TRUE, print.time = TRUE)

# Find the OTC for informative two-stage hierarchical
# (Dorfman) testing.
# A vector of individual probabilities is generated using
# the expected value of order statistics from a beta
# distribution with p = 0.01 and a heterogeneity level
# of alpha = 0.5.
# This example takes approximately 2.5 minutes to run.
set.seed(52613)
OTC1(algorithm = "ID2", p = 0.01, Se = 0.95, Sp = 0.95,
     group.sz = 50, obj.fn = c("ET", "MAR", "GR"),
     weights = matrix(data = c(1, 1, 10, 10, 0.5, 0.5),
                      nrow = 3, ncol = 2, byrow = TRUE),
     alpha = 0.5, trace = FALSE, print.time = TRUE,
     num.sim = 10000)

# Find the OTC over all possible testing configurations
# for noninformative three-stage hierarchical testing
# with a specified group size.
OTC1(algorithm = "D3", p = 0.001, Se = 0.95, Sp = 0.95,
     group.sz = 18, obj.fn = "ET",
     trace = FALSE, print.time = FALSE)

# Find the OTC for noninformative three-stage
# hierarchical testing.
# This example takes approximately 20 seconds to run.
OTC1(algorithm = "D3", p = 0.06, Se = 0.90, Sp = 0.90,
     group.sz = 3:30, obj.fn = c("ET", "MAR", "GR"),
     weights = matrix(data = c(1, 1, 10, 10, 100, 100),
                      nrow = 3, ncol = 2, byrow = TRUE))

# Find the OTC over all possible configurations
# for informative three-stage hierarchical testing
# with a specified group size and a heterogeneous
# vector of probabilities.
set.seed(1234)
OTC1(algorithm = "ID3",
     probabilities = c(0.012, 0.014, 0.011, 0.012, 0.010, 0.015),
     Se = 0.99, Sp = 0.99, group.sz = 6, obj.fn = "ET",
     alpha = 0.5, num.sim = 5000, trace = FALSE)

# Calculate the operating characteristics for
# noninformative array testing without master pooling
# with a specified array size.
OTC1(algorithm = "A2", p = 0.005, Se = 0.95, Sp = 0.95,
     group.sz = 8, obj.fn = "ET", trace = FALSE)

# Find the OTC for informative array testing without
# master pooling.
# A vector of individual probabilities is generated using
# the expected value of order statistics from a beta
# distribution with p = 0.03 and a heterogeneity level
# of alpha = 2. The probabilities are then arranged in
# a matrix using the gradient method.
# This example takes approximately 40 seconds to run.
set.seed(1002)
OTC1(algorithm = "IA2", p = 0.03, Se = 0.95, Sp = 0.95,
     group.sz = 2:20, obj.fn = c("ET", "MAR", "GR"),
     weights = matrix(data = c(1, 1, 10, 10, 100, 100),
                      nrow = 3, ncol = 2, byrow = TRUE),
     alpha = 2)

# Find the OTC for noninformative array testing
# with master pooling.
# This example takes approximately 25 seconds to run.
OTC1(algorithm = "A2M", p = 0.02, Se = 0.90, Sp = 0.90,
     group.sz = 2:20, obj.fn = "ET")