gtreg {binGroup} | R Documentation |

## Fitting Group Testing Models

### Description

`gtreg` is a function to fit the group testing regression model specified through a symbolic description of the linear predictor and descriptions of the group testing setting.

### Usage

```
gtreg(formula, data, groupn, retest = NULL, sens = 1,
spec = 1, linkf = c("logit", "probit", "cloglog"),
method = c("Vansteelandt", "Xie"), sens.ind = NULL,
spec.ind = NULL, start = NULL, control = gt.control(...), ...)
gtreg.fit(Y, X, groupn, sens, spec, linkf, start=NULL)
EM(Y, X, groupn, sens, spec, linkf, start = NULL,
control = gt.control())
EM.ret(Y, X, groupn, ret, sens, spec, linkf, sens.ind,
spec.ind, start = NULL, control = gt.control())
```

### Arguments

`formula` |
an object of class |

`data` |
an optional data frame, list or environment (or object coercible by |

`groupn` |
a vector, list or data frame of the group numbers that designates individuals to groups. |

`retest` |
a vector, list or data frame of individual retest results for Dorfman's retesting procedure. Default value is `NULL`. |

`sens` |
sensitivity of the test, set to be 1 by default. |

`spec` |
specificity of the test, set to be 1 by default. |

`sens.ind` |
sensitivity of the individual retests, set to be equal to |

`spec.ind` |
specificity of the individual retests, set to be equal to |

`linkf` |
a character string specifying one of the three link functions for a binomial model: `"logit"` (default), `"probit"` or `"cloglog"`. |

`method` |
The method to fit the model, must be one of `"Vansteelandt"` (default) or `"Xie"`. |

`start` |
starting values for the parameters in the linear predictor. |

`control` |
a list of parameters for controlling the fitting process in method |

`Y` |
For |

`X` |
For |

`ret` |
For |

`...` |
arguments to be passed by default to |

### Details

A typical predictor has the form `groupresp ~ covariates`, where `groupresp` is the (numeric) group response vector and `covariates` is a series of terms which specifies a linear predictor for individual responses. Note that it is the unobserved individual responses, not the observed group responses, that are modeled by the covariates. In `groupresp`, a 0 denotes a negative response and a 1 denotes a positive response, and the probability of an individual positive response is modeled directly. A terms specification of the form `first + second` indicates all the terms in `first` together with all the terms in `second`, with duplicates removed. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on; to avoid this, pass a `terms` object as the formula.
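The link between the modeled individual probabilities and the observed group responses can be written out explicitly. The expression below is a sketch under the usual group testing assumptions (individual responses are independent, and a group is truly positive whenever at least one member is). The notation is introduced here for illustration and is not taken from the package: $p_i$ is the probability that individual $i$ is positive, $g$ is the link function, $G_k$ is the set of individuals in group $k$, $Z_k$ is the observed response of group $k$, and $S_e$ and $S_p$ stand for `sens` and `spec`.

$$
p_i = g^{-1}(x_i^\top \beta), \qquad
P(Z_k = 1) = S_e - (S_e + S_p - 1) \prod_{i \in G_k} (1 - p_i)
$$

With a perfect test ($S_e = S_p = 1$) this reduces to one minus the probability that every member of the group is negative.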

A specification of the form `first:second` indicates the set of terms obtained by taking the interactions of all terms in `first` with all terms in `second`. The specification `first*second` indicates the cross of `first` and `second`. This is the same as `first + second + first:second`.
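The formula operators described above follow the usual R conventions, so their effect can be inspected with base R alone. The short sketch below uses placeholder variable names (`y`, `a`, `b`) and only `terms()` from the `stats` package; it does not call `gtreg`.

```r
# Placeholder formulas; y, a and b are illustrative names only.
f.cross <- y ~ a * b          # shorthand for a + b + a:b
f.mixed <- y ~ a:b + b + a    # interaction written before the main effects

# a * b expands to both main effects plus their interaction.
labels.cross <- attr(terms(f.cross), "term.labels")
print(labels.cross)

# By default, main effects are moved ahead of the interaction.
labels.mixed <- attr(terms(f.mixed), "term.labels")
print(labels.mixed)

# keep.order = TRUE builds a terms object that keeps the written order.
labels.kept <- attr(terms(f.mixed, keep.order = TRUE), "term.labels")
print(labels.kept)
```

A terms object built with `keep.order = TRUE` is the kind of object the paragraph above suggests passing in place of a plain formula when the written order matters.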

Three workhorse functions, `gtreg.fit`, `EM` and `EM.ret`, are called by `gtreg` to carry out the model fitting; the first corresponds to Vansteelandt's method and the last two correspond to Xie's method. The `gtreg.fit` function uses the `optim` function with default method `"Nelder-Mead"` to maximize the likelihood function of the observed group responses. If this optimization method produces a Hessian matrix of all zero elements, the `"SANN"` method in `optim` is employed to find the coefficients and Hessian matrix. For the `"SANN"` method, the number of iterations in `optim` is set to 10000.
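To make the idea behind the likelihood-based fit concrete, the following is a minimal base R sketch of maximizing a group-response likelihood with `optim` and `"Nelder-Mead"`. It is not the code of `gtreg.fit`: the simulated data, the logit link, the starting values and all object names (`negloglik`, `nm.fit`, and so on) are assumptions made for illustration, and the `"SANN"` fallback is not reproduced.

```r
# Simulate individuals, assign them to groups of 5, and test the groups
# with an imperfect assay (all values here are illustrative).
set.seed(1)
n <- 600
group.size <- 5
x <- rnorm(n)
X <- cbind(1, x)
groupn <- rep(seq_len(n / group.size), each = group.size)
sens <- 0.95
spec <- 0.95

y.ind <- rbinom(n, 1, plogis(-2 + x))        # unobserved individual statuses
true.group <- tapply(y.ind, groupn, max)     # group is positive if any member is
z <- rbinom(length(true.group), 1,
            ifelse(true.group == 1, sens, 1 - spec))  # observed group responses

# Negative log-likelihood of the observed group responses (logit link).
negloglik <- function(beta) {
  p <- plogis(drop(X %*% beta))                    # individual probabilities
  prod.neg <- tapply(1 - p, groupn, prod)          # P(all members negative)
  p.group <- sens - (sens + spec - 1) * prod.neg   # P(group tests positive)
  -sum(z * log(p.group) + (1 - z) * log(1 - p.group))
}

nm.fit <- optim(par = c(0, 0), fn = negloglik,
                method = "Nelder-Mead", hessian = TRUE)
print(nm.fit$par)       # estimated intercept and slope
print(nm.fit$hessian)   # Hessian of the negative log-likelihood at the optimum
```

Because the objective is the negative log-likelihood, the returned Hessian plays the role of an observed information matrix, which matches how the `hessian` component is described in the Value section.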

The `EM` and `EM.ret` functions apply Xie's EM algorithm to the likelihood function written in terms of the unobserved individual responses; the functions use `glm.fit` to update the parameter estimates within each M step. The `EM` function is used when there are no retests and `EM.ret` is used when individual retests are available. Thus, within `retest`, individual observations in observed positive groups are 0 (negative) or 1 (positive); the remaining individual observations are `NA`s, meaning that no retest is performed for them. Retests cannot be used with Vansteelandt's method; a warning message will be given in this case, and the individual retests will be ignored in the model fitting. There could be slight differences in the estimates between Vansteelandt's and Xie's methods (when retests are not available) due to different convergence criteria.
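The EM idea for the no-retest case can be sketched in base R as follows. This is an illustration in the spirit of the description above, not the package's `EM` function: the simulated data, the logit link, the convergence rule, the iteration cap and all object names are assumptions. The E step replaces each unobserved individual response with its conditional expectation given the group result, and the M step refits a binomial-type regression with `glm.fit`. The `quasibinomial()` family is used here only so that the fractional responses do not trigger a warning; it gives the same coefficient updates as `binomial()`.

```r
# Simulated pooled data (illustrative values only).
set.seed(2)
n <- 600
group.size <- 5
x <- rnorm(n)
X <- cbind(1, x)
groupn <- rep(seq_len(n / group.size), each = group.size)
sens <- 0.95
spec <- 0.95

y.ind <- rbinom(n, 1, plogis(-2 + x))        # unobserved individual statuses
true.group <- tapply(y.ind, groupn, max)
z <- rbinom(length(true.group), 1,
            ifelse(true.group == 1, sens, 1 - spec))
z.ind <- z[groupn]                           # group result repeated per member

em.beta <- c(0, 0)                           # assumed starting values
for (iter in seq_len(500)) {
  p <- plogis(drop(X %*% em.beta))
  prod.neg <- as.numeric(tapply(1 - p, groupn, prod))[groupn]
  p.group <- sens - (sens + spec - 1) * prod.neg

  # E step: expected individual response given the observed group response.
  w <- ifelse(z.ind == 1,
              sens * p / p.group,
              (1 - sens) * p / (1 - p.group))

  # M step: regression of the expected responses on the covariates.
  beta.new <- unname(glm.fit(X, w, family = quasibinomial())$coefficients)

  converged <- max(abs(beta.new - em.beta)) < 1e-6
  em.beta <- beta.new
  if (converged) break
}
print(em.beta)
```

The stopping rule used here (largest absolute change in the coefficients) is one plausible choice; a different rule would stop at slightly different values, which is the kind of small discrepancy between methods that the paragraph above mentions.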

The data used here should be in the form of simple pooling, meaning that each individual appears in exactly one pool. When only the group responses are observed, the null degrees of freedom are the number of groups minus 1 and the residual degrees of freedom are the number of groups minus the number of parameters. When individual retests are observed too, what the degrees of freedom and the deviance for the null model should be is an open research question; therefore the degrees of freedom and `null.deviance` will not be displayed.

For the background on the use of `optim`, see `help(optim)`.

### Value

`gtreg` returns an object of class `"gt"`. See later in this section.
The function `summary` (i.e., `summary.gt`) can be used to obtain or print a summary of the results.
The group testing functions `predict` (i.e., `predict.gt`) and `residuals` (i.e., `residuals.gt`) can be used to extract various useful features of the value returned by `gtreg`.
An object of class `"gt"` is a list containing at least the following components:

`coefficients` |
a named vector of coefficients |

`hessian` |
estimated Hessian matrix of the negative log likelihood function, serves as an estimate of the information matrix |

`residuals` |
the response residuals, difference of the observed group responses and the fitted group responses. |

`fitted.values` |
the fitted mean values of group responses. |

`deviance` |
the deviance between the fitted model and the saturated model. |

`aic` |
Akaike's An Information Criterion, minus twice the maximized log-likelihood plus twice the number of coefficients |

`null.deviance` |
The deviance for the null model, comparable with |

`counts` |
For Vansteelandt's method: the number of iterations in |

`df.residual` |
the residual degrees of freedom. |

`df.null` |
the residual degrees of freedom for the null model. |

`z` |
the vector of group responses. |

`call` |
the matched call. |

`formula` |
the formula supplied. |

`terms` |
the terms object used. |

`method` |
the method (`"Vansteelandt"` or `"Xie"`) used. |

`link` |
the link function used in the model. |
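As a hedged illustration of working with the components listed above, the sketch below fits a model to simulated data and reads a few components directly from the returned list. It assumes the binGroup package is installed (the code is skipped otherwise) and reuses the `sim.gt` call from the Examples; the object names `gt.fit` and `se` are chosen here for illustration. Inverting `hessian` to obtain approximate standard errors relies on its description as an estimate of the information matrix.

```r
if (requireNamespace("binGroup", quietly = TRUE)) {
  library(binGroup)

  set.seed(46)
  gt.data <- sim.gt(par = c(-12, 0.2), sample.size = 700, group.size = 5)
  gt.fit <- gtreg(formula = gres ~ x, data = gt.data, groupn = groupn)

  # Coefficients with approximate standard errors from the inverse Hessian.
  se <- sqrt(diag(solve(gt.fit$hessian)))
  print(cbind(estimate = gt.fit$coefficients, std.error = se))

  # Goodness-of-fit components.
  print(c(deviance = gt.fit$deviance, aic = gt.fit$aic))

  # Fitted group-level means and the corresponding response residuals.
  print(head(gt.fit$fitted.values))
  print(head(gt.fit$residuals))
}
```

In everyday use, `summary(gt.fit)` is the more convenient route to the same information.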

### Author(s)

Boan Zhang

### References

*Xie, M. (2001)*, Regression analysis of group testing samples, *Statistics in Medicine, 20, 1957-1969.*

*Vansteelandt, S., Goetghebeur, E., and Verstraeten, T. (2000)*, Regression models for disease prevalence with diagnostic tests on pools of serum samples, *Biometrics, 56, 1126-1133.*

### See Also

`summary.gt`, `predict.gt` and `residuals.gt` for `gt` methods.
`gtreg.mp` for the group testing regression model in the matrix pooling setting.

### Examples

```
library(binGroup)

## HIV surveillance data: Xie's EM method with an imperfect group test
data(hivsurv)
fit1 <- gtreg(formula = groupres ~ AGE + EDUC., data = hivsurv,
              groupn = gnum, sens = 0.9, spec = 0.9, method = "Xie")
fit1

## --- Continuing the Example from '?sim.gt':
## Simulated data, perfect test, default method
set.seed(46)
gt.data <- sim.gt(par = c(-12, 0.2), sample.size = 700, group.size = 5)
fit2 <- gtreg(formula = gres ~ x, data = gt.data, groupn = groupn)
fit2

## Simulated data with individual retests; trace = TRUE is passed on
## through ... to gt.control()
set.seed(21)
gt.data <- sim.gt(par = c(-12, 0.2), sample.size = 700, group.size = 6,
                  sens = 0.95, spec = 0.95, sens.ind = 0.98, spec.ind = 0.98)
fit1 <- gtreg(formula = gres ~ x, data = gt.data, groupn = groupn,
              retest = retest, method = "X", sens = 0.95, spec = 0.95,
              sens.ind = 0.98, spec.ind = 0.98, trace = TRUE)
summary(fit1)
```

*binGroup* version 2.2-1