brglmFit {brglm2} R Documentation

## Fitting function for `glm` for reduced-bias estimation and inference

### Description

`brglmFit` is a fitting method for `glm` that fits generalized linear models using implicit and explicit bias reduction methods (Kosmidis, 2014), and other penalized maximum likelihood methods. Currently supported methods include the mean bias-reducing adjusted scores approach in Firth (1993) and Kosmidis & Firth (2009), the median bias-reduction adjusted scores approach in Kenne Pagui et al. (2017), the correction of the asymptotic bias in Cordeiro & McCullagh (1991), the mixed bias-reduction adjusted scores approach in Kosmidis et al (2020), maximum penalized likelihood with powers of the Jeffreys prior as penalty, and maximum likelihood. Estimation is performed using a quasi Fisher scoring iteration (see ```vignette("iteration", "brglm2")```), which, in the case of mean-bias reduction, resembles an iterative correction of the asymptotic bias of the Fisher scoring iterates.

### Usage

```brglmFit(
x,
y,
weights = rep(1, nobs),
start = NULL,
etastart = NULL,
mustart = NULL,
offset = rep(0, nobs),
family = gaussian(),
control = list(),
intercept = TRUE,
fixed_totals = NULL,
singular.ok = TRUE
)

brglm_fit(
x,
y,
weights = rep(1, nobs),
start = NULL,
etastart = NULL,
mustart = NULL,
offset = rep(0, nobs),
family = gaussian(),
control = list(),
intercept = TRUE,
fixed_totals = NULL,
singular.ok = TRUE
)
```

### Arguments

 `x` `x` is a design matrix of dimension `n * p`. `y` `y` is a vector of observations of length `n`. `weights` an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector. `start` starting values for the parameters in the linear predictor. If `NULL` (default) then the maximum likelihood estimates are calculated and used as starting values. `etastart` applied only when start is not `NULL`. Starting values for the linear predictor to be passed to `glm.fit` when computing starting values using maximum likelihood. `mustart` applied only when start is not `NULL`. Starting values for the vector of means to be passed to `glm.fit` when computing starting values using maximum likelihood. `offset` this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`. `family` a description of the error distribution and link function to be used in the model. For `glm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `glm.fit` only the third option is supported. (See `family` for details of family functions.) `control` a list of parameters controlling the fitting process. See `brglmControl` for details. `intercept` logical. Should an intercept be included in the null model? `fixed_totals` effective only when `family` is `poisson`. Either `NULL` (no effect) or a vector that indicates which counts must be treated as a group. See Details for more information and `brmultinom`. `singular.ok` logical. If `FALSE`, a singular model is an error.

### Details

A detailed description of the supported adjustments and the quasi Fisher scoring iteration is given in the iteration vignette (see, `vignette("iteration", "brglm2")` or Kosmidis et al, 2020). A shorter description of the quasi Fisher scoring iteration is also given in one of the vignettes of the *enrichwith* R package (see, https://cran.r-project.org/package=enrichwith/vignettes/bias.html). Kosmidis and Firth (2010) describe a parallel quasi Newton-Raphson iteration with the same stationary point.

In the special case of generalized linear models for binomial, Poisson and multinomial responses, the adjusted score equations approach returns estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation in multinomial regression). See, Kosmidis and Firth (2020) for a proof for binomial-response GLMs with Jeffreys-pior penalties to the log-likelihood, which is equivalent to mean bias reduction for logistic regression. See, also, `detect_separation` and `check_infinite_estimates` for pre-fit and post-fit methods for the detection of infinite estimates in binomial response generalized linear models.

The type of score adjustment to be used is specified through the `type` argument (see `brglmControl` for details). The available options are

• `type = "AS_mixed"`: the mixed bias-reducing score adjustments in Kosmidis et al (2020) that result in mean bias reduction for the regression parameters and median bias reduction for the dispersion parameter, if any; default.

• `type = "AS_mean"`: the mean bias-reducing score adjustments in Firth, 1993 and Kosmidis & Firth, 2009. `type = "AS_mixed"` and `type = "AS_mean"` will return the same results when `family` is `binomial` or `poisson`, i.e. when the dispersion is fixed)

• `type = "AS_median"`: the median-bias reducing score adjustments in Kenne Pagui et al. (2017)

• `type = "MPL_Jeffreys"`: maximum penalized likelihood with powers of the Jeffreys prior as penalty.

• `type = "ML"`: maximum likelihood

• `type = "correction"`: asymptotic bias correction, as in Cordeiro & McCullagh (1991).

The null deviance is evaluated based on the fitted values using the method specified by the `type` argument (see `brglmControl`).

The `family` argument of the current version of `brglmFit` can accept any combination of `family` objects and link functions, including families with user-specified link functions, `mis` links, and `power` links, but excluding `quasi`, `quasipoisson` and `quasibinomial` families.

The description of `method` argument and the ```Fitting functions``` section in `glm` gives information on supplying fitting methods to `glm`.

`fixed_totals` to specify groups of observations for which the sum of the means of a Poisson model will be held fixed to the observed count for each group. This argument is used internally in `brmultinom` and `bracl` for baseline-category logit models and adjacent category logit models, respectively.

`brglm_fit` is an alias to `brglmFit`.

### Author(s)

Ioannis Kosmidis [aut, cre] ioannis.kosmidis@warwick.ac.uk, Euloge Clovis Kenne Pagui [ctb] kenne@stat.unipd.it

### References

Kosmidis I, Firth D (2020). Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models. *Biometrika* doi: 10.1093/biomet/asaa052

Kosmidis I, Kenne Pagui E C, Sartori N (2020). Mean and median bias reduction in generalized linear models. *Statistics and Computing*, **30**, 43-59 doi: 10.1007/s11222-019-09860-6

Cordeiro G M, McCullagh P (1991). Bias correction in generalized linear models. *Journal of the Royal Statistical Society. Series B (Methodological)*, **53**, 629-643 doi: 10.1111/j.2517-6161.1991.tb01852.x

Firth D (1993). Bias reduction of maximum likelihood estimates. *Biometrika*. **80**, 27-38 doi: 10.2307/2336755

Kenne Pagui E C, Salvan A, Sartori N (2017). Median bias reduction of maximum likelihood estimates. *Biometrika*, **104**, 923–938 doi: 10.1093/biomet/asx046

Kosmidis I, Firth D (2009). Bias reduction in exponential family nonlinear models. *Biometrika*, **96**, 793-804 doi: 10.1093/biomet/asp055

Kosmidis I, Firth D (2010). A generic algorithm for reducing bias in parametric estimation. *Electronic Journal of Statistics*, **4**, 1097-1112 doi: 10.1214/10-EJS579

Kosmidis I (2014). Bias in parametric estimation: reduction and useful side-effects. *WIRE Computational Statistics*, **6**, 185-196 doi: 10.1002/wics.1296

### See Also

`brglmControl`, `glm.fit`, `glm`

### Examples

```## The lizards example from ?brglm::brglm
data("lizards")
# Fit the model using maximum likelihood
lizardsML <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "glm.fit")
# Mean bias-reduced fit:
lizardsBR_mean <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit")
# Median bias-reduced fit:
lizardsBR_median <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "AS_median")
summary(lizardsML)
summary(lizardsBR_median)
summary(lizardsBR_mean)

# Maximum penalized likelihood with Jeffreys prior penatly
lizards_Jeffreys <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "MPL_Jeffreys")
# lizards_Jeffreys is the same fit as lizardsBR_mean (see Firth, 1993)
all.equal(coef(lizardsBR_mean), coef(lizards_Jeffreys))

# Maximum penalized likelihood with powers of the Jeffreys prior as
# penalty. See Kosmidis & Firth (2020) for the finiteness and
# shrinkage properties of the maximum penalized likelihood
# estimators in binomial response models

a <- seq(0, 20, 0.5)
coefs <- sapply(a, function(a) {
out <- glm(cbind(grahami, opalinus) ~ height + diameter +
light + time, family = binomial(logit), data = lizards,
method = "brglmFit", type = "MPL_Jeffreys", a = a)
coef(out)
})
# Illustration of shrinkage as a grows
matplot(a, t(coefs), type = "l", col = 1, lty = 1)
abline(0, 0, col = "grey")

## Another example from
## King, Gary, James E. Alt, Nancy Elizabeth Burns and Michael Laver
## (1990).  "A Unified Model of Cabinet Dissolution in Parliamentary
## Democracies", _American Journal of Political Science_, **34**, 846-870

data("coalition", package = "brglm2")
# The maximum likelihood fit with log link
coalitionML <- glm(duration ~ fract + numst2, family = Gamma, data = coalition)
# The mean bias-reduced fit
coalitionBR_mean <- update(coalitionML, method = "brglmFit")
# The bias-corrected fit
coalitionBC <- update(coalitionML, method = "brglmFit", type = "correction")
# The median bias-corrected fit
coalitionBR_median <- update(coalitionML, method = "brglmFit", type = "AS_median")

## An example with offsets from Venables & Ripley (2002, p.189)
data("anorexia", package = "MASS")

anorexML <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
family = gaussian, data = anorexia)
anorexBC <- update(anorexML, method = "brglmFit", type = "correction")
anorexBR_mean <- update(anorexML, method = "brglmFit")
anorexBR_median <- update(anorexML, method = "brglmFit", type = "AS_median")

## All methods return the same estimates for the regression
## parameters because the maximum likelihood estimator is normally
## distributed around the `true` value under the model (hence, both
## mean and component-wise median unbiased). The Wald tests for
## anorexBC and anorexBR_mean differ from anorexML
## because the bias-reduced estimator of the dispersion is the
## unbiased, by degree of freedom adjustment (divide by n - p),
## estimator of the residual variance. The Wald tests from
## anorexBR_median are based on the median bias-reduced estimator
## of the dispersion that results from a different adjustment of the
## degrees of freedom (divide by n - p - 2/3)
summary(anorexML)
summary(anorexBC)
summary(anorexBR_mean)
summary(anorexBR_median)

## endometrial data from Heinze & Schemper (2002) (see ?endometrial)
data("endometrial", package = "brglm2")
endometrialML <- glm(HG ~ NV + PI + EH, data = endometrial,
family = binomial("probit"))
endometrialBR_mean <- update(endometrialML, method = "brglmFit",
type = "AS_mean")
endometrialBC <- update(endometrialML, method = "brglmFit",
type = "correction")
endometrialBR_median <- update(endometrialML, method = "brglmFit",
type = "AS_median")
summary(endometrialML)
summary(endometrialBC)
summary(endometrialBR_mean)
summary(endometrialBR_median)

```

[Package brglm2 version 0.7.1 Index]