AFglm {AF} | R Documentation |

## Attributable fraction estimation based on a logistic regression model from a `glm` object (commonly used for cross-sectional or case-control sampling designs).

### Description

`AFglm` estimates the model-based adjusted attributable fraction for data from a logistic regression model in the form of a `glm` object. This model is commonly used for data from a cross-sectional or non-matched case-control sampling design.

### Usage

```
AFglm(object, data, exposure, clusterid, case.control = FALSE)
```

### Arguments

`object` |
a fitted logistic regression model object of class " |

`data` |
an optional data frame, list or environment (or object coercible by |

`exposure` |
the name of the exposure variable as a string. The exposure must be binary (0/1) where unexposed is coded as 0. |

`clusterid` |
the name of the cluster identifier variable as a string, if data are clustered. Cluster robust standard errors will be calculated. |

`case.control` |
can be set to |

### Details

`AFglm` estimates the attributable fraction for a binary outcome `Y` under the hypothetical scenario where a binary exposure `X` is eliminated from the population.
The estimate is adjusted for confounders `Z` by logistic regression using the `glm` function.
The estimation strategy is different for cross-sectional and case-control sampling designs even if the underlying logistic regression model is the same.
For cross-sectional sampling designs the AF can be defined as

`AF=1-\frac{Pr(Y_0=1)}{Pr(Y=1)}`

where `Pr(Y_0=1)` denotes the counterfactual probability of the outcome had the exposure been eliminated from the population and `Pr(Y = 1)` denotes the factual probability of the outcome.
If `Z` is sufficient for confounding control, then `Pr(Y_0=1)` can be expressed as

`E_Z\{Pr(Y=1\mid{X=0,Z})\}.`

The function uses logistic regression to estimate `Pr(Y=1\mid{X=0,Z})`, and the marginal sample distribution of `Z` to approximate the outer expectation (Sjölander and Vansteelandt, 2011).
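The standardization step described above can be sketched in base R. This is a minimal illustration of the point estimate only, on simulated data, and is not taken from the `AFglm` source: the names `dat`, `dat0`, `P_hat`, `P0_hat` and `AF_hat` are hypothetical, the factual probability is assumed to be estimated by the sample proportion of cases, and no variance is computed.

```r
# Simulate a cross-sectional sample (same data-generating model as in the Examples)
set.seed(1)
expit <- function(x) 1 / (1 + exp(-x))
n <- 2000
Z <- rnorm(n)
X <- rbinom(n, size = 1, prob = expit(Z))
Y <- rbinom(n, size = 1, prob = expit(Z + X))
dat <- data.frame(Y, X, Z)

# Logistic regression of the outcome on exposure, confounder and their interaction
fit <- glm(Y ~ X * Z, family = binomial, data = dat)

# Factual outcome probability: sample proportion of cases (assumption)
P_hat <- mean(dat$Y)

# Counterfactual probability: set X = 0 for everyone, predict Pr(Y = 1 | X = 0, Z),
# then average over the observed (marginal sample) distribution of Z
dat0 <- transform(dat, X = 0)
P0_hat <- mean(predict(fit, newdata = dat0, type = "response"))

# Attributable fraction: AF = 1 - Pr(Y_0 = 1) / Pr(Y = 1)
AF_hat <- 1 - P0_hat / P_hat
AF_hat
```

Averaging the predictions over all rows of `dat0` is what approximates the outer expectation over `Z`; the exposure column is the only thing changed between the factual and counterfactual data.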
For case-control sampling designs the outcome prevalence is fixed by the sampling design and absolute probabilities (`P.est` and `P0.est`) cannot be estimated.
Instead adjusted log odds ratios (`log.or`) are estimated for each individual.
This is done by setting `case.control` to `TRUE`. It is then assumed that the outcome is rare so that the risk ratio can be approximated by the odds ratio.
For case-control sampling designs the AF can be defined as (Bruzzi et al., 1985)

`AF = 1 - \frac{Pr(Y_0=1)}{Pr(Y = 1)}`

where `Pr(Y_0=1)` denotes the counterfactual probability of the outcome had the exposure been eliminated from the population. If `Z` is sufficient for confounding control then the probability `Pr(Y_0=1)` can be expressed as

`Pr(Y_0=1)=E_Z\{Pr(Y=1\mid{X}=0,Z)\}.`

Using Bayes' theorem this implies that the AF can be expressed as

`AF = 1-\frac{E_Z\{Pr(Y=1\mid X=0,Z)\}}{Pr(Y=1)}=1-E_Z\{RR^{-X}(Z)\mid{Y = 1}\}`

where `RR(Z)` is the risk ratio

`\frac{Pr(Y=1\mid{X=1,Z})}{Pr(Y=1\mid{X=0,Z})}.`
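The Bayes' theorem step can be spelled out as follows. This is a sketch written for a discrete `Z` (for a continuous `Z` the sum over `z` would become an integral). It uses the fact that, for a binary `X`, `RR^{-x}(z)Pr(Y=1\mid X=x,Z=z)=Pr(Y=1\mid X=0,Z=z)` for both `x=0` and `x=1`.

```latex
\begin{aligned}
E_Z\{RR^{-X}(Z)\mid Y=1\}
  &= \sum_{x,z} RR^{-x}(z)\,Pr(X=x,Z=z\mid Y=1) \\
  &= \sum_{x,z} RR^{-x}(z)\,\frac{Pr(Y=1\mid X=x,Z=z)\,Pr(X=x,Z=z)}{Pr(Y=1)} \\
  &= \sum_{x,z} \frac{Pr(Y=1\mid X=0,Z=z)\,Pr(X=x,Z=z)}{Pr(Y=1)} \\
  &= \frac{E_Z\{Pr(Y=1\mid X=0,Z)\}}{Pr(Y=1)}
\end{aligned}
```

The second line is Bayes' theorem applied to `Pr(X=x,Z=z\mid Y=1)`, and the last line follows by summing out `x`.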

Moreover, the risk ratio can be approximated by the odds ratio if the outcome is rare. Thus,

`AF \approx 1 - E_Z\{OR^{-X}(Z)\mid{Y = 1}\}.`
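The approximation above can be sketched in base R as an average of `OR^{-X}(Z)` over the sampled cases. This is an illustration under stated assumptions rather than the `AFglm` implementation: the data are simulated, the source population size `NN` is smaller than in the Examples to keep the run short, the names `dat`, `log_or`, `is_case` and `AF_hat` are hypothetical, and with the model `Y ~ X * Z` the individual log odds ratio is taken to be the exposure coefficient plus the interaction coefficient times `Z`.

```r
# Simulate a rare outcome in a source population, then draw cases and controls
set.seed(2)
expit <- function(x) 1 / (1 + exp(-x))
NN <- 200000  # assumed population size, chosen for speed
n <- 500      # number of cases and number of controls
Z <- rnorm(NN)
X <- rbinom(NN, size = 1, prob = expit(Z))
Y <- rbinom(NN, size = 1, prob = expit(-6 + X + Z))
population <- data.frame(Z, X, Y)
cases <- sample(which(population$Y == 1), n)
controls <- sample(which(population$Y == 0), n)
dat <- population[c(cases, controls), ]

# Logistic regression with an exposure-confounder interaction
fit <- glm(Y ~ X * Z, family = binomial, data = dat)
b <- coef(fit)

# Individual log odds ratio for the exposure: beta_X + beta_XZ * Z_i
log_or <- b["X"] + b["X:Z"] * dat$Z

# Average OR^(-X) among the cases only, then AF is one minus that average
is_case <- dat$Y == 1
AF_hat <- 1 - mean(exp(-log_or[is_case] * dat$X[is_case]))
AF_hat
```

Unexposed cases contribute `OR^0 = 1` to the average, so only exposed cases pull the estimate away from zero.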

If `clusterid` is supplied, then a clustered sandwich formula is used in all variance calculations.
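The idea behind a clustered sandwich formula can be illustrated on a much simpler estimator than the AF. The sketch below uses the sample mean, with hypothetical names throughout, and is not the variance code used by `AFglm`: contributions to the estimating equation are summed within each cluster before being squared, so that duplicated observations (as in Example 2) do not shrink the variance artificially.

```r
set.seed(3)
n <- 500
y <- rbinom(n, size = 1, prob = 0.3)

# Sandwich variance of the sample mean from n independent observations
u <- y - mean(y)
var_indep <- sum(u^2) / n^2

# Duplicate every observation so that each id forms a cluster of size 2
id <- rep(seq_len(n), 2)
y2 <- c(y, y)
u2 <- y2 - mean(y2)
N <- length(y2)

# Naive formula: treats all 2n rows as independent
var_naive <- sum(u2^2) / N^2

# Clustered formula: sum contributions within cluster, then square
u_cluster <- tapply(u2, id, sum)
var_cluster <- sum(u_cluster^2) / N^2

c(independent = var_indep, naive = var_naive, clustered = var_cluster)
```

In this toy setting the clustered variance equals the variance from the original `n` observations, whereas the naive formula reports half of it.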

### Value

`AF.est` |
estimated attributable fraction. |

`AF.var` |
estimated variance of |

`P.est` |
estimated factual proportion of cases; |

`P.var` |
estimated variance of |

`P0.est` |
estimated counterfactual proportion of cases if exposure would be eliminated; |

`P0.var` |
estimated variance of |

`log.or` |
a vector of the estimated log odds ratio for every individual. |
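As a sketch of how the returned estimate and variance might be combined, the snippet below builds an untransformed Wald interval from `AF.est` and `AF.var`. The list `est` is a stand-in with placeholder numbers, not output from a real analysis, and accessing the components with `$` is an assumption about the returned object; `summary` remains the documented way to inspect a fitted object.

```r
# Stand-in for a fitted object; both numbers are placeholders for illustration
est <- list(AF.est = 0.20, AF.var = 0.0009)

level <- 0.95
z <- qnorm(1 - (1 - level) / 2)  # standard normal quantile
se <- sqrt(est$AF.var)           # standard error from the estimated variance

# Untransformed Wald interval: estimate plus or minus z * se
ci <- est$AF.est + c(-1, 1) * z * se
ci
```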

### Author(s)

Elisabeth Dahlqwist, Arvid Sjölander

### References

Bruzzi, P., Green, S. B., Byar, D., Brinton, L. A., and Schairer, C. (1985). Estimating the population attributable risk for multiple risk factors using case-control data. *American Journal of Epidemiology* **122**, 904-914.

Greenland, S. and Drescher, K. (1993). Maximum likelihood estimation of the attributable fraction from logistic models. *Biometrics* **49**, 865-872.

Sjölander, A. and Vansteelandt, S. (2011). Doubly robust estimation of attributable fractions. *Biostatistics* **12**, 112-121.

### See Also

`glm` used for fitting the logistic regression model. For conditional logistic regression (commonly for data from a matched case-control sampling design) see `AFclogit`.

### Examples

```
# Simulate a cross-sectional sample
expit <- function(x) 1 / (1 + exp( - x))
n <- 1000
Z <- rnorm(n = n)
X <- rbinom(n = n, size = 1, prob = expit(Z))
Y <- rbinom(n = n, size = 1, prob = expit(Z + X))
# Example 1: non clustered data from a cross-sectional sampling design
data <- data.frame(Y, X, Z)
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_est <- AFglm(object = fit, data = data, exposure = "X")
summary(AFglm_est)
# Example 2: clustered data from a cross-sectional sampling design
# Duplicate observations in order to create clustered data
id <- rep(1:n, 2)
data <- data.frame(id = id, Y = c(Y, Y), X = c(X, X), Z = c(Z, Z))
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_clust <- AFglm(object = fit, data = data,
                     exposure = "X", clusterid = "id")
summary(AFglm_clust)
# Example 3: non matched case-control
# Simulate a sample from a non matched case-control sampling design
# Make the outcome a rare event by setting the intercept to -6
expit <- function(x) 1 / (1 + exp( - x))
NN <- 1000000
n <- 500
intercept <- -6
Z <- rnorm(n = NN)
X <- rbinom(n = NN, size = 1, prob = expit(Z))
Y <- rbinom(n = NN, size = 1, prob = expit(intercept + X + Z))
population <- data.frame(Z, X, Y)
Case <- which(population$Y == 1)
Control <- which(population$Y == 0)
# Sample cases and controls from the population
case <- sample(Case, n)
control <- sample(Control, n)
data <- population[c(case, control), ]
# Fit a glm object
fit <- glm(formula = Y ~ X + Z + X * Z, family = binomial, data = data)
# Estimate the attributable fraction from the fitted logistic regression
AFglm_est_cc <- AFglm(object = fit, data = data, exposure = "X", case.control = TRUE)
summary(AFglm_est_cc)
```

*AF* version 0.1.5