AFclogit {AF}    R Documentation

Attributable fraction estimation based on a conditional logistic regression model as a clogit object (commonly used for matched case-control sampling designs).


Description

AFclogit estimates the model-based adjusted attributable fraction from a conditional logistic regression model in the form of a clogit object. This model is commonly used for data from matched case-control sampling designs.


Usage

AFclogit(object, data, exposure, clusterid)



Arguments

object
a fitted conditional logistic regression model object of class "clogit".

data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called.

exposure
the name of the exposure variable as a string. The exposure must be binary (0/1), with unexposed coded as 0.

clusterid
the name of the cluster identifier variable as a string. Because conditional logistic regression is only used for clustered data, this argument must be supplied.
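
The requirements on exposure and clusterid can be checked before the model is fitted. The following is a minimal sketch in base R; the data frame dat and the helper check_inputs are hypothetical names introduced here and are not part of the AF package.

```r
# Hypothetical toy data: two matched pairs, exposure stored as a factor
dat <- data.frame(
  id = c(1, 1, 2, 2),
  Y  = c(1, 0, 0, 1),
  X  = factor(c("yes", "no", "no", "yes"))
)

# Recode the exposure so that unexposed is 0 and exposed is 1
dat$X <- as.integer(dat$X == "yes")

# Hypothetical helper: stop early if the inputs do not meet the documented requirements
check_inputs <- function(data, exposure, clusterid) {
  stopifnot(
    is.character(exposure), exposure %in% names(data),
    is.character(clusterid), clusterid %in% names(data),
    all(data[[exposure]] %in% c(0, 1))
  )
  invisible(TRUE)
}

check_inputs(dat, exposure = "X", clusterid = "id")
```

Both names are passed as strings, matching the way the arguments are described above.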


Details

AFclogit estimates the attributable fraction for a binary outcome Y under the hypothetical scenario where a binary exposure X is eliminated from the population. The estimate is adjusted for confounders Z by conditional logistic regression. The estimation assumes that the outcome is rare, so that the risk ratio can be approximated by the odds ratio; for details see Bruzzi et al. (1985). Let the AF be defined as

AF = 1 - Pr(Y0 = 1) / Pr(Y = 1)

where Pr(Y0 = 1) denotes the counterfactual probability of the outcome had the exposure been eliminated from the population. If Z is sufficient for confounding control, then the probability Pr(Y0 = 1) can be expressed as

Pr(Y0=1) = E_z{Pr(Y = 1 | X = 0, Z)}.
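
To make the definition concrete, the sketch below evaluates the AF on a small hypothetical population with one binary confounder. All probabilities are invented for illustration, and the names p_z, p_x_z and risk are assumptions introduced here, not objects from the AF package.

```r
# Hypothetical population with a binary confounder Z and a binary exposure X
z     <- c(0, 1)
p_z   <- c(0.7, 0.3)                        # Pr(Z = 0), Pr(Z = 1)
p_x_z <- c(0.2, 0.5)                        # Pr(X = 1 | Z = 0), Pr(X = 1 | Z = 1)
risk  <- function(x, z) 0.01 * 2^x * 3^z    # Pr(Y = 1 | X = x, Z = z)

# Factual risk Pr(Y = 1): average over the joint distribution of (X, Z)
p_y <- sum(p_z * (p_x_z * risk(1, z) + (1 - p_x_z) * risk(0, z)))

# Counterfactual risk Pr(Y0 = 1): set X = 0 for everyone, then average over Z
p_y0 <- sum(p_z * risk(0, z))

AF <- 1 - p_y0 / p_y
AF
```

In this toy population the exposure doubles the risk at every level of Z, and removing it would prevent roughly a quarter of the cases.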

Using Bayes' theorem this implies that the AF can be expressed as

AF = 1 - E_z{Pr( Y = 1 | X = 0, Z)} / Pr(Y = 1) = 1 - E_z{RR^{-X} (Z) | Y = 1}

where RR(Z) is the risk ratio

Pr(Y = 1 | X = 1,Z)/Pr(Y=1 | X = 0, Z).
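
The step from the first to the second expression for the AF can be spelled out as follows. This is a sketch of the standard argument rather than text from the package documentation. The symbol f for the joint distribution of (X, Z) is notation introduced here, and the conditional expectation given Y = 1 is read as an average over (X, Z) among cases.

```latex
\begin{align*}
\Pr(Y=1 \mid X=0, Z)
  &= RR(Z)^{-X}\,\Pr(Y=1 \mid X, Z), \qquad X \in \{0, 1\}, \\
E_Z\{\Pr(Y=1 \mid X=0, Z)\}
  &= \sum_{x,z} RR(z)^{-x}\,\Pr(Y=1 \mid x, z)\, f(x, z) \\
  &= \Pr(Y=1) \sum_{x,z} RR(z)^{-x}\, f(x, z \mid Y=1)
     \qquad \text{(Bayes' theorem)} \\
  &= \Pr(Y=1)\; E\{RR(Z)^{-X} \mid Y=1\}.
\end{align*}
```

Dividing both sides by Pr(Y = 1) and subtracting from 1 gives the second expression for the AF.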

Moreover, the risk ratio can be approximated by the odds ratio if the outcome is rare. Thus,

AF is approximately 1 - E_z{OR^{-X}(Z) | Y = 1}.
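
The quality of the rare-outcome approximation can be checked numerically. The sketch below assumes a hypothetical logistic model without interaction and compares the exact AF with the odds-ratio version for a rare and a common outcome. The function af_pair and all parameter values are illustrative assumptions.

```r
expit <- function(x) 1 / (1 + exp(-x))

# Hypothetical model: logit Pr(Y = 1 | X, Z) = a + b * X + g * Z
b <- log(2)
g <- log(3)
p_z   <- c(0.7, 0.3)   # Pr(Z = 0), Pr(Z = 1)
p_x_z <- c(0.2, 0.5)   # Pr(X = 1 | Z = 0), Pr(X = 1 | Z = 1)

af_pair <- function(a) {
  grid <- expand.grid(x = 0:1, z = 0:1)
  # Joint probability Pr(X = x, Z = z)
  px <- p_x_z[grid$z + 1]
  grid$p <- p_z[grid$z + 1] * ifelse(grid$x == 1, px, 1 - px)
  risk  <- expit(a + b * grid$x + g * grid$z)   # factual risk
  risk0 <- expit(a + g * grid$z)                # risk with X set to 0
  p_y   <- sum(grid$p * risk)
  exact <- 1 - sum(grid$p * risk0) / p_y
  # Distribution of (X, Z) among cases, then 1 - E{OR^{-X} | Y = 1}
  p_case <- grid$p * risk / p_y
  approx <- 1 - sum(p_case * exp(-b * grid$x))
  c(exact = exact, approx = approx)
}

rare   <- af_pair(a = -6)   # risks of roughly 0.2% to 1.5%
common <- af_pair(a = 0)    # risks of 50% and above
rare
common
```

With the rare outcome the two values nearly coincide, whereas with the common outcome the odds-ratio version overstates the AF noticeably.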

The odds ratio is estimated by conditional logistic regression. The function gee in the drgee package is used to get the score contributions for each cluster and the Hessian. A clustered sandwich formula is used in the variance calculation.
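
The idea of a clustered sandwich variance can be illustrated in base R. The sketch below is an assumption-laden illustration, not the package's internal code: it uses an ordinary (unconditional) logistic regression fitted with glm instead of drgee, and simulated data, only to show how cluster-level score sums and the Hessian combine.

```r
set.seed(1)

# Hypothetical clustered data: 200 clusters of size 2 with a shared cluster effect
n_clusters <- 200
id <- rep(seq_len(n_clusters), each = 2)
u  <- rep(rnorm(n_clusters), each = 2)
x  <- rbinom(length(id), size = 1, prob = 0.5)
y  <- rbinom(length(id), size = 1, prob = plogis(-0.5 + x + u))

fit <- glm(y ~ x, family = binomial)
X   <- model.matrix(fit)
mu  <- fitted(fit)

# Score contribution of each observation: (y - fitted value) times its design row
scores <- (y - mu) * X
# Sum the score contributions within each cluster
cluster_scores <- rowsum(scores, group = id)

# "Bread": inverse of the negative Hessian X' W X with W = diag(mu * (1 - mu))
bread <- solve(crossprod(X * sqrt(mu * (1 - mu))))
# "Meat": sum over clusters of the outer products of the cluster scores
meat  <- crossprod(cluster_scores)

V_sandwich <- bread %*% meat %*% bread
sqrt(diag(V_sandwich))   # cluster-robust standard errors
```

Summing scores within clusters before forming the outer products is what allows observations in the same cluster to be correlated.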



Value

AF.est
estimated attributable fraction.

AF.var
estimated variance of AF.est. The variance is obtained by combining the delta method with the sandwich formula.

log.or
a vector of the estimated log odds ratio for every individual. log.or contains the estimated coefficient for the exposure variable X for every level of the confounder Z as specified by the user in the formula. If the model to be estimated is

logit{Pr(Y=1|X,Z)} = α + β X + γ Z

then log.or is the estimate of β. If the model to be estimated is

logit{Pr(Y=1|X,Z)} = α + β X + γ Z + ψ XZ

then log.or is the estimate of β + ψ Z.
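
The two cases for the individual log odds ratio can be reproduced by hand. The coefficient values and confounder values below are hypothetical and serve only to illustrate the formula β + ψ Z.

```r
# Hypothetical coefficient estimates from a model with an X:Z interaction
beta_hat <- 0.7    # coefficient of X
psi_hat  <- -0.2   # coefficient of the X:Z interaction
Z <- c(-1, 0, 0.5, 2)   # confounder values for four individuals

# Individual-specific log odds ratio for the exposure: beta + psi * Z
log_or <- beta_hat + psi_hat * Z

# Without an interaction term the log odds ratio is the same for everyone
log_or_no_interaction <- rep(beta_hat, length(Z))

log_or
log_or_no_interaction
```

With an interaction the vector varies with Z; without one it is constant across individuals.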


Author(s)

Elisabeth Dahlqwist, Arvid Sjölander


References

Bruzzi, P., Green, S. B., Byar, D., Brinton, L. A., and Schairer, C. (1985). Estimating the population attributable risk for multiple risk factors using case-control data. American Journal of Epidemiology 122, 904-914.

See Also

clogit, used for fitting the conditional logistic regression model for matched case-control designs. For non-matched case-control designs, see AFglm.


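Examples

library(survival)  # provides clogit() and strata()
library(AF)        # provides AFclogit()

expit <- function(x) 1 / (1 + exp(-x))
NN <- 1000000  # number of simulated pairs in the source population
n <- 500       # number of discordant pairs to sample

# Example 1: matched case-control
# Simulate two individuals per pair in order to create a matched data sample
# Create an unobserved confounder U common to each pair of individuals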
intercept <- -6
U  <- rnorm(n = NN)
Z1 <- rnorm(n = NN)
Z2 <- rnorm(n = NN)
X1 <- rbinom(n = NN, size = 1, prob = expit(U + Z1))
X2 <- rbinom(n = NN, size = 1, prob = expit(U + Z2))
Y1 <- rbinom(n = NN, size = 1, prob = expit(intercept + U + Z1 + X1))
Y2 <- rbinom(n = NN, size = 1, prob = expit(intercept + U + Z2 + X2))
# Select discordant pairs
discordant <- which(Y1!=Y2)
id <- rep(1:n, 2)
# Sample from discordant pairs
incl <- sample(x = discordant, size = n, replace = TRUE)
data <- data.frame(id = id, Y = c(Y1[incl], Y2[incl]), X = c(X1[incl], X2[incl]),
                   Z = c(Z1[incl], Z2[incl]))

# Fit a clogit object
fit <- clogit(formula = Y ~ X + Z + X * Z + strata(id), data = data)

# Estimate the attributable fraction from the fitted conditional logistic regression
AFclogit_est <- AFclogit(fit, data, exposure = "X", clusterid="id")
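
Once an estimate and its variance are available, a Wald-type confidence interval follows directly. The helper wald_ci below is a hypothetical convenience function, not part of the AF package, and the numbers passed to it are illustrative rather than output from AFclogit.

```r
# Hypothetical helper: Wald-type confidence interval from an estimate and its variance
wald_ci <- function(est, var, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  c(lower = est - z * sqrt(var), upper = est + z * sqrt(var))
}

# Illustrative numbers only
ci <- wald_ci(est = 0.30, var = 0.0025)
ci
```

With a fitted object, the estimate and variance documented under Value (the estimate AF.est and its estimated variance) would be passed in place of the illustrative numbers.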

[Package AF version 0.1.5 Index]