stdGlm {stdReg} | R Documentation |
Regression standardization in generalized linear models
Description
stdGlm performs regression standardization in generalized linear models, at specified values of the exposure, over the sample covariate distribution. Let $Y$, $X$, and $Z$ be the outcome, the exposure, and a vector of covariates, respectively. stdGlm uses a fitted generalized linear model to estimate the standardized mean $\theta(x) = E\{E(Y \mid X = x, Z)\}$, where $x$ is a specific value of $X$, and the outer expectation is over the marginal distribution of $Z$.
Usage
stdGlm(fit, data, X, x, clusterid, case.control = FALSE, subsetnew)
Arguments
fit |
an object of class "glm", as returned by the glm function in the stats package. |
data |
a data frame containing the variables in the model. This should be the same data frame as was used to fit the model in fit. |
X |
a string containing the name of the exposure variable X in data. |
x |
an optional vector containing the specific values of X at which to estimate the standardized mean. |
clusterid |
an optional string containing the name of a cluster identification variable when data are clustered. |
case.control |
logical. Do data come from a case-control study? Defaults to FALSE. |
subsetnew |
an optional logical statement specifying a subset of observations to be used in the standardization. This set is assumed to be a subset of the subset (if any) that was used to fit the regression model. |
Details
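stdGlm assumes that a generalized linear model
$$\eta\{E(Y \mid X, Z)\} = h(X, Z; \beta)$$
has been fitted. The maximum likelihood estimate of $\beta$ is used to obtain estimates of the mean $E(Y \mid X = x, Z)$:
$$\hat{E}(Y \mid X = x, Z) = \eta^{-1}\{h(X = x, Z; \hat{\beta})\}.$$
For each $x$ in the x argument, these estimates are averaged across all subjects (i.e. all observed values of $Z$) to produce estimates
$$\hat{\theta}(x) = \sum_{i=1}^{n} \hat{E}(Y \mid X = x, Z_i) / n,$$
where $Z_i$ is the value of $Z$ for subject $i$, $i = 1, \ldots, n$.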
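The averaging step described above can be reproduced by hand in base R, which may help when checking what the point estimates mean. The sketch below does not call stdGlm; it assumes a simulated logistic model, and the names x.values and dd0 are made up for the illustration.

```r
# Simulated data: Z confounds the effect of X on a binary outcome Y
set.seed(1)
n <- 500
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rbinom(n, 1, prob = plogis(X + Z))
dd <- data.frame(Z, X, Y)

fit <- glm(Y ~ X + Z, family = "binomial", data = dd)

# For each exposure level, set X to that level for every subject,
# predict on the response scale, and average over the observed Z
x.values <- c(-1, 0, 1)
theta.hat <- sapply(x.values, function(x0) {
  dd0 <- dd
  dd0$X <- x0
  mean(predict(fit, newdata = dd0, type = "response"))
})
names(theta.hat) <- x.values
print(theta.hat)
```

Each element of theta.hat is the sample average of the predicted means with X fixed at one level, that is, the estimate of the standardized mean at that level.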
The variance for $\hat{\theta}(x)$ is obtained by the sandwich formula.
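One way to read the sandwich formula in this setting is as the empirical variance of per-subject influence functions from the stacked estimating equations for the regression coefficients and the standardized mean. The base-R sketch below works through that calculation for a logistic model with canonical link. It is an illustration under these assumptions and is not taken from the stdGlm source; the names x0, IF.beta and IF.theta are hypothetical.

```r
set.seed(1)
n <- 500
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rbinom(n, 1, prob = plogis(X + Z))
dd <- data.frame(Z, X, Y)
fit <- glm(Y ~ X + Z, family = binomial, data = dd)
fam <- family(fit)

# Design matrices: observed exposure, and exposure set to x0 for everyone
x0 <- 1
dd0 <- dd
dd0$X <- x0
D  <- model.matrix(fit)
D0 <- model.matrix(formula(fit), data = dd0)

# Point estimate of the standardized mean at X = x0
eta0 <- drop(D0 %*% coef(fit))
mu0 <- fam$linkinv(eta0)
theta.hat <- mean(mu0)

# Influence function of the coefficient estimates.
# With a canonical link the score is D_i * (Y_i - mu_i) and the
# "bread" is the average of (dmu/deta)_i * D_i D_i'.
w <- fam$mu.eta(fit$linear.predictors)
A <- crossprod(D * w, D) / n
U <- D * (dd$Y - fitted(fit))
IF.beta <- U %*% solve(A)

# Gradient of the standardized mean with respect to the coefficients
g <- colMeans(D0 * fam$mu.eta(eta0))

# Influence function of theta-hat: variation over Z plus the
# contribution from estimating the coefficients
IF.theta <- (mu0 - theta.hat) + drop(IF.beta %*% g)
var.theta <- sum(IF.theta^2) / n^2
se.theta <- sqrt(var.theta)
print(c(estimate = theta.hat, se = se.theta))
```

The first term of IF.theta captures the variability of the covariate average, so this variance is not conditional on the observed covariates.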
Value
An object of class "stdGlm" is a list containing
call |
the matched call. |
input |
a list containing all input arguments. |
est |
a vector with length equal to length(x), where element j is the estimate $\hat{\theta}(x[j])$. |
vcov |
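a square matrix with length(x) rows, where the element on row i and column j is the estimated covariance of $\hat{\theta}(x[i])$ and $\hat{\theta}(x[j])$. |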
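The est and vcov elements are enough to build an interval for a contrast between two exposure levels. The sketch below uses hypothetical numbers standing in for fit.std$est and fit.std$vcov (they are not output from any real fit) and computes a difference in standardized means with a Wald-type 95% confidence interval.

```r
# Hypothetical values standing in for fit.std$est and fit.std$vcov
est <- c(0.30, 0.45)
V <- matrix(c(4e-4, 1e-4,
              1e-4, 5e-4), nrow = 2, byrow = TRUE)

# Contrast: second exposure level minus the first
contrast <- c(-1, 1)
diff.est <- sum(contrast * est)

# Variance of a linear combination: c' V c
diff.se <- sqrt(drop(t(contrast) %*% V %*% contrast))

# Wald-type 95% confidence interval
ci <- diff.est + c(-1, 1) * qnorm(0.975) * diff.se
print(c(difference = diff.est, se = diff.se, lower = ci[1], upper = ci[2]))
```

The off-diagonal element of the covariance matrix matters here: ignoring it would overstate the variance of the difference whenever the two estimates are positively correlated.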
Note
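The variance calculation performed by stdGlm does not condition on the observed covariates $\bar{Z} = (Z_1, \ldots, Z_n)$. To see how this matters, note that
$$\mathrm{var}\{\hat{\theta}(x)\} = E[\mathrm{var}\{\hat{\theta}(x) \mid \bar{Z}\}] + \mathrm{var}[E\{\hat{\theta}(x) \mid \bar{Z}\}].$$
The usual parameter $\beta$ in a generalized linear model does not depend on $\bar{Z}$. Thus, $E(\hat{\beta} \mid \bar{Z})$ is independent of $\bar{Z}$ as well (since $E(\hat{\beta} \mid \bar{Z}) = \beta$), so that the term $\mathrm{var}[E(\hat{\beta} \mid \bar{Z})]$ in the corresponding variance decomposition for $\mathrm{var}(\hat{\beta})$ becomes equal to 0. However, $\theta(x)$ depends on $\bar{Z}$ through the average over the sample distribution for $Z$, and thus the term $\mathrm{var}[E\{\hat{\theta}(x) \mid \bar{Z}\}]$ is not 0, unless one conditions on $\bar{Z}$.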
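A small Monte Carlo experiment can make the two variance terms visible. The sketch below assumes a correctly specified linear model and computes the standardized mean by hand rather than with stdGlm; the helper theta_hat and the settings n, reps and x0 are made up for the illustration. In one run the covariates are held fixed across replications, in the other they are redrawn each time.

```r
set.seed(2)
n <- 200
reps <- 500
x0 <- 0

# Simulate exposure and outcome given Z, fit the model, and return
# the standardized mean at X = x0 (average prediction over Z)
theta_hat <- function(Z) {
  X <- rnorm(n, mean = Z)
  Y <- rnorm(n, mean = X + Z)
  dd <- data.frame(Z, X, Y)
  fit <- glm(Y ~ X + Z, data = dd)
  dd$X <- x0
  mean(predict(fit, newdata = dd))
}

# Conditional on the covariates: the same Z in every replication
Z.fixed <- rnorm(n)
est.cond <- replicate(reps, theta_hat(Z.fixed))

# Unconditional: a fresh covariate sample in every replication
est.uncond <- replicate(reps, theta_hat(rnorm(n)))

print(c(conditional = var(est.cond), unconditional = var(est.uncond)))
```

Under this data-generating model the conditional mean of the estimate is x0 plus the sample mean of Z, so the extra term in the unconditional variance is var(Z)/n, and the unconditional variance should come out noticeably larger.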
Author(s)
Arvid Sjolander.
References
Rothman K.J., Greenland S., Lash T.L. (2008). Modern Epidemiology, 3rd edition. Lippincott, Williams & Wilkins.
Sjolander A. (2016). Regression standardization with the R-package stdReg. European Journal of Epidemiology 31(6), 563-574.
Sjolander A. (2018). Estimation of causal effect measures with the R-package stdReg. European Journal of Epidemiology 33(9), 847-858.
Examples
library(stdReg)

##Example 1: continuous outcome
n <- 1000
Z <- rnorm(n)
X <- rnorm(n, mean=Z)
Y <- rnorm(n, mean=X+Z+0.1*X^2)
dd <- data.frame(Z, X, Y)
fit <- glm(formula=Y~X+Z+I(X^2), data=dd)
fit.std <- stdGlm(fit=fit, data=dd, X="X", x=seq(-3,3,0.5))
print(summary(fit.std))
plot(fit.std)
##Example 2: binary outcome
n <- 1000
Z <- rnorm(n)
X <- rnorm(n, mean=Z)
Y <- rbinom(n, 1, prob=(1+exp(X+Z))^(-1))
dd <- data.frame(Z, X, Y)
fit <- glm(formula=Y~X+Z+X*Z, family="binomial", data=dd)
fit.std <- stdGlm(fit=fit, data=dd, X="X", x=seq(-3,3,0.5))
print(summary(fit.std))
plot(fit.std)