glm_weightit {WeightIt} | R Documentation |
Fitting Weighted Generalized Linear Models
Description
lm_weightit() and glm_weightit() are used to fit (generalized) linear models with a variance matrix that accounts for estimation of weights, if supplied. By default, these functions use M-estimation to construct a robust covariance matrix using the estimating equations for the weighting model and the outcome model. lm_weightit() is a wrapper for glm_weightit() with the Gaussian family and identity link (i.e., a linear model). coxph_weightit() fits a Cox proportional hazards model accounting for the weights and is a wrapper for survival::coxph().
Usage
glm_weightit(
  formula,
  data,
  family = gaussian,
  weightit,
  vcov = NULL,
  cluster,
  R = 500,
  offset,
  start = NULL,
  etastart,
  mustart,
  control = list(...),
  x = FALSE,
  y = TRUE,
  contrasts = NULL,
  fwb.args = list(),
  ...
)
coxph_weightit(
  formula,
  data,
  weightit,
  vcov = NULL,
  cluster,
  R = 500,
  x = FALSE,
  y = TRUE,
  fwb.args = list(),
  ...
)
lm_weightit(
  formula,
  data,
  weightit,
  vcov = NULL,
  cluster,
  R = 500,
  offset,
  start = NULL,
  etastart,
  mustart,
  control = list(...),
  x = FALSE,
  y = TRUE,
  contrasts = NULL,
  ...
)
## S3 method for class 'glm_weightit'
summary(object, ci = FALSE, level = 0.95, transform = NULL, ...)
Arguments
formula |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. For |
data |
a data frame containing the variables in the model. If not found in data, the variables are taken from |
family |
a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. See family for details of family functions. Can also be the string |
weightit |
a |
vcov |
string; the method used to compute the variance of the estimated parameters. Allowable options include |
cluster |
optional; for computing a cluster-robust variance matrix, a variable indicating the clustering of observations, a list (or data frame) thereof, or a one-sided formula specifying which variable(s) from the fitted model should be used. Note the cluster-robust variance matrix uses a correction for small samples, as is done in |
R |
the number of bootstrap replications when |
offset |
optional; a numeric vector containing the model offset. See |
start |
optional starting values for the coefficients. |
etastart , mustart |
optional starting values for the linear predictor and vector of means when |
control |
a list of parameters for controlling the fitting process. |
x , y |
logical values indicating whether the model matrix (x) and response vector (y) used in the fitting process should be returned as components of the returned value. |
contrasts |
an optional list defining contrasts for factor variables. See |
fwb.args |
an optional list of further arguments to supply to |
... |
for |
object |
a |
ci |
|
level |
when |
transform |
the function used to transform the coefficients, e.g., |
Details
glm_weightit() is essentially a wrapper for glm() that optionally computes a coefficient variance matrix that can be adjusted to account for estimation of the weights if a weightit or weightitMSM object is supplied to the weightit argument. When no argument is supplied to weightit or there is no "Mparts" attribute in the supplied object, the default variance matrix returned will be the "HC0" sandwich variance matrix, which is robust to misspecification of the outcome family (including heteroscedasticity). Otherwise, the default variance matrix uses M-estimation to additionally adjust for estimation of the weights. When possible, this often yields smaller (and more accurate) standard errors. See the individual methods pages to see whether and when an "Mparts" attribute is included in the supplied object. To request that a variance matrix be computed that doesn't account for estimation of the weights even when a compatible weightit object is supplied, set vcov = "HC0", which treats the weights as fixed.
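The difference between treating the weights as fixed and adjusting for their estimation can be illustrated with a hand-rolled base R sketch. This is not WeightIt's internal code: the simulated data, the helper names (att_w, psi), the logistic weighting model with ATT weights, and the finite-difference Jacobian are all assumptions made for illustration. It first computes the "HC0" sandwich variance with the weights held fixed, then stacks the estimating equations of the weighting and outcome models and forms the M-estimation variance.

```r
set.seed(2024)
n <- 500
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.4 * x))   # simulated treatment
y <- 1 + a + x + rnorm(n)            # simulated outcome
Z <- cbind(1, x)                     # design matrix, weighting model
X <- cbind(1, a)                     # design matrix, outcome model

# ATT weights implied by weighting-model coefficients g (hypothetical helper)
att_w <- function(g) {
  p <- plogis(drop(Z %*% g))
  ifelse(a == 1, 1, p / (1 - p))
}

# Usual two-step point estimates
g_hat <- coef(glm(a ~ x, family = binomial))
w <- att_w(g_hat)
b_hat <- coef(lm(y ~ a, weights = w))
theta_hat <- c(g_hat, b_hat)

# (1) HC0 sandwich variance, treating the weights as fixed
e <- y - drop(X %*% b_hat)
bread <- solve(crossprod(X, w * X))
scores <- X * (w * e)                # per-observation outcome scores
se_fixed <- sqrt(diag(bread %*% crossprod(scores) %*% bread))

# (2) Stacked estimating functions: logistic score, then weighted LS score
psi <- function(theta) {
  g <- theta[1:2]
  b <- theta[3:4]
  p <- plogis(drop(Z %*% g))
  cbind(Z * (a - p), X * (att_w(g) * (y - drop(X %*% b))))
}

# Jacobian of the mean estimating function by central differences
A <- sapply(seq_along(theta_hat), function(j) {
  h <- replace(numeric(4), j, 1e-5)
  (colMeans(psi(theta_hat + h)) - colMeans(psi(theta_hat - h))) / 2e-5
})
B <- crossprod(psi(theta_hat)) / n
A_inv <- solve(A)
V <- A_inv %*% B %*% t(A_inv) / n
se_adjusted <- sqrt(diag(V))[3:4]    # outcome-model coefficients only

rbind(se_fixed = se_fixed, se_adjusted = se_adjusted)
```

The off-diagonal block of A linking the outcome score to the weighting-model coefficients is what carries the adjustment; dropping it recovers the fixed-weights HC0 result.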
Bootstrapping can also be used to compute the coefficient variance matrix; when vcov = "BS" or vcov = "FWB", which implement the traditional resampling-based and fractional weighted bootstrap, respectively, the entire process of estimating the weights and fitting the outcome model is repeated in bootstrap samples (if a weightit object is supplied). This accounts for estimation of the weights and can be used with any weighting method. It is important to set a seed using set.seed() to ensure replicability of the results. The fractional weighted bootstrap is more reliable but requires the weighting method to accept sampling weights (most do; an error is thrown otherwise). Setting vcov = "FWB" and supplying fwb.args = list(wtype = "multinom") also performs the resampling-based bootstrap but with the additional features fwb provides (e.g., a progress bar and parallelization) at the expense of needing to have fwb installed.
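The two bootstrap schemes described above can be sketched by hand in base R. This is an illustration under stated assumptions rather than the code WeightIt or fwb runs: the data are simulated, est_fun is a hypothetical helper that re-estimates ATT weights from a logistic model and refits the outcome model, and the fractional weights are drawn from an exponential distribution.

```r
set.seed(123)  # needed for replicable bootstrap results
n <- 300
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))
y <- 1 + a + x + rnorm(n)
d <- data.frame(x, a, y)

# Re-estimate the weights and the treatment coefficient, given
# bootstrap sampling weights s (all 1 for an unweighted sample)
est_fun <- function(d, s = rep(1, nrow(d))) {
  d$s <- s
  # quasibinomial avoids warnings about non-integer weights
  ps_fit <- glm(a ~ x, data = d, family = quasibinomial, weights = s)
  ps <- fitted(ps_fit)
  # ATT weights multiplied by the sampling weights
  d$w <- ifelse(d$a == 1, 1, ps / (1 - ps)) * d$s
  unname(coef(lm(y ~ a, data = d, weights = w))["a"])
}

R <- 200

# Fractional weighted bootstrap: every unit is kept and receives a
# random positive weight in each replication
fwb_est <- replicate(R, est_fun(d, s = rexp(nrow(d))))

# Traditional bootstrap: rows are resampled with replacement
bs_est <- replicate(R, est_fun(d[sample(nrow(d), replace = TRUE), ]))

c(estimate = est_fun(d), se_fwb = sd(fwb_est), se_bs = sd(bs_est))
```

Because the fractional weighted bootstrap never drops a unit, every replication retains all treated and control observations, which is one reason it tends to behave better in small or sparse samples.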
When family = "multinomial", multinomial logistic regression is fit using a custom function in WeightIt that uses M-estimation to estimate the model coefficients. This implementation is less robust to failures than other multinomial logistic regression solvers and should be used with caution. Estimation of coefficients should align with that from mlogit::mlogit() and mclogit::mblogit().
Functions in the sandwich package can be used to compute standard errors after fitting, regardless of how vcov was specified, though these will ignore estimation of the weights, if any. When no adjustment is done for estimation of the weights (i.e., because no weightit argument was supplied or there was no "Mparts" attribute in the supplied object), the default variance matrix produced by glm_weightit() should align with that from sandwich::vcovHC(., type = "HC0") or sandwich::vcovCL(., type = "HC0", cluster = cluster) when cluster is supplied.
coxph_weightit() is a wrapper for survival::coxph() to fit weighted survival models, optionally accounting for estimation of the weights. It differs from coxph() in a few ways: the print() and summary() methods are more like those for glm objects than for coxph objects, and the cluster argument should be specified as a one-sided formula (which can include multiple clustering variables) and uses a small sample correction for cluster variance estimates when specified. Currently, M-estimation is not supported, so bootstrapping (i.e., vcov = "BS" or "FWB") is the only way to correctly adjust for estimation of the weights.
Value
For lm_weightit() and glm_weightit(), a glm_weightit object, which inherits from glm. Unless vcov = "none", the vcov component contains the covariance matrix adjusted for the estimation of the weights if requested and a compatible weightit object was supplied. The vcov_type component contains the type of variance matrix requested. If cluster is supplied, it will be stored in the "cluster" attribute of the output object, even if not used. For coxph_weightit(), a coxph_weightit object, which inherits from glm_weightit and coxph. See survival::coxph() for details.
print(), vcov(), predict(), and confint() methods are also available; these generally follow the same pattern as the respective methods for glm objects. confint() uses Wald confidence intervals (internally calling confint.lm()). When family = "multinomial", predict() produces a matrix of predicted probabilities, one for each level of the outcome, and the type argument is ignored. model.frame() output (also the model component of the output object) will include two extra columns when weightit is supplied: (weights), containing the weights used in the model (the product of the estimated weights and the sampling weights, if any), and (s.weights), containing the sampling weights, which will be 1 if s.weights is not supplied in the original weightit() call.
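As a reminder of what a Wald confidence interval is, the following base R sketch builds one by hand from a coefficient vector and its variance matrix. It uses a plain glm() fit on simulated data with the model-based variance matrix and a standard normal reference distribution; these are assumptions of the sketch, and a glm_weightit fit would supply its own (robust or adjusted) variance matrix through vcov().

```r
set.seed(42)
n <- 250
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
fit <- glm(y ~ x, family = binomial)

est <- coef(fit)
se <- sqrt(diag(vcov(fit)))       # standard errors from the variance matrix
level <- 0.95
z <- qnorm(1 - (1 - level) / 2)   # normal critical value

# Wald interval: estimate plus or minus critical value times standard error
ci <- cbind(lower = est - z * se, upper = est + z * se)
ci

# Interval on the odds-ratio scale, obtained by exponentiating
# the estimate and both endpoints
exp(cbind(estimate = est, ci))
```

Because the interval depends only on the estimates and the variance matrix, changing how the variance matrix is computed changes the interval without refitting the model.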
See Also
lm() and glm() for fitting generalized linear models without adjusting standard errors for estimation of the weights.
survival::coxph() for fitting Cox proportional hazards models without adjusting standard errors for estimation of the weights.
Examples
library(WeightIt)
data("lalonde", package = "cobalt")
# Logistic regression ATT weights
w.out <- weightit(treat ~ age + educ + married + re74,
                  data = lalonde, method = "glm",
                  estimand = "ATT")
# Linear regression outcome model that adjusts
# for estimation of weights
fit1 <- lm_weightit(re78 ~ treat, data = lalonde,
                    weightit = w.out)
summary(fit1)
# Linear regression outcome model that treats weights
# as fixed
fit2 <- lm_weightit(re78 ~ treat, data = lalonde,
                    weightit = w.out, vcov = "HC0")
summary(fit2)
# Linear regression outcome model that bootstraps
# estimation of weights and outcome model fitting
# using fractional weighted bootstrap with "Mammen"
# weights
set.seed(123)
fit3 <- lm_weightit(re78 ~ treat, data = lalonde,
                    weightit = w.out,
                    vcov = "FWB",
                    R = 50,
                    fwb.args = list(wtype = "mammen"))
summary(fit3)
# Multinomial logistic regression outcome model
# that adjusts for estimation of weights
lalonde$re78_3 <- factor(findInterval(lalonde$re78,
                                      c(0, 5e3, 1e4)))
fit4 <- glm_weightit(re78_3 ~ treat, data = lalonde,
                     weightit = w.out,
                     family = "multinomial")
summary(fit4)