vglm {VGAM} | R Documentation |
Fitting Vector Generalized Linear Models
Description
vglm fits vector generalized linear models (VGLMs).
This very large class of models includes
generalized linear models (GLMs) as a special case.
Usage
vglm(formula,
family = stop("argument 'family' needs to be assigned"),
data = list(), weights = NULL, subset = NULL,
na.action, etastart = NULL, mustart = NULL,
coefstart = NULL, control = vglm.control(...), offset = NULL,
method = "vglm.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
contrasts = NULL, constraints = NULL, extra = list(),
form2 = NULL, qr.arg = TRUE, smart = TRUE, ...)
Arguments
formula |
a symbolic description of the model to be fit.
The RHS of the formula is applied to each linear predictor.
The effect of different variables in each linear predictor
can be controlled by specifying constraint matrices;
see the constraints argument below. |
family |
a function of class "vglmff" (see vglmff-class) describing the statistical model to be fitted. |
data |
an optional data frame containing the variables in the model.
By default the variables are taken from
|
weights |
an optional vector or matrix of (prior fixed and known) weights
to be used in the fitting process.
If the VGAM family function handles multiple responses
( Currently the |
subset |
an optional logical vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when
the data contain |
etastart |
optional starting values for the linear predictors.
It is a |
mustart |
optional starting values for the fitted values.
It can be a vector or a matrix;
if a matrix, then it has the same number of rows
as the response.
Usually |
coefstart |
optional starting values for the coefficient vector.
The length and order must match that of |
control |
a list of parameters for controlling the fitting process.
See |
offset |
a vector or |
method |
the method to be used in fitting the model.
The default (and
presently only) method |
model |
a logical value indicating whether the
model frame
should be assigned in the |
x.arg , y.arg |
logical values indicating whether
the LM matrix and response vector/matrix used in the fitting
process should be assigned in the |
contrasts |
an optional list. See the |
constraints |
an optional If the Properties:
each constraint matrix must have As mentioned above, the labelling of each constraint matrix
must match exactly, e.g.,
|
extra |
an optional list with any extra information that might be needed by the VGAM family function. |
form2 |
the second (optional) formula.
If argument |
qr.arg |
logical value indicating whether the slot |
smart |
logical value indicating whether smart prediction
( |
... |
further arguments passed into |
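The interplay between formula and constraints described above can be sketched as follows. This is only an illustration, not part of the original examples: the object names clist, fit_c and fit_p are made up here, and the sketch assumes that cumulative() with its default parallel = FALSE leaves user-supplied constraint matrices untouched. It imposes the proportional odds (parallelism) assumption on the covariate by hand and compares the result with propodds.

```r
library(VGAM)

# Same data preparation as in the Examples section
pneumo <- transform(pneumo, let = log(exposure.time))

# One constraint matrix per term, labelled exactly as the model terms.
# diag(2): separate intercepts for the M = 2 linear predictors.
# rbind(1, 1): a single 'let' coefficient shared by both linear predictors.
clist <- list("(Intercept)" = diag(2),
              "let"         = rbind(1, 1))

# Hand-built parallelism via the 'constraints' argument (assumed behaviour)
fit_c <- vglm(cbind(normal, mild, severe) ~ let,
              cumulative(reverse = TRUE),
              data = pneumo, constraints = clist)

# The family function that imposes the same structure itself
fit_p <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo)

coef(fit_c, matrix = TRUE)
coef(fit_p, matrix = TRUE)
```

If the assumption holds, both coefficient matrices should agree up to convergence error, with the let row identical across the two columns.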
Details
A vector generalized linear model (VGLM) is loosely defined
as a statistical model that is a function of M linear
predictors and can be estimated by Fisher scoring.
The central formula is given by

\eta_j = \beta_j^T x

where x is a vector of explanatory variables
(sometimes just a 1 for an intercept), and \beta_j
is a vector of regression coefficients to be estimated.
Here, j = 1, \ldots, M, where M is finite.
Then one can write \eta = (\eta_1, \ldots, \eta_M)^T
as a vector of linear predictors.
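
The central formula can be checked numerically with a few lines of base R. The sketch below is purely illustrative: the sizes (M = 2, three explanatory variables) and all coefficient values are made up, and B is simply a name chosen here for the matrix whose columns are \beta_1, ..., \beta_M.

```r
# Hypothetical coefficient vectors beta_1 and beta_2, stored as columns of B
B <- cbind(beta1 = c(-1.0, 2.0,  0.5),
           beta2 = c( 0.3, 0.0, -1.0))

# Explanatory variables for one observation; the leading 1 is the intercept
x <- c(1, 0.4, 2)

# eta_j = beta_j^T x for j = 1, ..., M, collected into the vector eta
eta <- drop(crossprod(B, x))
eta   # beta1 = 0.8, beta2 = -1.7
```

Here crossprod(B, x) computes t(B) %*% x, so each element of eta is the inner product of one coefficient vector with x.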
Most users will find vglm similar in flavour to glm.
The function vglm.fit actually does the work.
Value
An object of class "vglm", which has the following slots.
Some of these may not be assigned to save
space, and will be recreated if necessary later.
extra |
the list |
family |
the family function (of class |
iter |
the number of IRLS iterations used. |
predictors |
a |
assign |
a named list which matches the columns and the (LM) model matrix terms. |
call |
the matched call. |
coefficients |
a named vector of coefficients. |
constraints |
a named list of constraint matrices used in the fitting. |
contrasts |
the contrasts used (if any). |
control |
a list of control parameters used in the fitting. |
criterion |
a list of convergence criteria evaluated at the final IRLS iteration. |
df.residual |
the residual degrees of freedom. |
df.total |
the total degrees of freedom. |
dispersion |
the scaling parameter. |
effects |
the effects. |
fitted.values |
the fitted values, as a matrix. This is often the mean but may be quantiles, or the location parameter, e.g., in the Cauchy model. |
misc |
a list to hold miscellaneous parameters. |
model |
the model frame. |
na.action |
a list holding information about missing values. |
offset |
if non-zero, a |
post |
a list where post-analysis results may be put. |
preplot |
used by |
prior.weights |
initially supplied weights
(the |
qr |
the QR decomposition used in the fitting. |
R |
the R matrix in the QR decomposition used in the fitting. |
rank |
numerical rank of the fitted model. |
residuals |
the working residuals at the final IRLS iteration. |
ResSS |
residual sum of squares at the final IRLS iteration with the adjusted dependent vectors and weight matrices. |
smart.prediction |
a list of data-dependent parameters (if any) that are used by smart prediction. |
terms |
the |
weights |
the working weight matrices at the final IRLS iteration. This is in matrix-band form. |
x |
the model matrix (linear model LM, not VGLM). |
xlevels |
the levels of the factors, if any, used in fitting. |
y |
the response, in matrix form. |
This slot information is repeated at vglm-class.
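
As a small sketch of how the slots listed above can be inspected, the code below fits the proportional odds model from the Examples section and reads a few slots with the @ operator. The object name fit is arbitrary, and in everyday use the extractor functions (coef, fitted, and so on) are usually preferable to direct slot access.

```r
library(VGAM)

pneumo <- transform(pneumo, let = log(exposure.time))
fit <- vglm(cbind(normal, mild, severe) ~ let, propodds, data = pneumo)

slotNames(fit)        # names of all slots of the fitted object
fit@iter              # number of IRLS iterations used
fit@df.residual       # residual degrees of freedom
fit@coefficients      # named coefficient vector
coef(fit)             # extractor function returning the same coefficients
head(fit@predictors)  # linear predictors, one column per eta_j
```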
WARNING
See warnings in vglm.control.
Also, see warnings under weights above regarding
sampling weights from complex sampling designs.
Note
This function can fit a wide variety of statistical models.
Some of these are harder to fit than others because of inherent
numerical difficulties associated with some of them.
Successful model fitting benefits from cumulative experience.
Varying the values of arguments in the VGAM family function itself
is a good first step if difficulties arise, especially if initial
values can be inputted.
A second, more general step, is to vary the values of arguments in
vglm.control.
A third step is to make use of arguments such as etastart,
coefstart and mustart.
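
The second and third steps can be sketched on the well-behaved Poisson data of Example 1. This is only an illustration under stated assumptions: the control values are arbitrary, the names fit_a and fit_b are made up, and a model that genuinely struggles to converge would need starting values from some other source than a previous successful fit.

```r
library(VGAM)

d.AD <- data.frame(treatment = gl(3, 3),
                   outcome   = gl(3, 1, 9),
                   counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12))

# Second step: vary vglm.control() arguments, here allowing more
# iterations and a tighter convergence tolerance (arbitrary values)
fit_a <- vglm(counts ~ outcome + treatment, poissonff, data = d.AD,
              control = vglm.control(maxit = 50, epsilon = 1e-9))

# Third step: supply starting values; coefstart must match the length
# and order of the coefficient vector, so coef() of a compatible fit is used
fit_b <- vglm(counts ~ outcome + treatment, poissonff, data = d.AD,
              coefstart = coef(fit_a))

c(iter_a = fit_a@iter, iter_b = fit_b@iter)  # iterations used by each fit
```

Starting at an already converged solution typically needs very few iterations, which makes the iteration counts a quick sanity check that the starting values were actually used.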
Some VGAM family functions end in "ff" to avoid
interference with other functions, e.g., binomialff, poissonff.
This is because VGAM family functions are incompatible with glm
(and also with gam() in the gam package and gam() in the mgcv package).
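
A brief sketch of the distinction: glm takes the stats family poisson, whereas vglm takes the VGAM family function poissonff, and the two are not interchangeable. Because a GLM is a VGLM with a single linear predictor, the two fits below should agree up to convergence error. The names fit_glm and fit_vglm are chosen here for illustration, and the data are those of Example 1.

```r
library(VGAM)

d.AD <- data.frame(treatment = gl(3, 3),
                   outcome   = gl(3, 1, 9),
                   counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12))

# stats::glm with a stats family object
fit_glm  <- glm(counts ~ outcome + treatment, family = poisson, data = d.AD)

# VGAM::vglm with the corresponding VGAM family function
fit_vglm <- vglm(counts ~ outcome + treatment, family = poissonff, data = d.AD)

# Side-by-side comparison of the estimated coefficients
cbind(glm = coef(fit_glm), vglm = coef(fit_vglm))
```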
The smart prediction (smartpred) library is incorporated
within the VGAM library.
The theory behind the scaling parameter is currently being made more rigorous, but it should give the same value as the scale parameter for GLMs.
In Example 5 below, the xij argument is used to illustrate
covariates that are specific to a linear predictor.
Here, lop/rop are the ocular pressures of the left/right eye
(artificial data).
Variables leye and reye might be the presence/absence of
a particular disease in the left/right eye respectively.
See vglm.control and fill1 for more details and examples.
Author(s)
Thomas W. Yee
References
Yee, T. W. (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. New York, USA: Springer.
Yee, T. W. and Hastie, T. J. (2003). Reduced-rank vector generalized linear models. Statistical Modelling, 3, 15–41.
Yee, T. W. and Wild, C. J. (1996). Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.
Yee, T. W. (2014). Reduced-rank vector generalized linear models with two linear predictors. Computational Statistics and Data Analysis, 71, 889–902.
Yee, T. W. (2008). The VGAM Package. R News, 8, 28–39.
See Also
vglm.control, vglm-class, vglmff-class, smartpred,
vglm.fit, fill1, rrvglm, vgam.
Methods functions include
add1.vglm, anova.vglm, AICvlm, coefvlm, confintvglm,
constraints.vlm, drop1.vglm, fittedvlm, hatvaluesvlm,
hdeff.vglm, Influence.vglm, linkfunvlm, lrt.stat.vlm,
score.stat.vlm, wald.stat.vlm, nobs.vlm, npred.vlm,
plotvglm, predictvglm, residualsvglm, step4vglm,
summaryvglm, lrtest_vglm, update,
TypicalVGAMfamilyFunction, etc.
Examples
# Example 1. See help(glm)
(d.AD <- data.frame(treatment = gl(3, 3),
outcome = gl(3, 1, 9),
counts = c(18,17,15,20,10,20,25,13,12)))
vglm.D93 <- vglm(counts ~ outcome + treatment, poissonff,
data = d.AD, trace = TRUE)
summary(vglm.D93)
# Example 2. Multinomial logit model
pneumo <- transform(pneumo, let = log(exposure.time))
vglm(cbind(normal, mild, severe) ~ let, multinomial, pneumo)
# Example 3. Proportional odds model
fit3 <- vglm(cbind(normal, mild, severe) ~ let, propodds, pneumo)
coef(fit3, matrix = TRUE)
constraints(fit3)
model.matrix(fit3, type = "lm") # LM model matrix
model.matrix(fit3) # Larger VGLM (or VLM) matrix
# Example 4. Bivariate logistic model
fit4 <- vglm(cbind(nBnW, nBW, BnW, BW) ~ age, binom2.or, coalminers)
coef(fit4, matrix = TRUE)
depvar(fit4) # The responses are proportions
weights(fit4, type = "prior")
# Example 5. The use of the xij argument (simple case).
# The constraint matrix for 'op' has one column.
nn <- 1000
eyesdat <- round(data.frame(lop = runif(nn),
rop = runif(nn),
op = runif(nn)), digits = 2)
eyesdat <- transform(eyesdat, eta1 = -1 + 2 * lop,
                              eta2 = -1 + 2 * rop)  # right eye uses rop
eyesdat <- transform(eyesdat,
leye = rbinom(nn, 1, prob = logitlink(eta1, inv = TRUE)),
reye = rbinom(nn, 1, prob = logitlink(eta2, inv = TRUE)))
head(eyesdat)
fit5 <- vglm(cbind(leye, reye) ~ op,
binom2.or(exchangeable = TRUE, zero = 3),
data = eyesdat, trace = TRUE,
xij = list(op ~ lop + rop + fill1(lop)),
form2 = ~ op + lop + rop + fill1(lop))
coef(fit5)
coef(fit5, matrix = TRUE)
constraints(fit5)
fit5@control$xij
head(model.matrix(fit5))
# Example 6. The use of the 'constraints' argument.
as.character(~ bs(year,df=3)) # Get the white spaces right
clist <- list("(Intercept)" = diag(3),
"bs(year, df = 3)" = rbind(1, 0, 0))
fit1 <- vglm(r1 ~ bs(year,df=3), gev(zero = NULL),
data = venice, constraints = clist, trace = TRUE)
coef(fit1, matrix = TRUE) # Check