cond.glm {cond} | R Documentation |
Approximate Conditional Inference for Logistic and Loglinear Models
Description
Performs approximate conditional inference on a scalar parameter of
interest in logistic and loglinear models. The output is stored in
an object of class cond.
Usage
## S3 method for class 'glm'
cond(object, offset, formula = NULL, family = NULL,
     data = sys.frame(sys.parent()), pts = 20,
     n = max(100, 2*pts), tms = 0.6, from = NULL, to = NULL,
     control = glm.control(...), trace = FALSE, ...)
Arguments
object |
a |
offset |
the covariate occurring in the model formula whose coefficient
represents the parameter of interest. May be numerical or a
two-level factor. In case of a two-level factor, it must be
coded by contrasts and not appear as two dummy variables in the
model. Can also be a call to a mathematical function (such as
|
formula |
a formula expression (only if no |
family |
a |
data |
an optional data frame in which to interpret the variables
occurring in the formula (only if no |
pts |
number of output points (minimum 10) that are calculated exactly. The default is 20. |
n |
approximate number of output points (minimum 50) produced by the
spline interpolation. The default is the maximum of 100 and
twice pts. |
tms |
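defines the range MLE +/- tms * s.e. in which the cubic spline interpolation is replaced by a higher-order polynomial interpolation. The default is 0.6. |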
from |
starting value of the sequence that contains the values of the parameter of interest for which output points are calculated exactly. The default is MLE - 3.5 * s.e. |
to |
ending value of the sequence that contains the values of the parameter of interest for which output points are calculated exactly. The default is MLE + 3.5 * s.e. |
control |
a list of iteration and algorithmic constants that controls the
GLM fit. See \ |
trace |
if |
... |
additional arguments, such as |
Details
This function is a method for the generic function cond for class
glm. It can be invoked by calling cond for an object of the
appropriate class, or directly by calling cond.glm regardless of the
class of the object. cond.glm must also be used if the glm object is
not provided through the object argument but is specified by formula
and family.
The function cond.glm implements several small-sample asymptotic
methods for approximate conditional inference in logistic and
loglinear models. Approximations for both the conditional log
likelihood function and conditional tail area probabilities are
available (see cond.object for details). Attention is restricted to
a scalar parameter of interest. The associated covariate can be
either numerical or a two-level factor.
Approximate conditional inference is performed by either updating a
fitted generalized linear model or defining the model formula and
family. All approximations are calculated exactly for pts equally
spaced points between the values of the from and to arguments. A
cubic spline interpolation is used to extend them over the whole
interval of interest, except for the range of values defined by
MLE +/- tms * s.e., where the spline interpolation is replaced by a
higher-order polynomial interpolation. This is done to avoid
numerical instabilities, which are likely to occur for values of the
parameter of interest close to the MLE. Results are stored in an
object of class cond. Method functions like print, summary and plot
can be used to examine the output or represent it graphically.
Components can be extracted using coef, formula and family.
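The grid-and-interpolate scheme described above can be illustrated with base R alone. The following is only a sketch under stated assumptions, not the code used by cond.glm: the data are simulated, the variable names (x, z, y, psi, prof, interp) are made up for the illustration, and the quantity evaluated on the grid is the ordinary profile log likelihood rather than any of the conditional approximations. It shows how pts exact evaluations between MLE - 3.5 * s.e. and MLE + 3.5 * s.e. can be obtained by refitting the model with the parameter of interest held fixed, and how a cubic spline extends them to n points.

```r
## Sketch only: simulated data and hypothetical variable names.
set.seed(1)
x <- rnorm(40)                       # covariate of interest
z <- rnorm(40)                       # nuisance covariate
y <- rbinom(40, size = 1, prob = plogis(0.3 + 0.8 * x - 0.5 * z))

## Full fit: gives the MLE and its standard error for the coefficient of x.
fit <- glm(y ~ z + x, family = binomial)
mle <- coef(fit)[["x"]]
se  <- sqrt(diag(vcov(fit)))[["x"]]

## Grid of 'pts' equally spaced values over MLE +/- 3.5 * s.e.
pts <- 20
n   <- max(100, 2 * pts)
psi <- seq(mle - 3.5 * se, mle + 3.5 * se, length.out = pts)

## Refit with the coefficient of x fixed at each grid value (via an offset)
## and record the profile log likelihood relative to its maximum.
prof <- sapply(psi, function(p) {
  refit <- glm(y ~ z + offset(p * x), family = binomial)
  as.numeric(logLik(refit))
})
prof <- prof - as.numeric(logLik(fit))

## Cubic spline interpolation from 'pts' exact points to 'n' output points.
interp <- spline(psi, prof, n = n)
```

Calling plot(interp, type = "l") draws the interpolated curve. The special treatment of the region MLE +/- tms * s.e. is deliberately left out of this sketch.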
Main references for the methods considered are the papers by Pierce and Peters (1992) and Davison (1988). More details on the implementation are given in Brazzale (1999, 2000).
Value
The returned value is an object of class cond; see cond.object for
details.
Note
On rare occasions, cond.glm fails because of non-convergence of the
function glm, which is used to refit the model for a fixed value of
the parameter of interest. This happens, for instance, if this value
is too extreme. The arguments from and to may then be used to limit
the default range of MLE +/- 3.5 * s.e. A further possibility is to
fine-tune the constants (number of iterations, convergence
threshold) that control the GLM fit through the control argument.
cond.glm may also fail if the estimate of the parameter of interest
is large (typically > 400) in absolute value. This may be avoided by
reparametrizing the model.
The output of cond.glm may be unreliable if part of the data have a
degenerate distribution. For example, take the fungal infections
treatment data contained in the fungal data frame. Of the five 2 x 2
contingency tables, two (the first and the third) are degenerate. As
they make no contribution to the exact conditional likelihood, they
should be omitted from the approximate conditional fit.
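One way to spot such tables before fitting is sketched below. It assumes that "degenerate" means a table in which all responses are successes or all are failures, and it uses a small made-up data frame (tabs) laid out with one row per centre and group. The values are hypothetical and are not those of the fungal data.

```r
## Hypothetical data: three 2 x 2 tables, one row per centre and group.
tabs <- data.frame(
  center  = factor(rep(1:3, each = 2)),
  group   = factor(rep(c("P", "T"), times = 3)),
  success = c(0, 0, 3, 5, 4, 6),
  failure = c(5, 6, 2, 1, 0, 0)
)

## Total successes and failures within each table.
tot.s <- tapply(tabs$success, tabs$center, sum)
tot.f <- tapply(tabs$failure, tabs$center, sum)

## A table with no successes or no failures carries no information on the
## group effect once its margins are conditioned on.
degenerate <- names(tot.s)[tot.s == 0 | tot.f == 0]

## Rows to keep for the approximate conditional fit.
keep <- !(tabs$center %in% degenerate)
tabs[keep, ]
```

The logical vector keep can then be passed to the subset argument of glm, in the same way that a negative row index is used in the Examples.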
References
Brazzale, A. R. (1999) Approximate conditional inference for logistic and loglinear models. J. Comput. Graph. Statist., 8, 653–661.
Brazzale, A. R. (2000) Practical Small-Sample Parametric Inference. Ph.D. Thesis N. 2230, Department of Mathematics, Swiss Federal Institute of Technology Lausanne.
Davison, A. C. (1988) Approximate conditional inference in generalized linear models. J. R. Statist. Soc. B, 50, 445–461.
Pierce, D. A. and Peters, D. (1992) Practical use of higher order asymptotics for multiparameter exponential families (with Discussion). J. R. Statist. Soc. B, 54, 701–737.
See Also
cond.object, summary.cond, plot.cond, glm
Examples
library(cond)  # provides cond() and the example data sets

## Crying Babies Data
data(babies)
babies.glm <- glm(formula = cbind(r1, r2) ~ day + lull - 1,
                  family = binomial, data = babies)
babies.cond <- cond(object = babies.glm, offset = lullyes)
babies.cond
##
## If one wishes to avoid the generalized linear model fit:
babies.cond <- cond.glm(formula = cbind(r1, r2) ~ day + lull - 1,
                        family = binomial, data = babies, offset = lullyes)
babies.cond
## Urine Data
## (function call as offset variable)
data(urine)
urine.glm <- glm(r ~ gravity + ph + osmo + conduct + urea + log(calc),
                 family = binomial, data = urine)
labels(coef(urine.glm))
urine.cond <- cond(urine.glm, log(calc))
##
## (large estimate of regression coefficient)
urine.glm <- glm(r ~ gravity + ph + osmo + conduct + urea + calc,
                 family = binomial, data = urine)
coef(urine.glm)
## Rescaling gravity reduces the size of its coefficient
urine.glm <- glm(r ~ I(gravity * 100) + ph + osmo + conduct + urea + calc,
                 family = binomial, data = urine)
coef(urine.glm)
urine.cond <- cond(urine.glm, I(gravity * 100))
## Fungal Infections Treatment Data
## (numerical instabilities around the MLE)
## (full data analysis)
data(fungal)
fungal.glm <- glm(cbind(success, failure) ~ center + group - 1,
                  family = binomial, data = fungal,
                  control = glm.control(maxit = 50, epsilon = 1e-005))
fungal.cond <- cond(fungal.glm, groupT)
plot(fungal.cond, which = 2)
## (partial data analysis: Tables 1 and 3 are omitted)
fungal.glm <- glm(cbind(success, failure) ~ center + group - 1,
                  family = binomial, data = fungal, subset = -c(1,2,5,6),
                  control = glm.control(maxit = 50, epsilon = 1e-005))
fungal.cond <- cond(fungal.glm, groupT)
plot(fungal.cond, which = 2)