fwdglm {forward}    R Documentation

Forward Search in Generalized Linear Models

Description

This function applies the forward search approach to robust analysis in generalized linear models.

Usage

fwdglm(formula, family, data, weights, na.action, contrasts = NULL, bsb = NULL, 
       balanced = TRUE, maxit = 50, epsilon = 1e-06, nsamp = 100, trace = TRUE)

Arguments

formula

a symbolic description of the model to be fit. The details of the model are the same as for glm.

family

a description of the error distribution and link function to be used in the model. See family for details.

data

an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called.

weights

an optional vector of weights to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

bsb

an optional vector specifying a starting subset of observations to be used in the forward search. By default the "best" starting subset is chosen by the function lmsglm, with the number of candidate subsets controlled by nsamp.

balanced

logical. For a binary response, if TRUE the proportion of successes in the subsets used during the forward search is kept approximately equal to the proportion in the full dataset.

maxit

integer giving the maximal number of IWLS iterations. See glm.control for details.

epsilon

positive convergence tolerance epsilon. See glm.control for details.

nsamp

the initial subset for the forward search in generalized linear models is found by the function lmsglm. This argument controls how many subsets are used in the robust fitting procedure. The choices are: the number of samples (100 by default) or "all". Note that the algorithm tries to find nsamp good subsets, examining at most 2*nsamp subsets.

trace

logical, if TRUE a message is printed for every ten iterations completed during the forward search.
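
The maxit and epsilon arguments share their names with the arguments of glm.control. As a minimal sketch, assuming fwdglm hands them to the underlying IWLS fit unchanged (this page only refers to glm.control for details), the defaults shown in Usage correspond to the plain glm call below. The warpbreaks data set from the datasets package is used purely for illustration and is not part of the forward package.

```r
# Control settings matching the fwdglm defaults maxit = 50, epsilon = 1e-06.
ctrl <- glm.control(maxit = 50, epsilon = 1e-06)

# Ordinary (non-robust) Poisson fit on an illustrative data set.
fit <- glm(breaks ~ wool + tension, family = poisson(log),
           data = warpbreaks, control = ctrl)

fit$converged   # TRUE if IWLS met the tolerance within maxit iterations
fit$iter        # number of IWLS iterations actually used
```

A smaller epsilon or a larger maxit may be worth trying if fits on small subsets fail to converge early in the search.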

Value

The function returns an object of class "fwdglm" with the following components:

call

the matched call.

Residuals

a (n x (n-p+1)) matrix of residuals.

Unit

a matrix of units added (to a maximum of 5 units) at each step.

included

a list with each element containing a vector of units included at each step of the forward search.

Coefficients

a ((n-p+1) x p) matrix of coefficients.

tStatistics

a ((n-p+1) x p) matrix of t statistics for the coefficients, i.e. coef.est/SE(coef.est).

Leverage

a (n x (n-p+1)) matrix of leverage values.

MaxRes

a ((n-p) x 2) matrix of max deviance residuals in the best subsets and m-th deviance residuals.

MinDelRes

a ((n-p-1) x 2) matrix of minimum deviance residuals out of best subsets and (m+1)-th deviance residuals.

ScoreTest

a ((n-p) x 1) matrix of score test statistics for a goodness of link test.

Likelihood

a ((n-p) x 4) matrix with columns containing: deviance, residual deviance, pseudo R^2 (computed as 1-deviance/null.deviance), dispersion parameter (computed as sum(pearson.residuals^2)/(m - p)).

CookDist

a ((n-p) x 1) matrix of forward Cook's distances.

ModCookDist

a ((n-p) x 5) matrix of forward modified Cook's distances for the units (to a maximum of 5 units) included at each step.

Weights

a (n x (n-p)) matrix of weights used at each step of the forward search.

inibsb

a vector giving the best starting subset chosen by lmsglm.

binary.response

logical, equal to TRUE if binary response.
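
The formulas quoted for the Likelihood and tStatistics components can be reproduced for a single subset with base R. The sketch below is an illustration only: the subset idx and the warpbreaks data are arbitrary choices, and the standard errors are taken from vcov, which may differ from what fwdglm uses internally.

```r
# Illustrative subset of m units (every second row of warpbreaks).
idx <- seq(1, nrow(warpbreaks), by = 2)
fit <- glm(breaks ~ wool + tension, family = poisson(log),
           data = warpbreaks, subset = idx)

m <- length(idx)          # subset size at this step
p <- length(coef(fit))    # number of estimated coefficients

# Pseudo R^2 as defined for the Likelihood component.
pseudo_r2 <- 1 - fit$deviance / fit$null.deviance

# Dispersion estimate from the Pearson residuals.
dispersion <- sum(residuals(fit, type = "pearson")^2) / (m - p)

# t statistics: coefficient estimate divided by its standard error.
se <- sqrt(diag(vcov(fit)))
t_stat <- coef(fit) / se
```

In fwdglm these quantities are stored once per step of the search, which is why the corresponding components are matrices with one row per subset size.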

Author(s)

Originally written for S-Plus by: Kjell Konis kkonis@insightful.com and Marco Riani mriani@unipr.it
Ported to R by Luca Scrucca luca@stat.unipg.it

References

Atkinson, A.C. and Riani, M. (2000), Robust Diagnostic Regression Analysis, First Edition. New York: Springer, Chapter 6.

See Also

summary.fwdglm, plot.fwdglm, fwdlm, fwdsco.

Examples

 
data(cellular)
cellular$TNF <- as.factor(cellular$TNF)
cellular$IFN <- as.factor(cellular$IFN)
mod <- fwdglm(y ~ TNF + IFN, data=cellular, family=poisson(log), nsamp=200)
summary(mod)
## Not run: plot(mod)
plot(mod, 1)
plot(mod, 5)
plot(mod, 6, ylim=c(-3, 20))
plot(mod, 7)
plot(mod, 8)
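
For readers who want to see the general idea of a forward search without installing the package, the following is a simplified sketch in base R. It is not the fwdglm implementation: the starting subset is hand-picked here (fwdglm would use bsb or lmsglm), there is no balancing for binary responses, and only the coefficients are stored at each step.

```r
# Simplified forward search using base R only; NOT the fwdglm implementation.
dat <- warpbreaks
n <- nrow(dat)
form <- breaks ~ wool + tension

# Hypothetical starting subset covering every wool/tension combination.
subset_idx <- c(1, 10, 19, 28, 37, 46, 5, 32)
sizes <- length(subset_idx):n
coefs <- matrix(NA_real_, nrow = length(sizes), ncol = 4)

for (k in seq_along(sizes)) {
  m <- sizes[k]
  # Fit the model to the current subset of m units.
  fit <- glm(form, family = poisson(log), data = dat[subset_idx, ])
  coefs[k, ] <- coef(fit)
  if (m < n) {
    # Squared deviance residuals of ALL units under the current fit.
    mu <- predict(fit, newdata = dat, type = "response")
    d2 <- poisson()$dev.resids(dat$breaks, mu, rep(1, n))
    # Next subset: the m + 1 units with the smallest squared residuals.
    subset_idx <- order(d2)[seq_len(m + 1)]
  }
}
```

Plotting the columns of coefs against sizes gives a rough analogue of the forward coefficient plots; abrupt changes near the end of the search would suggest that the last units to enter are influential.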

[Package forward version 1.0.6 Index]