ipf {cat}R Documentation

Iterative Proportional Fitting

Description

ML estimation for hierarchical loglinear models via conventional iterative proportional fitting (IPF).

Usage

ipf(table, margins, start, eps=0.0001, maxits=50, showits=TRUE)

Arguments

table

contingency table (array) to be fit by a log-linear model. All elements must be non-negative.

margins

vector describing the marginal totals to be fitted. A margin is described by the factors not summed over, and margins are separated by zeros. Thus c(1,2,0,2,3,0,1,3) would indicate fitting the (1,2), (2,3), and (1,3) margins in a three-way table, i.e., the model of no three-way association.

start

starting value for IPF algorithm. The default is a uniform table. If structural zeros appear in table, start should contain zeros in those cells and ones elsewhere.

eps

convergence criterion. This is the largest proportional change in an expected cell count from one iteration to the next. Any expected cell count that drops below 1E-07 times the average cell probability (1/number of non-structural zero cells) is set to zero during the iterations.

maxits

maximum number of iterations performed. The algorithm will stop if the parameter still has not converged after this many iterations.

showits

if TRUE, reports the iterations of IPF so the user can monitor the progress of the algorithm.

Value

array like table, but containing fitted values (expected frequencies) under the loglinear model.

DETAILS

This function is usually used to compute ML estimates for a loglinear model. For ML estimates, the array table should contain the observed frequencies from a cross-classified contingency table. Because this is the "cell-means" version of IPF, the resulting fitted values will add up to equals sum(table). To obtain estimated cell probabilities, rescale the fitted values to sum to one.

This function may also be used to compute the posterior mode of the multinomial cell probabilities under a Dirichlet prior. For a posterior mode, set the elements of table to (observed frequencies + Dirichlet hyperparameters - 1). Then, after running IPF, rescale the fitted values to sum to one.

Note

This function is essentially the same as the old S function loglin, but results are computed to double precision. See help(loglin) for more details.

References

Agresti, A. (1990) Categorical Data Analysis. J. Wiley & Sons, New York.

Schafer (1996) Analysis of Incomplete Multivariate Data. Chapman & Hall, Chapter 8.

See Also

ecm.cat, bipf

Examples

data(HairEyeColor)                     # load data
m=c(1,2,0,1,3,0,2,3)                   # no three-way interaction
fit1 <- ipf(HairEyeColor,margins=m,
            showits=TRUE)              # fit model
X2 <- sum((HairEyeColor-fit1)^2/fit1)  # Pearson chi square statistic
df <- prod(dim(HairEyeColor)-1)        # Degrees of freedom for this example
1 - pchisq(X2,df)                      # p-value for fit1

[Package cat version 0.0-9 Index]