ipf {cat} | R Documentation |
Iterative Proportional Fitting
Description
ML estimation for hierarchical loglinear models via conventional iterative proportional fitting (IPF).
Usage
ipf(table, margins, start, eps=0.0001, maxits=50, showits=TRUE)
Arguments
table |
contingency table (array) to be fit by a log-linear model. All elements must be non-negative. |
margins |
vector describing the marginal totals to be fitted. A margin is described by the factors not summed over, and margins are separated by zeros. Thus c(1,2,0,2,3,0,1,3) would indicate fitting the (1,2), (2,3), and (1,3) margins in a three-way table, i.e., the model of no three-way association. |
start |
starting value for IPF algorithm. The default is a uniform table.
If structural zeros appear in |
eps |
convergence criterion. This is the largest proportional change in an expected cell count from one iteration to the next. Any expected cell count that drops below 1E-07 times the average cell probability (1/number of non-structural zero cells) is set to zero during the iterations. |
maxits |
maximum number of iterations performed. The algorithm will stop if the parameter still has not converged after this many iterations. |
showits |
if |
Value
array like table, but containing fitted values (expected frequencies) under the loglinear model.
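To make the margins encoding and the roles of eps and maxits concrete, here is a minimal base-R sketch of conventional IPF. It is not the package's implementation: the names split_margins and ipf_sketch are hypothetical, the start and showits arguments are left out, and the rule that zeroes very small expected counts is omitted.

```r
# Hypothetical helper: split a zero-separated margins vector into a list,
# e.g. c(1,2,0,2,3) becomes list(c(1,2), c(2,3)).
split_margins <- function(margins) {
  group <- cumsum(margins == 0)   # each zero starts a new group
  keep <- margins != 0            # drop the separators themselves
  unname(split(margins[keep], group[keep]))
}

# Illustrative IPF loop (an assumption-laden sketch, not cat::ipf itself).
ipf_sketch <- function(tab, margins, eps = 1e-4, maxits = 50) {
  # Uniform starting table, as in the documented default for start.
  fit <- array(1, dim = dim(tab), dimnames = dimnames(tab))
  for (it in seq_len(maxits)) {
    old <- fit
    for (m in split_margins(margins)) {
      observed <- apply(tab, m, sum)   # observed marginal totals
      fitted <- apply(fit, m, sum)     # current fitted marginal totals
      ratio <- ifelse(fitted > 0, observed / fitted, 0)
      fit <- sweep(fit, m, ratio, "*") # rescale cells to match this margin
    }
    # Largest proportional change in any expected cell count.
    change <- max(abs(fit - old) / pmax(old, .Machine$double.eps))
    if (change < eps) break
  }
  fit
}

# Model of no three-way association for a three-way table.
fit <- ipf_sketch(HairEyeColor, c(1, 2, 0, 1, 3, 0, 2, 3))
```

Each inner step rescales the current fit so that one of the requested margins matches the observed margin exactly; cycling through the margins until the fitted values stop changing yields the fitted table.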
Details
This function is usually used to compute ML estimates for a loglinear model. For ML estimates, the array table should contain the observed frequencies from a cross-classified contingency table. Because this is the "cell-means" version of IPF, the resulting fitted values will add up to sum(table). To obtain estimated cell probabilities, rescale the fitted values to sum to one.
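The rescaling step can be sketched as follows. To keep the snippet runnable without the cat package, it uses loglin from base R's stats package as a stand-in for ipf; that substitution is an assumption made for illustration, and the variable names are arbitrary.

```r
# Stand-in for ipf(): stats::loglin() fits the same kind of model by IPF.
tab <- HairEyeColor
fit <- loglin(tab, margin = list(c(1, 2), c(1, 3), c(2, 3)),
              fit = TRUE, print = FALSE)$fit

total_fit <- sum(fit)     # fitted values add up to sum(tab)
probs <- fit / sum(fit)   # estimated cell probabilities, summing to one
```

With ipf itself, the same division would be applied to the returned array.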
This function may also be used to compute the posterior mode of the multinomial cell probabilities under a Dirichlet prior. For a posterior mode, set the elements of table to (observed frequencies + Dirichlet hyperparameters - 1). Then, after running IPF, rescale the fitted values to sum to one.
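A sketch of the posterior-mode recipe, under stated assumptions: the hyperparameter value 1.5 is purely hypothetical, and loglin from base R's stats package again stands in for ipf so the snippet runs without the cat package.

```r
obs <- HairEyeColor
alpha <- array(1.5, dim = dim(obs))   # hypothetical Dirichlet hyperparameters

# Input table for IPF: observed frequencies + hyperparameters - 1.
pseudo <- obs + alpha - 1
stopifnot(all(pseudo >= 0))           # the table must be non-negative

# Stand-in for ipf(): fit the model of no three-way association.
fit <- loglin(pseudo, margin = list(c(1, 2), c(1, 3), c(2, 3)),
              fit = TRUE, print = FALSE)$fit

post_mode <- fit / sum(fit)           # rescale fitted values to sum to one
```

Hyperparameters below 1 can make some entries of the input table negative, which the table argument does not allow, so the check before fitting is worth keeping.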
Note
This function is essentially the same as the old S function loglin
, but
results are computed to double precision. See help(loglin)
for more
details.
References
Agresti, A. (1990) Categorical Data Analysis. J. Wiley & Sons, New York.
Schafer (1996) Analysis of Incomplete Multivariate Data. Chapman & Hall, Chapter 8.
See Also
Examples
library(cat)                                        # provides ipf()
data(HairEyeColor)                                  # load data
m <- c(1,2,0,1,3,0,2,3)                             # no three-way interaction
fit1 <- ipf(HairEyeColor, margins=m, showits=TRUE)  # fit model
X2 <- sum((HairEyeColor-fit1)^2/fit1)               # Pearson chi-square statistic
df <- prod(dim(HairEyeColor)-1)                     # degrees of freedom for this example
1 - pchisq(X2, df)                                  # p-value for fit1