R: Estimating a contingency table using model-based approaches

ObtainModelEstimates {mipfp}

R Documentation

Estimating a contingency table using model-based approaches

Description

This function provides several estimation methods that are alternatives to the iterative proportional fitting procedure (IPFP) for estimating a multiway table subject to known constraints/totals: maximum likelihood (ML), minimum chi-squared (CHI2) and weighted least squares (WLSQ). Note that the resulting estimators are probabilities.

The covariance matrix of the estimated proportions (as defined by Little and Wu, 1991) is also provided. For the ML method, the covariance matrix defined by Lang (2004) is also returned.

Usage

ObtainModelEstimates(seed, target.list, target.data, method="ml", 
                     tol.margins = 1e-10, replace.zeros = 1e-10, ...)

Arguments

`seed`	The initial multi-dimensional array to be updated. Each cell must be non-negative.
`target.list`	A list of the target margins provided in `target.data`. Each component of the list is an array whose cells indicate which dimensions the corresponding margin relates to.
`target.data`	A list containing the data of the target margins. Each component of the list is an array storing a margin. The list order must follow the one defined in `target.list`. Note that the cells of the arrays must be non-negative.
`method`	Determines the model to be used for estimating the contingency table. By default the method is `ml` (maximum likelihood); the other available options are `chi2` (minimum chi-squared) and `lsq` (least squares).
`tol.margins`	Tolerance for the consistency of the margins. Default is `1e-10`.
`replace.zeros`	Constant that is added to the zero cells found in the seed, as the procedures require strictly positive cells. Default value is `1e-10`.
`...`	Additional parameters that can be passed to control the optimisation process (see solnp from the package Rsolnp).
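
As a minimal sketch of how these arguments fit together, the base R snippet below builds a seed and two target margins, replaces the zero cell in the seed, and checks whether the margin totals agree within a tolerance. The zero replacement and the consistency check are assumptions about what the arguments control, written for illustration; they are not the package's actual code.

```r
# Hypothetical inputs: a 2 x 2 x 2 seed containing one zero cell
seed <- array(c(80, 60, 20, 0, 40, 35, 35, 30), dim = c(2, 2, 2))

# First margin relates to dimensions 1 and 2, second margin to dimension 3
target.list <- list(c(1, 2), 3)
target.data <- list(array(c(2000, 1000, 1500, 1800), dim = c(2, 2)),
                    array(c(4000, 2300), dim = 2))

# Replace zero cells by a small positive constant
# (assumed effect of replace.zeros)
replace.zeros <- 1e-10
seed[seed == 0] <- replace.zeros

# Margins are treated as consistent when their totals agree
# (assumed use of tol.margins)
tol.margins <- 1e-10
totals <- sapply(target.data, sum)
margins.consistent <- all(abs(totals - totals[1]) <= tol.margins)
```

Here both margins sum to 6300, so `margins.consistent` is `TRUE` and no normalisation would be needed.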

Value

A list containing the final estimated table as well as the covariance matrix of the estimated proportions and other convergence information.

`x.hat`	Array of the estimated table frequencies.
`p.hat`	Array of the estimated table probabilities.
`error.margins`	For each list element of `target.data`, `error.margins` shows the maximum absolute deviation between the element and the corresponding estimated margin. Note that the deviations should be close to zero; otherwise the target margins are not met.
`solnp.res`	The estimation process uses the `solnp` optimisation function from the R package Rsolnp and `solnp.res` is the corresponding object returned by the solver.
`conv`	A boolean indicating whether the algorithm converged to a solution.
`method`	The selected method for estimation.
`call`	The matched call.
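
The definition of `error.margins` can be reproduced by hand. The base R sketch below collapses a table over the dimensions named in `target.list` and takes the maximum absolute deviation from each target margin. The array `x.hat` and the targets are made up for illustration and are not output of the function.

```r
# Hypothetical estimated 2 x 2 x 2 table (made up for illustration)
x.hat <- array(c(10, 20, 30, 40, 15, 25, 35, 45), dim = c(2, 2, 2))

target.list <- list(c(1, 2), 3)
# Targets chosen so that the first margin is met exactly
# and the second margin is off by 1 in each cell
target.data <- list(array(c(25, 45, 65, 85), dim = c(2, 2)),
                    array(c(101, 119), dim = 2))

# For each margin, sum x.hat over the remaining dimensions and
# record the largest absolute deviation from the target
error.margins <- sapply(seq_along(target.list), function(i) {
  fitted.margin <- apply(x.hat, target.list[[i]], sum)
  max(abs(fitted.margin - target.data[[i]]))
})
```

With these inputs the first deviation is 0 and the second is 1, which signals that the second target margin is not met.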

Note

If the margins given in target.data are not consistent (i.e. the sums of their cells are not equal), the input data are normalised by considering probabilities instead of frequencies:

the cells of the seed are divided by sum(seed);
the cells of each margin i of the list target.data are divided by sum(target.data[[i]]).
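
The two normalisation steps above can be sketched in base R as follows. The seed and margins are hypothetical values chosen so that the margin totals disagree (600 versus 500).

```r
# Hypothetical seed and margins with inconsistent totals (600 vs 500)
seed <- array(c(80, 60, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))
target.data <- list(array(c(200, 100, 150, 150), dim = c(2, 2)),
                    array(c(300, 200), dim = 2))

# Convert frequencies to probabilities, as described in the note
seed.prob <- seed / sum(seed)
target.prob <- lapply(target.data, function(m) m / sum(m))

# After normalisation every margin sums to 1, so the totals agree
prob.totals <- sapply(target.prob, sum)
```

Working on the probability scale removes the disagreement in totals, but the estimated frequencies then no longer have a single well-defined grand total to be scaled back to.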

Author(s)

Thomas Suesse

Maintainer: Johan Barthelemy <johan@uow.edu.au>.

References

Lang, J.B. (2004) Multinomial-Poisson homogeneous models for contingency tables. Annals of Statistics 32(1): 340-383.

Little, R.J., Wu, M.M. (1991) Models for contingency tables with known margins when target and sampled populations differ. Journal of the American Statistical Association 86(413): 87-95.

Examples

# load the package providing ObtainModelEstimates and Vector2Array
library(mipfp)

# set up an initial 3-way table of dimension (2 x 2 x 2)
seed <- Vector2Array(c(80, 60, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))

# building target margins
margins12 <- c(2000, 1000, 1500, 1800)
margins12.array <- Vector2Array(margins12, dim = c(2, 2))
margins3 <- c(4000, 2300)
margins3.array <- Vector2Array(margins3, dim = 2)
target.list <- list(c(1, 2), 3)
target.data <- list(margins12.array, margins3.array)

# estimating the new contingency table using the ml method
results.ml <- ObtainModelEstimates(seed, target.list, target.data, 
                                   compute.cov = TRUE)
print(results.ml)

# estimating the new contingency table using the chi2 method
results.chi2 <- ObtainModelEstimates(seed, target.list, target.data, 
                                     method = "chi2", compute.cov = TRUE)
print(results.chi2)

# estimating the new contingency table using the lsq method
results.lsq <- ObtainModelEstimates(seed, target.list, target.data,
                                    method = "lsq", compute.cov = TRUE)
print(results.lsq)

[Package mipfp version 3.2.1 Index]