R: Wald, Log-likelihood ratio and Pearson Chi-square statistics...

gof.estimates {mipfp}

R Documentation

Wald, Log-likelihood ratio and Pearson Chi-square statistics for mipfp object

Description

This method computes three statistics for testing whether the seed agrees with the target data: Wilks' log-likelihood ratio statistic, the Wald statistic and the Pearson Chi-square statistic.

The method also returns the associated degrees of freedom.

Usage

## S3 method for class 'mipfp'
gof.estimates(object, seed = NULL, target.data = NULL, 
              target.list = NULL, replace.zeros = 1e-10, ...)

Arguments

`object`	An object of class `mipfp`, for example the result of a call to `Estimate`.
`seed`	The seed used to compute the estimates (optional). If not provided, the method tries to determine the `seed` automatically.
`target.data`	A list containing the data of the target margins (optional). Each component of the list is an array storing a margin. The list order must follow the one defined in `target.list`. The cells of the arrays must be non-negative (they can also be NA if `method = ipfp`). If not provided, the method tries to determine `target.data` automatically.
`target.list`	A list of the target margins provided in `target.data` (optional). Each component of the list is an array whose cells indicate which dimension the corresponding margin relates to. If not provided, the method tries to determine `target.list` automatically.
`replace.zeros`	The value that replaces any zero cells that are found. Defaults to `1e-10`.
`...`	Not used.

Details

The test is formally expressed as:

H_0 : h(\pi) = 0 \quad \text{vs} \quad H_1 : h(\pi) \neq 0

where \pi is the vector of the seed probabilities and h(x) = A^T x - m with A and m being respectively the marginal matrix and the margins vector of the estimation problem.
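
To make the constraint function concrete, here is a minimal sketch in base R for a toy 2 x 2 table. The matrix `A`, the margins `m` and the counts are all hypothetical values chosen for illustration; this is not how the package builds its marginal matrix internally.

```r
# Toy 2 x 2 table, vectorized column-wise: cells (1,1), (2,1), (1,2), (2,2).
seed_counts <- c(10, 20, 30, 40)              # hypothetical seed counts
pi_seed <- seed_counts / sum(seed_counts)     # seed probabilities

# Hypothetical marginal matrix A: one column per constraint.
# Column "row1" sums the cells of row 1, column "col1" sums the cells of column 1.
A <- cbind(row1 = c(1, 0, 1, 0),
           col1 = c(1, 1, 0, 0))

# Hypothetical target margins, expressed as probabilities.
m <- c(row1 = 0.5, col1 = 0.4)

# Constraint function h(pi) = A^T pi - m; H_0 states that h(pi) = 0.
h <- drop(t(A) %*% pi_seed - m)
print(h)
```

Each component of `h` measures how far one seed margin is from its target, so a non-zero entry is evidence against H_0.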

The three statistics are then defined as:

Wilks' log-likelihood ratio

G^2 = 2 \sum_i x_i \ln \frac{\pi_i}{\hat{\pi}_i}

Wald's statistic

W^2 = h(x)^T ( H^T_x D_x H_x)^{-1} h(x)

Pearson Chi-square

\chi^2 = (x - n \hat{\pi})^T D^{-1}_{n\hat{\pi}} (x - n \hat{\pi})

where x is the vectorization of the seed, n = \sum_i x_i, D_v is the diagonal matrix whose diagonal is the vector v, and H denotes the Jacobian of the function h, evaluated at \hat{\pi} (the vector of the estimated probabilities).

The degrees of freedom of these statistics correspond to the number of components in m.
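
As a rough illustration of the G^2 and \chi^2 formulas, the following base R sketch evaluates them directly on hypothetical inputs. The vectors `x` and `pi_hat` are made-up values, not output of the package, and the Wald statistic is left out because it requires the Jacobian of h.

```r
x      <- c(10, 20, 30, 40)        # hypothetical vectorized seed counts
pi_hat <- c(0.2, 0.2, 0.3, 0.3)    # hypothetical estimated probabilities

n       <- sum(x)                  # total count of the seed
pi_seed <- x / n                   # seed probabilities

# Wilks' log-likelihood ratio: G^2 = 2 * sum(x_i * log(pi_i / pi_hat_i))
G2 <- 2 * sum(x * log(pi_seed / pi_hat))

# Pearson Chi-square: with a diagonal weight matrix, the quadratic form
# reduces to sum((observed - expected)^2 / expected)
expected <- n * pi_hat
X2 <- sum((x - expected)^2 / expected)

print(c(G2 = G2, X2 = X2))
```

Both expressions break down when a cell is exactly zero (a logarithm or a division by zero), which is presumably the situation the `replace.zeros` argument is meant to handle.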

Value

A list whose elements are detailed below.

`G2`	The log-likelihood ratio statistic.
`W2`	The Wald statistic.
`X2`	The Pearson chi-squared statistic.
`stats.df`	The degrees of freedom for the `G2`, `W2` and `X2` statistics.
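
The returned list can be turned into p-values by hand. The sketch below assumes that, under H_0, each statistic is approximately Chi-square distributed with `stats.df` degrees of freedom; the list `gof` holds placeholder numbers standing in for a real result, and whether the package's `summary` method computes its p-values in exactly this way is not confirmed here.

```r
# Placeholder values with the same element names as the returned list;
# these numbers are hypothetical, not the result of a real fit.
gof <- list(G2 = 9.15, W2 = 8.90, X2 = 8.33, stats.df = 2)

# Upper-tail Chi-square probability for each of the three statistics.
p_values <- sapply(gof[c("G2", "W2", "X2")], pchisq,
                   df = gof$stats.df, lower.tail = FALSE)
print(p_values)
```

A small p-value suggests that the seed margins differ from the target margins by more than sampling noise would explain.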

Author(s)

Johan Barthelemy

Maintainer: Johan Barthelemy johan@uow.edu.au.

References

Lang, J.B. (2004) Multinomial-Poisson homogeneous models for contingency tables. Annals of Statistics 32(1): 340-383.

Examples

# loading the data
data(spnamur, package = "mipfp")
# subsetting the data frame, keeping only the first 3 variables
spnamur.sub <- subset(spnamur, select = Household.type:Prof.status)
# true table
true.table <- table(spnamur.sub)
# extracting the margins
tgt.v1        <- apply(true.table, 1, sum)
tgt.v1.v2     <- apply(true.table, c(1,2), sum)
tgt.v2.v3     <- apply(true.table, c(2,3), sum)
tgt.list.dims <- list(1, c(1,2), c(2,3))
tgt.data      <- list(tgt.v1, tgt.v1.v2, tgt.v2.v3)
# creating the seed, a 10 pct sample of spnamur
seed.df <- spnamur.sub[sample(nrow(spnamur), round(0.10*nrow(spnamur))), ]
seed.table <- table(seed.df)
# applying one fitting method (ipfp)
r.ipfp <- Estimate(seed=seed.table, target.list=tgt.list.dims, 
                   target.data = tgt.data)
# printing the G2, X2 and W2 statistics
print(gof.estimates(r.ipfp))
# alternative way (pretty printing, with p-values)
print(summary(r.ipfp)$stats.gof)

[Package mipfp version 3.2.1 Index]