R: Pairwise maximum likelihood fit statistics

lavTablesFitCp {lavaan}

R Documentation

Pairwise maximum likelihood fit statistics

Description

Three measures of fit for the pairwise maximum likelihood estimation method that are based on likelihood ratios (LR) are defined: C_F, C_M, and C_P. Subscript F signifies a comparison of model-implied proportions of full response patterns with observed sample proportions, subscript M signifies a comparison of model-implied proportions of full response patterns with the proportions implied by the assumption of multivariate normality, and subscript P signifies a comparison of model-implied proportions of pairs of item responses with the observed proportions of pairs of item responses.

Usage

lavTablesFitCf(object)
lavTablesFitCp(object, alpha = 0.05)
lavTablesFitCm(object)

Arguments

`object`	An object of class `lavaan`.
`alpha`	The nominal significance level for the test of global fit.

Details

`C_F`

The C_F statistic compares the model-implied proportions (\pi_r) with the observed proportions (p_r) of the full multivariate response patterns by means of a likelihood ratio:

C_F = 2N\sum_{r}p_{r}\ln[p_{r}/\hat{\pi}_{r}],

which asymptotically has a chi-square distribution with

df_F = m^k - n - 1,

where k denotes the number of items with discrete response scales, m denotes the number of response options, and n denotes the number of parameters to be estimated. Notice that C_F results may be biased because of large numbers of empty cells in the multivariate contingency table.
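As a rough illustration of the C_F formula and its degrees of freedom, the following base R sketch computes both from vectors of proportions. The helper names `cf_stat` and `cf_df` are hypothetical and not part of lavaan, and the proportions are made-up numbers; the sketch assumes that empty observed cells contribute zero to the sum.

```r
# Hypothetical helpers for illustration only; they are not lavaan functions.

# C_F = 2N * sum_r p_r * ln(p_r / pi_hat_r)
cf_stat <- function(p_obs, pi_hat, N) {
  stopifnot(length(p_obs) == length(pi_hat), all(pi_hat > 0))
  keep <- p_obs > 0  # assumption: empty observed cells contribute 0
  2 * N * sum(p_obs[keep] * log(p_obs[keep] / pi_hat[keep]))
}

# df_F = m^k - n - 1
cf_df <- function(m, k, n) m^k - n - 1

# Made-up proportions for k = 2 items with m = 3 options (9 response patterns)
p_obs  <- c(0.20, 0.10, 0.05, 0.10, 0.15, 0.10, 0.05, 0.10, 0.15)
pi_hat <- c(0.18, 0.11, 0.06, 0.11, 0.14, 0.10, 0.06, 0.10, 0.14)

cf_value <- cf_stat(p_obs, pi_hat, N = 200)

# Assumed parameter count: 4 thresholds + 1 correlation = 5
cf_dof <- cf_df(m = 3, k = 2, n = 5)

# Upper-tail chi-square p-value
cf_p <- pchisq(cf_value, df = cf_dof, lower.tail = FALSE)
```

Dropping empty cells from the sum keeps the statistic finite, but it does not remove the sparse-table bias mentioned above.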

`C_M`

The C_M statistic is based on the C_F statistic, and compares the proportions implied by the model of interest (Model 1) with proportions implied by the assumption of an underlying multivariate normal distribution (Model 0):

C_M = C_{F1} - C_{F0},

where C_{F0} is C_F for Model 0 and C_{F1} is C_F for Model 1. Statistic C_M has a chi-square distribution with degrees of freedom

df_M = k(k-1)/2 + k(m-1) - n_{1},

where k denotes the number of items with discrete response scales, m denotes the number of response options, k(k-1)/2 is the number of polychoric correlations, k(m-1) is the number of thresholds, and n_1 is the number of parameters of the model of interest. Notice that C_M results may be biased because of large numbers of empty cells in the multivariate contingency table. However, the bias may cancel out, as both Model 1 and Model 0 contain the same pattern of empty responses.
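The arithmetic behind df_M can be sketched in base R as follows. The helper `cm_df` is hypothetical (not a lavaan function), the two C_F values are made-up numbers, and the parameter count for the one-factor model is an assumption stated in the comments.

```r
# Hypothetical helper for illustration only; it is not a lavaan function.
# df_M = k(k-1)/2 + k(m-1) - n_1
cm_df <- function(k, m, n1) {
  n_cor <- k * (k - 1) / 2  # number of polychoric correlations
  n_thr <- k * (m - 1)      # number of thresholds
  n_cor + n_thr - n1
}

# One-factor model for k = 4 items with m = 3 options.
# Assumed parameter count: 4 loadings + 8 thresholds = 12
df_m <- cm_df(k = 4, m = 3, n1 = 12)

# Made-up C_F values for Model 1 (model of interest) and Model 0
cf1 <- 95.4
cf0 <- 91.1
cm  <- cf1 - cf0

# Upper-tail chi-square p-value for C_M
cm_p <- pchisq(cm, df = df_m, lower.tail = FALSE)
```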

`C_P`

With the C_P statistic we only consider pairs of responses, and compare observed sample proportions (p) with model-implied proportions of pairs of responses (\pi). For items i and j we obtain a pairwise likelihood ratio test statistic C_{P_{ij}},

C_{P_{ij}}=2N\sum_{c_i=1}^m \sum_{c_j=1}^m p_{c_i,c_j}\ln[p_{c_i,c_j}/\hat{\pi}_{c_i,c_j}],

where m denotes the number of response options and N denotes sample size. The C_P statistic has an asymptotic chi-square distribution with degrees of freedom equal to the information (m^2 -1) minus the number of parameters (2(m-1) thresholds and 1 correlation),

df_P = m^{2} - 2(m - 1) - 2.

As k denotes the number of items, there are k(k-1)/2 possible pairs of items. The C_P statistic should therefore be applied with a Bonferroni adjusted level of significance \alpha^*, with

\alpha^* = \alpha / (k(k-1)/2),

to keep the family-wise error rate at \alpha. The hypothesis of overall goodness-of-fit is tested at \alpha and rejected as soon as C_P is significant at \alpha^* for at least one pair of items. Notice that with dichotomous items, m = 2 and df_P = 0, so that the hypothesis cannot be tested.
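The decision rule for C_P can be sketched in base R as below. The helper `cp_decision` is hypothetical and does not reproduce the internals of `lavTablesFitCp`; the pairwise C_P values are made-up numbers for k = 4 items with m = 3 response options.

```r
# Hypothetical helper for illustration only; it is not a lavaan function.
cp_decision <- function(cp, m, k, alpha = 0.05) {
  n_pairs <- k * (k - 1) / 2
  stopifnot(length(cp) == n_pairs)

  # df_P = m^2 - 2(m - 1) - 2
  df_p <- m^2 - 2 * (m - 1) - 2
  if (df_p < 1) {
    stop("df_P = 0: C_P cannot be tested (e.g. dichotomous items)")
  }

  # Bonferroni-adjusted significance level
  alpha_star <- alpha / n_pairs

  # Upper-tail chi-square p-value for each pair of items
  p_values <- pchisq(cp, df = df_p, lower.tail = FALSE)

  list(df = df_p,
       alpha_star = alpha_star,
       p_values = p_values,
       reject = any(p_values < alpha_star))
}

# Made-up C_P values for the 6 pairs among k = 4 items with m = 3 options
cp_values <- c(2.1, 4.7, 1.3, 20.5, 3.9, 0.8)
res <- cp_decision(cp_values, m = 3, k = 4)
```

With these made-up values, `res$reject` is `TRUE` because one pair falls below the adjusted level, while calling the helper with `m = 2` stops with an error, which mirrors the dichotomous case described above.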

References

Barendse, M. T., Ligtvoet, R., Timmerman, M. E., & Oort, F. J. (2016). Structural equation modeling of discrete data: Model fit after pairwise maximum likelihood. Frontiers in Psychology, 7, 1-8.

Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347-387.

Examples

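library(lavaan)

# Data: nine Holzinger-Swineford test scores, each dichotomized by cut()
# into two equal-width intervals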
HS9 <- HolzingerSwineford1939[,c("x1","x2","x3","x4","x5",
                                 "x6","x7","x8","x9")]
HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )

# Single group example with one latent factor
HS.model <- ' trait =~ x1 + x2 + x3 + x4 '
fit <- cfa(HS.model, data=HSbinary[,1:4], ordered=names(HSbinary[,1:4]),
           estimator="PML")
lavTablesFitCm(fit)
lavTablesFitCp(fit)
lavTablesFitCf(fit)

[Package lavaan version 0.6-18 Index]