maxG {LCAvarsel} | R Documentation |
Maximum number of latent classes
Description
Finds the numbers of latent classes that can be fitted to a dataset while the latent class analysis model remains identifiable.
Usage
maxG(Y, Gvec)
Arguments
Y | A categorical data matrix. |
Gvec | A numeric vector giving the candidate numbers of latent classes to be fitted. |
Details
In practice, different latent class analysis models are fitted by attributing different values to G, usually ranging from 1 to G_{max}. However, for a set of variables, not all the models corresponding to increasing values of G are identifiable. Indeed, a necessary (but not sufficient) condition for a latent class analysis model to be identifiable is:
\prod_{j=1}^M C_j > G\Biggl(\, \sum_{j=1}^M C_j - M + 1\Biggr)
where C_j denotes the number of categories of variable j, j=1,...,M, and M is the number of variables in the data Y. Another condition requires the number of observed distinct configurations of the variables in the data to be greater than the number of parameters of the model. The function returns the subset of values of vector Gvec such that both the above conditions are satisfied.
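The two conditions above can be checked by hand on a small dataset. The following is only a minimal sketch, not the package's implementation: the helper names n_categories, product_condition and pattern_condition are hypothetical, C_j is taken to be the number of categories actually observed in each column, and the parameter count (G - 1) + G * sum(C_j - 1) is an assumption based on the usual unrestricted latent class model.

```r
# Hypothetical helpers for illustration; they are NOT part of LCAvarsel.

# C_j: number of categories observed in each variable of Y (assumption:
# observed categories are used rather than declared factor levels)
n_categories <- function(Y) {
  vapply(as.data.frame(Y),
         function(col) length(unique(col[!is.na(col)])),
         integer(1))
}

# First condition: prod(C_j) > G * (sum(C_j) - M + 1)
product_condition <- function(Y, Gvec) {
  C <- n_categories(Y)
  M <- length(C)
  Gvec[prod(C) > Gvec * (sum(C) - M + 1)]
}

# Second condition: observed distinct configurations > number of parameters,
# ASSUMING the parameter count (G - 1) + G * sum(C_j - 1)
pattern_condition <- function(Y, Gvec) {
  C <- n_categories(Y)
  n_par <- (Gvec - 1) + Gvec * sum(C - 1)
  n_patterns <- nrow(unique(as.data.frame(Y)))
  Gvec[n_patterns > n_par]
}

# Toy data: 4 binary variables, 13 of the 16 possible configurations observed
Y <- expand.grid(a = 1:2, b = 1:2, c = 1:2, d = 1:2)[-c(2, 3, 5), ]

product_condition(Y, 1:5)   # 16 > 5 * G holds for G = 1, 2, 3
pattern_condition(Y, 1:5)   # 13 > 5 * G - 1 holds for G = 1, 2
intersect(product_condition(Y, 1:5), pattern_condition(Y, 1:5))
```

In this toy case the second condition is the binding one, which illustrates why a value of G can pass the product condition and still be excluded.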
Value
A numeric vector containing the subset of values of Gvec, that is, the numbers of latent classes that can be fitted to the data while the model remains identifiable. If no model is identifiable for the range of values provided, the function returns NULL and issues a warning.
References
Bartholomew, D., Knott, M. and Moustaki, I. (2011). Latent Variable Models and Factor Analysis: A Unified Approach. Wiley.
Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61, 215-231.
Examples
library(LCAvarsel)
data(carcinoma, package = "poLCA")
maxG(carcinoma, 1:4)
maxG(carcinoma, 2:3)
maxG(carcinoma, 5) # the model is not identifiable