yrcskew {logmult} | R Documentation |
Fitting Yamaguchi RC_SK Skew-Symmetric Association Model
Description
Fit a skew-symmetric association model proposed in Yamaguchi (1990) to describe asymmetry of square tables. This model can be combined with symmetric association models like a quasi-symmetry (the default) or symmetric (homogeneous) RC(M) models.
Usage
yrcskew(tab, nd.symm = NA, nd.skew = 1, diagonal = FALSE,
weighting = c("marginal", "uniform", "none"),
se = c("none", "jackknife", "bootstrap"),
nreplicates = 100, ncpus = getOption("boot.ncpus"),
family = poisson, weights = NULL,
start = NA, etastart = NULL, tolerance = 1e-8,
iterMax = 15000, trace = FALSE, verbose = TRUE, ...)
Arguments
tab |
a two-way table, or an object (such as a matrix) that can be coerced into a table; if present, dimensions above two will be collapsed. |
nd.symm |
the number of dimensions to include in the symmetric RC(M) association. Cannot exceed
|
nd.skew |
the number of dimensions to include in the skew-symmetric RC(M) association. |
diagonal |
should the model include parameters specific to each diagonal cell? This amounts to taking quasi-independence, rather than independence, as the baseline model. |
weighting |
what weights should be used when normalizing the scores. |
se |
which method to use to compute standard errors for parameters. |
nreplicates |
the number of bootstrap replicates, if enabled. |
ncpus |
the number of processes to use for jackknife or bootstrap parallel computing. Defaults to
the number of cores (see |
family |
a specification of the error distribution and link function
to be used in the model. This can be a character string naming
a family function; a family function, or the result of a call
to a family function. See |
weights |
an optional vector of weights to be used in the fitting process. |
start |
either |
etastart |
starting values for the linear predictor; set to |
tolerance |
a positive numeric value specifying the tolerance level for convergence; higher values will speed up the fitting process, but beware of numerical instability of estimated scores! |
iterMax |
a positive integer specifying the maximum number of main iterations to perform; consider raising this value if your model does not converge. |
trace |
a logical value indicating whether the deviance should be printed after each iteration. |
verbose |
a logical value indicating whether progress indicators should be printed, including a diagnostic error message if the algorithm restarts. |
... |
more arguments to be passed to |
Details
The original presented by Yamaguchi (1990), called “row-column-effect skew-symmetric
association (logbilinear) model with full quasi-symmetry (QS+RC_SK)”, combines a skew-symmetric association
with a quasi-symmetry baseline; it is the variant fitted by default by this function. If nd.symm
is
set to a positive integer value, though, variants using a RC(M) model to describe the symmetric association
are used, with our without diagonal-specific parameters (depending on the value of the diagonal
argument);
among them is the HM_RC+RC_SK variant, when nd.symm
is 1.
These models follow the equation:
log F_{ij} = q_{ij} + \delta_{i<j} \nu_i (\nu_j - \nu_i) - \delta_{i>j} \nu_j (\nu_i - \nu_j)
where F_{ij}
is the expected frequency for the cell at the intersection of row i and column j of
tab
, and q_{ij}
a quasi-symmetric or a RC(M) association. See reference for detailed information
about the degrees of freedom and the identification constraints applied to the scores.
Please note that contrary to other association models, this model is sensitive to reorderings of rows and columns. You have to take care of passing a table whose categories follow a hierarchical order with a substantive meaning.
Another model presented in the paper, the homogeneous symmetric and skew-symmetric associations models (HM_(S+SK)) is not currently supported.
Actual model fitting is performed using gnm
, which implements the Newton-Raphson algorithm.
This function simply ensures correct start values are used, in addition to allowing for identification
of scores even with several dimensions, computation of their jackknife or bootstrap standard errors, and plotting.
The default starting values for main parameters are taken from the model without association parameters
(“base model”); association parameters start with random starting values. In some complex
cases, using start = NULL
to get completely random starting values can be more efficient, but it is also
less stable and can converge to non-optimal solutions.
Value
A yrcskew
object, which is a subclass of an rc.symm
object (seerc
) unless
nd.symm
is NA
. In addition to this class, it contains a assoc.yrcskew
component
holding information about the skew-symmetric association:
phi |
The intrisic association parameters, one per dimension. |
row |
Row scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null. |
col |
Column scores, normalized so that their (weighted) sum is 0, their (weighted) sum of squares is 1, and their (weighted) cross-dimensional correlation is null. |
row.weights |
The row weights used for the identification of scores, as specified by the
|
col.weights |
The column weights used for the identification of scores, as specified by the
|
covmat |
The variance-covariance matrix for phi coefficients and normalized row and column
scores. Only present if |
adj.covmats |
An array stacking on its third dimension one variance-covariance matrix for
the adjusted scores of each layer in the model (used for plotting). Only present if |
covtype |
The method used to compute the variance-covariance matrix (corresponding to the
|
Warning
This family of model sometimes converges to a non-optimal solution, in which case the reported scores are wrong. To protect yourself from this problem, you are advised to run the models several times to find out which convergence point is the true one. Furthermore, when model converges slowly, restarting the fitting procedure may produce much better random starting values.
Author(s)
Milan Bouchet-Valat
References
Yamaguchi, K. (1990). Some Models for the Analysis of Asymmetric Association in Square Contingency Tables with Ordered Categories. Sociol. Methodology 20, 181-212.
See Also
Examples
## Yamaguchi (1990), Table 5, p. 202, and Table 6B, p. 205
data(ocg1973)
# Simple symmetric RC(1) model ("Null skew-symmetry")
rc.model <- rc(ocg1973, diagonal=TRUE, symmetric=TRUE, weighting="none")
# Reported phi is slightly different, coefficients agree
rc.model
# Note model does not always converge, several attempts may be needed
# Here we set known starting values to be sure it works
set.seed(5)
model <- yrcskew(ocg1973, nd.symm=1, nd.skew=1, diagonal=TRUE, weighting="none")
# We do not get the same results as the author, but the smaller deviance
# indicates a better fit in our version (!)
model