npmlt {mixcat} | R Documentation |
Mixed effects cumulative link and logistic regression models
Description
Fits cumulative logit and baseline logit and link mixed effects regression models with non- parametric distribution for the random effects.
Usage
npmlt(formula, formula.npo=~1, random=~1, id, k=1, eps=0.0001,
start.int=NULL, start.reg=NULL, start.mp=NULL,
start.m=NULL, link="clogit",
EB=FALSE, maxit=500, na.rm=TRUE, tol=0.0001)
Arguments
formula |
a formula defining the response and the fixed, proportional odds, effects part of the model,
e.g. |
formula.npo |
a formula defining non proportional odds variables of the model.
A response is not needed as it has been provided in |
random |
a formula defining the random part of the model. For instance, |
id |
a factor that defines the primary sampling units, e.g.
groups, clusters, classes, or individuals in longitudinal studies. These sampling units
have their own random coefficient, as defined in |
k |
the number of mass points and masses for the non-parametric (discrete) random effects distribution.
If |
eps |
positive convergence tolerance |
start.int |
a vector of length (number of categories minus one) with the starting values the fixed intercept(s). |
start.reg |
a vector with the starting values for the regression coefficients. One
starting value for the proportional odds effects and (number of categories minus one)
starting values for the non proportional effects, in the same order as they appear
in |
start.mp |
starting values for the mass points of the random effects distribution in the form:
( |
start.m |
starting values for the masses of the random effects distribution: a vector of
length |
link |
for a cumulative logit model set |
EB |
if |
maxit |
integer giving the maximal number of iterations of the fitting algorithm until convergence. By default this number is set to 500. |
na.rm |
a logical value indicating whether NA values should be stripped before the computation proceeds. |
tol |
positive tolerance level used for calculating generalised inverses (g-inverses). Consider matrix
|
Details
Maximizing a likelihood over an unspecified random effects distribution results in a
discrete mass point estimate of this distribution (Laird, 1978; Lindsay, 1983).
Thus, the terms ‘non-parametric’ (NP) and ‘discrete’ random effects distribution are used here
interchangeably. Function npmlt
allows the user to choose the number k
of mass
points/masses of the discrete distribution, a choice that should be based on the log-likelihood.
Note that the mean of the NP distribution is constrained to be zero and thus for k=1
the fitted model is equivalent to a fixed effects model. For k>1
and a random slope in the model,
the mass points are bivariate with a component that corresponds to the intercept and another
that corresponds to the slope.
General treatments of non-parametric modeling can be found in Aitkin, M. (1999) and Aitkin et al. (2009). For more details on multinomial data see Hartzel et al (2001).
The response variable y
can be binary or multinomial. A binary response should
take values 1 and 2, and the function npmlt
will model the probability of 1. For an ordinal
response, taking values 1,\dots,q
, a cumulative logit model can be fit. Ignoring the random effects,
such a model, with formula y~x
, takes the form
log \frac{P(Y \le r)}{1-P(Y \le r)}=\beta_r + \gamma x,
where \beta_r, r=1,\dots,q-1
, are the cut-points and \gamma
is the slope.
Further, if argument formula.npo is specified as ~x
, the model becomes
log \frac{P(Y \le r)}{1-P(Y \le r)}=\beta_r + \gamma_r x,
Similarly, for a nominal response with q categories, a baseline logit model can be fit.
The fixed effects part of the model, y~x
, takes the form,
log \frac{P(Y=r)}{P(Y=q)} = \beta_r + \gamma x,
where r=1,\dots,q-1.
Again, formula.npo can be specified as ~x
, in which
case slope \gamma
will be replaced by category specific slopes, \gamma_r
.
The user is provided with the option of specifying starting values for some or all the model parameters. This option allows for starting the algorithm at different starting points, in order to ensure that it has convered to the point of maximum likelihood. Further, if the fitting algorithm fails, the user can start by fitting a less complex model and use the estimates of this model as starting values for the more complex one.
With reference to the tol
argument, the fitting
algorithm calculates g-inverses of two matrices: 1. the information matrix of the model,
and 2. the covariance matrix of multinomial proportions. The covariance matrix of a
multinomial proportion p
of length q
is calculated as Diag\{p*\} -p* p*^T
,
where p*
is of length q-1
. A g-inverse for this matrix is needed because elements
of p*
can become zero or one.
Value
The function npmlt
returns an object of class ‘npmreg’, a list containing at least the following components:
call |
the matched call. |
formula |
the formula supplied. |
formula.npo |
the formula for the non proportional odds supplied. |
random |
the random effects formula supplied. |
coefficients |
a named vector of regression coefficients. |
mass.points |
a vector or a table that contains the mass point estimates. |
masses |
the masses (probabilities) corresponding to the mass points. |
vcvremat |
the estimated variance-covariance matrix of the random effects. |
var.cor.mat |
the estimated variance-covariance matrix of the random effects, with the upper triangular covariances replaced by the corresponding correlations. |
m2LogL |
minus twice the maximized log-likelihood of the chosen model. |
SE.coefficients |
a named vector of standard errors of the estimated regression coefficients. |
SE.mass.points |
a vector or a table that contains the the standard errors of the estimated mass points. |
SE.masses |
the standard errors of the estimated masses. |
VRESE |
the standard errors of the estimates of the variances of random effects. |
CVmat |
the inverse of the observed information matrix of the model. |
eBayes |
if |
fitted |
the fitted values of the linear predictors computed at the empirical Bayes estimates of the random effects, if |
prob |
the estimated probabilities of observing a response at one of the categories. These probabilities are computed at the empirical Bayes estimates of the random effects, if |
nrp |
number of random slopes specified. |
iter |
the number of iterations of the fitting algorithm. |
maxit |
the maximal allowed number of iterations of the fitting algorithm until convergence. |
flagcvm |
last iteration at which eigenvalue(s) of covariance matrix of multinomial variable were less than |
flaginfo |
last iteration at which eigenvalue(s) of model information matrix were less than |
Author(s)
Georgios Papageorgiou gpapageo@gmail.com
References
Aitkin, M. (1999). A general maximum likelihood analysis of variance components in generalized linear models. Biometrics 55, 117-128.
Aitkin, M., Francis, B., Hinde, J., and Darnell, R. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Hedeker, D. and Gibbons, R. (2006). Longitudinal Data Analysis. Wiley, Palo Alto, CA.
Hartzel, J., Agresti, A., and Caffo, B. (2001). Multinomial logit random effects models. Statistical Modelling, 1(2), 81-102.
Laird, N. (1978). Nonparametric maximum likelihood estimation of a mixing distribution. Journal of the American Statistical Association, 73, 805-811.
Lindsay, B. G. (1983). The geometry of mixture likelihoods, Part II: The exponential family. The Annals of Statistics, 11, 783-792.
See Also
Examples
data(schizo)
attach(schizo)
npmlt(y~trt*sqrt(wk),formula.npo=~trt,random=~1+trt,id=id,k=2,EB=FALSE)