aldvmm {aldvmm} | R Documentation |
Fitting Adjusted Limited Dependent Variable Mixture Models
Description
The function aldvmm fits adjusted limited dependent variable mixture models
of health state utilities. Adjusted limited dependent variable mixture
models are finite mixtures of normal distributions with an accumulation of
density mass at the limits, and a gap between 100% quality of life and
the next smaller utility value. The package aldvmm uses the
likelihood and expected value functions proposed by Hernandez Alava and
Wailoo (2015) using normal component distributions and a multinomial logit
model of probabilities of component membership.
Usage
aldvmm(
formula,
data,
subset = NULL,
psi,
ncmp = 2,
dist = "normal",
optim.method = NULL,
optim.control = list(trace = FALSE),
optim.grad = TRUE,
init.method = "zero",
init.est = NULL,
init.lo = NULL,
init.hi = NULL,
se.fit = FALSE,
model = TRUE,
level = 0.95,
na.action = "na.omit"
)
Arguments
formula |
an object of class "formula" with a symbolic
description of the model to be fitted. The model formula takes the form
|
data |
a data frame, list or environment (or object coercible to a data
frame by
|
subset |
an optional numeric vector of row indices of the subset of the model
matrix used in the estimation. |
psi |
a numeric vector of minimum and maximum possible utility values
smaller than or equal to 1 (e.g. |
ncmp |
a numeric value of the number of components that are mixed. The
default value is 2. A value of 1 represents a tobit model with a gap
between 1 and the maximum value in |
dist |
an optional character value of the distribution used in the
components. In this release, only the normal distribution is
available, and the default value is set to |
optim.method |
an optional character value of one of the following
|
optim.control |
an optional list of
|
optim.grad |
an optional logical value indicating if an analytical
gradient should be used in
|
init.method |
an optional character value indicating the method for
obtaining initial values. The following values are available:
|
init.est |
an optional numeric vector of user-defined initial values.
User-defined initial values override the |
init.lo |
an optional numeric vector of user-defined lower limits for
constrained optimization. When |
init.hi |
an optional numeric vector of user-defined upper limits for
constrained optimization. When |
se.fit |
an optional logical value indicating whether standard errors
of fitted values are calculated. The default value is |
model |
an optional logical value indicating whether the estimation
data frame is returned in the output object. The default value is
|
level |
a numeric value of the confidence level for confidence bands of fitted values. The default value is 0.95. |
na.action |
a character value passed to
argument |
Details
aldvmm fits an adjusted limited dependent variable mixture model using the
likelihood and expected value functions from Hernandez Alava and Wailoo
(2015). The model accounts for latent classes, multi-modality, minimum and
maximum utility values and potential gaps between 1 and the next smaller
utility value. Adjusted limited dependent variable mixture models combine
multiple component distributions with a multinomial logit model of the
probabilities of component membership. The standard deviations of normal
distributions are estimated and reported as log-transformed values, which
enter the likelihood function as exponentiated values to ensure
positive values.
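As a rough illustration of the two transformations described above, the following base R sketch computes component membership probabilities from multinomial logit linear predictors and recovers standard deviations from log-transformed parameters. It is not the package's internal code: the helper name membership_prob, the numeric values and the choice of the last component as the reference category are assumptions made for the example.

```r
# Hypothetical helper: multinomial logit probabilities of component membership.
# 'eta' holds the linear predictors of all components except the reference,
# whose linear predictor is fixed at 0 (an assumption of this sketch).
membership_prob <- function(eta) {
  expeta <- exp(c(eta, 0))
  expeta / sum(expeta)
}

# Three components: two free linear predictors plus the reference component
p <- membership_prob(c(0.5, -1))

# Log-transformed standard deviations (made-up values) are exponentiated
# before they enter the likelihood, which keeps them strictly positive
lnsigma <- c(-1.2, -0.4, 0.3)
sigma <- exp(lnsigma)
```

The probabilities in p sum to 1 by construction, and sigma is positive for any real value of lnsigma, which is why the optimizer can search over an unconstrained parameter space.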
The minimum utility and the largest utility smaller than or equal to 1 are
supplied in the argument 'psi'. The number of distributions/components that
are mixed is set by the argument 'ncmp'. When 'ncmp' is set to 1, the
procedure estimates a tobit model with a gap between 1 and the maximum
utility value in 'psi'. The current version only allows finite mixtures of
normal distributions.
The 'formula' object can include a | delimiter to separate formulae for
expected values in components (left) and the multinomial logit model of
probabilities of component membership (right). If no | delimiter is used,
the same formula will be used for expected values in components and the
multinomial logit of the probabilities of component membership.
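The two-part structure described above can be made visible with base R alone. The sketch below splits the right-hand side of a formula at the | operator and reuses the whole right-hand side for both parts when no delimiter is present. The helper split_rhs is hypothetical and only mimics the documented behaviour; it is not how the package parses formulae.

```r
# Hypothetical helper: split the right-hand side of a formula at '|'.
# Returns the part for component means ("beta") and the part for
# probabilities of component membership ("delta") as character strings.
split_rhs <- function(f) {
  rhs <- f[[3]]
  has_bar <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  beta  <- if (has_bar) rhs[[2]] else rhs
  delta <- if (has_bar) rhs[[3]] else rhs
  c(beta = deparse(beta), delta = deparse(delta))
}

# With delimiter: covariates for component means, intercept-only membership
two_part <- split_rhs(eq5d ~ age + female | 1)

# Without delimiter: the same terms are used in both parts
one_part <- split_rhs(eq5d ~ age + female)
```

Here two_part holds "age + female" for the component means and "1" for the membership model, while one_part holds "age + female" for both.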
aldvmm uses optimr for maximum likelihood estimation of model parameters.
The argument 'optim.method' accepts the following methods: "Nelder-Mead",
"BFGS", "CG", "L-BFGS-B", "nlminb", "Rcgmin", "Rvmmin" and "hjn". The
default method is "BFGS". The method "nlm" cannot be used in aldvmm because
it requires a different implementation of the likelihood function. The
argument 'optim.control' accepts a list of optimr control parameters. If
'optim.grad' is set to TRUE, the function optimr uses analytical gradients
during the optimization procedure for all methods that allow for this
approach. If 'optim.grad' is set to FALSE or a method cannot use gradients,
a finite difference approximation is used. The Hessian matrix at maximum
likelihood parameters is approximated numerically using hessian.
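The difference between analytical and finite difference gradients can be seen in a toy problem. The sketch below fits a single normal distribution by maximum likelihood with the standard deviation parameterized on the log scale. It uses stats::optim from base R as a stand-in for optimr, and the Hessian returned by optim in place of a separate numerical Hessian routine; the simulated data and all object names are assumptions of the example.

```r
set.seed(1)
y <- rnorm(200, mean = 0.7, sd = 0.2)  # simulated outcome, not package data

# Negative log-likelihood with parameters (mean, log standard deviation)
nll <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Analytical gradient of the negative log-likelihood
nll_grad <- function(par) {
  s <- exp(par[2])
  z <- (y - par[1]) / s
  c(-sum(z) / s,   # derivative with respect to the mean
    sum(1 - z^2))  # derivative with respect to the log standard deviation
}

# BFGS with the analytical gradient, plus a numerically approximated Hessian
fit_analytic <- optim(c(0.5, 0), nll, gr = nll_grad, method = "BFGS",
                      hessian = TRUE)

# BFGS with gr = NULL falls back to a finite difference approximation
fit_numeric <- optim(c(0.5, 0), nll, gr = NULL, method = "BFGS")

# Covariance matrix of the estimates from the inverse Hessian
vcov_hat <- solve(fit_analytic$hessian)
```

Both fits should reach essentially the same estimates; the analytical gradient mainly saves function evaluations and avoids approximation error in the search direction.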
'init.method' accepts four values of methods for generating initial
values: "zero", "random", "constant" and "sann". The method "zero" sets
initial values of all parameters to 0. The method "random" draws random
starting values from a standard normal distribution. The method "constant"
estimates a constant-only model and uses estimates as initial values of
intercepts and standard errors and 0 for all other parameters. The method
"sann" estimates the full model using the simulated annealing optimization
method in optim and uses parameter estimates as initial values. When
user-specified initial values are supplied in 'init.est', the argument
'init.method' is ignored.
By default, aldvmm performs unconstrained optimization with lower and upper
limits at -Inf and Inf. When user-defined lower and upper limits are
supplied to 'init.lo' and/or 'init.hi', these default limits are replaced
with the user-specified values, and the method "L-BFGS-B" is used for
box-constrained optimization instead of the user-defined 'optim.method'. It
is possible to set only lower or only upper limits. When initial values
supplied to 'init.est' or from default methods lie outside the limits, the
infeasible values will be set to the limits using the function bmchk.
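The treatment of infeasible starting values amounts to moving each value onto the nearest limit. A minimal base R sketch of that idea follows; the helper clamp_init and the numeric values are hypothetical, and the code is not the bounds check the package actually calls.

```r
# Hypothetical helper: set starting values outside [lo, hi] to the limits.
# The defaults of -Inf and Inf leave the starting values unchanged.
clamp_init <- function(init, lo = -Inf, hi = Inf) {
  pmin(pmax(init, lo), hi)
}

init <- c(-3, 0, 0.5, 4)        # made-up starting values
lo <- c(-2, -Inf, -Inf, -Inf)   # lower limit on the first parameter only
hi <- c(Inf, Inf, Inf, 2)       # upper limit on the last parameter only

clamped <- clamp_init(init, lo, hi)   # -2.0  0.0  0.5  2.0
unchanged <- clamp_init(init)         # identical to init
```

Limits of -Inf or Inf on individual parameters leave those parameters unconstrained, which is how a one-sided set of limits can be expressed.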
The function aldvmm()
returns the negative log-likelihood, Akaike
information criterion and Bayesian information criterion. Smaller values
of these measures indicate better fit.
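For orientation, the two information criteria can be written in terms of the negative log-likelihood using their standard textbook definitions. The sketch below assumes those definitions; the helper gof_from_nll and the input values are made up for illustration.

```r
# Hypothetical helper: information criteria from a negative log-likelihood.
# nll: negative log-likelihood, n_par: number of estimated parameters,
# n_obs: number of observations used in the estimation.
gof_from_nll <- function(nll, n_par, n_obs) {
  c(ll  = -nll,
    aic = 2 * n_par + 2 * nll,
    bic = n_par * log(n_obs) + 2 * nll)
}

gof <- gof_from_nll(nll = 120, n_par = 9, n_obs = 200)
```

Both criteria add a penalty for the number of parameters to twice the negative log-likelihood, so a model with more components is only preferred if it improves the likelihood enough to offset the penalty.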
If 'se.fit' is set to TRUE, standard errors of fitted values are calculated
using the delta method. The standard errors of fitted values in the
estimation data set are calculated as
se_{fit} = \sqrt{G^{t} \Sigma G},
where G is the gradient of a fitted value with respect to changes of
parameter estimates, and \Sigma is the estimated covariance matrix of
parameters (Dowd et al., 2014). The standard errors of predicted values in
new data sets are calculated as
se_{pred} = \sqrt{MSE + G^{t} \Sigma G},
where MSE is the mean squared error of fitted versus observed outcomes in
the original estimation data (Whitmore, 1986).
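The two formulas above can be checked numerically with a small worked example in base R. The gradient G, the covariance matrix Sigma and the mean squared error are made-up values for a model with two parameters, not output from the package.

```r
G <- c(1, 2)                    # gradient of one fitted value (made up)
Sigma <- diag(c(0.04, 0.01))    # covariance matrix of parameters (made up)
mse <- 0.01                     # mean squared error in estimation data (made up)

# Variance of the fitted value by the delta method: G' Sigma G
var_fit <- drop(t(G) %*% Sigma %*% G)   # 1 * 0.04 + 4 * 0.01 = 0.08

se_fit <- sqrt(var_fit)                 # standard error of the fitted value
se_pred <- sqrt(mse + var_fit)          # sqrt(0.09) = 0.3
```

The standard error of a prediction is never smaller than the standard error of the corresponding fitted value, because the mean squared error adds the residual variation of individual outcomes to the uncertainty of the parameter estimates.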
The generic function summary can be used to obtain or print a summary of
the results. The generic function predict can be used to obtain predicted
values and standard errors of predictions in new data.
Value
aldvmm returns an object of class "aldvmm". An object of class "aldvmm" is
a list containing the following objects.
coef |
a numeric vector of parameter estimates. |
hessian |
a numeric matrix object with second partial derivatives of the likelihood function. |
cov |
a numeric matrix object with covariances of parameters. |
n |
a scalar representing the number of observations that were used in the estimation. |
k |
a scalar representing the number of components that were mixed. |
df.null |
an integer value of the residual degrees of freedom of a null model including intercepts and standard errors. |
df.residual |
an integer value of the residual degrees of freedom. |
iter |
an integer value of the number of iterations used in optimization. |
convergence |
an integer value indicating convergence. "0" indicates successful completion. |
gof |
a list including the following elements.
|
pred |
a list including the following elements.
|
init |
a list including the following elements.
|
call |
a character value including the model call captured by
|
formula |
an object of class "formula" supplied to argument
|
terms |
a list of objects of class "terms" for the model of component means ("beta"), probabilities of component membership ("delta") and the full model ("full"). |
contrasts |
a nested list of character values showing contrasts of factors used in models of component means ("beta") and probabilities of component membership ("delta"). |
data |
a data frame created by
|
psi |
a numeric vector with the minimum and maximum utility
below 1 in |
dist |
a character value indicating the used component distributions. |
label |
a list including the following elements.
|
optim.method |
a character value of the used
|
level |
a numeric value of the confidence level used for reporting. |
na.action |
an object of class "omit" extracted from the
"na.action" attribute of the data frame created by
|
References
Hernandez Alava, M. and Wailoo, A. (2015) Fitting adjusted limited dependent variable mixture models to EQ-5D. The Stata Journal, 15(3), 737–750. doi:10.1177/1536867X1501500307
Dowd, B. E., Greene, W. H., and Norton, E. C. (2014) Computation of standard errors. Health Services Research, 49(2), 731–750. doi:10.1111/1475-6773.12122
Whitmore, G. A. (1986) Prediction limits for a univariate normal observation. The American Statistician, 40(2), 141–143. doi:10.1080/00031305.1986.10475378
Examples
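# Load the example data used in this help page
data(utility)
# Two-component model: covariates in the component means,
# intercept-only model of component membership
fit <- aldvmm(eq5d ~ age + female | 1,
              data = utility,
              psi = c(0.883, -0.594),
              ncmp = 2)
# Summarize the fitted model
summary(fit)
# Predictions for the estimation data
yhat <- predict(fit)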