msm {msm} | R Documentation |
Multi-state Markov and hidden Markov models in continuous time
Description
Fit a continuous-time Markov or hidden Markov multi-state model by maximum likelihood. Observations of the process can be made at arbitrary times, or the exact times of transition between states can be known. Covariates can be fitted to the Markov chain transition intensities or to the hidden Markov observation process.
Usage
msm(
formula,
subject = NULL,
data = list(),
qmatrix,
gen.inits = FALSE,
ematrix = NULL,
hmodel = NULL,
obstype = NULL,
obstrue = NULL,
covariates = NULL,
covinits = NULL,
constraint = NULL,
misccovariates = NULL,
misccovinits = NULL,
miscconstraint = NULL,
hcovariates = NULL,
hcovinits = NULL,
hconstraint = NULL,
hranges = NULL,
qconstraint = NULL,
econstraint = NULL,
initprobs = NULL,
est.initprobs = FALSE,
initcovariates = NULL,
initcovinits = NULL,
deathexact = NULL,
death = NULL,
exacttimes = FALSE,
censor = NULL,
censor.states = NULL,
pci = NULL,
phase.states = NULL,
phase.inits = NULL,
cl = 0.95,
fixedpars = NULL,
center = TRUE,
opt.method = "optim",
hessian = NULL,
use.deriv = TRUE,
use.expm = TRUE,
analyticp = TRUE,
na.action = na.omit,
...
)
Arguments
formula |
A formula giving the vectors containing the observed states and the corresponding observation times. For example,
Observed states should be numeric variables in the set The times can indicate different types of observation scheme, so be careful
to choose the correct For hidden Markov models, |
subject |
Vector of subject identification numbers for the data
specified by formula. |
data |
Optional data frame in which to interpret the variables supplied
in |
qmatrix |
Matrix which indicates the allowed transitions in the
continuous-time Markov chain, and optionally also the initial values of
those transitions. If an instantaneous transition is not allowed from state
If supplying initial values yourself, then the non-zero entries should be
those values. If using For example,
represents a 'health - disease - death' model, with initial transition intensities 0.1 from health to disease, 0.01 from health to death, 0.1 from disease to health, and 0.2 from disease to death. If the states represent ordered levels of severity of a disease, then this
matrix should usually only allow transitions between adjacent states. For
example, if someone was observed in state 1 ("mild") at their first
observation, followed by state 3 ("severe") at their second observation,
they are assumed to have passed through state 2 ("moderate") in between, and
the 1,3 entry of qmatrix should be 0. The initial intensities given here are with any covariates set to their
means in the data (or set to zero, if center = FALSE). |
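The 'health - disease - death' matrix described above can be written out as follows. This is a minimal sketch in base R: the object name Q and the state labels are chosen here for illustration, and the diagonal is simply left at zero, on the assumption that only the off-diagonal entries carry the initial values.

```r
# Allowed transitions and initial intensities for the three-state
# health - disease - death model (values taken from the description above).
# A zero off-diagonal entry means that instantaneous transition is not allowed.
Q <- rbind(
  c(0,   0.1, 0.01),  # health  -> disease (0.1), health  -> death (0.01)
  c(0.1, 0,   0.2),   # disease -> health  (0.1), disease -> death (0.2)
  c(0,   0,   0)      # death is absorbing: no transitions out
)
# Labels are illustrative only.
rownames(Q) <- colnames(Q) <- c("health", "disease", "death")
Q
```

A matrix like this would then be passed as the qmatrix argument.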
gen.inits |
If |
ematrix |
If misclassification between states is to be modelled, this
should be a matrix of initial values for the misclassification
probabilities. The rows represent underlying states, and the columns
represent observed states. If an observation of state
represents a model in which misclassifications are only permitted between adjacent states. If any probabilities are constrained to be equal using For an alternative way of specifying misclassification models, see
|
hmodel |
Specification of the hidden Markov model (HMM). This should
be a list of return values from HMM constructor functions. Each element of
the list corresponds to the outcome model conditionally on the corresponding
underlying state. Univariate constructors are described in
the For example, consider a three-state hidden Markov model. Suppose the observations in underlying state 1 are generated from a Normal distribution with mean 100 and standard deviation 16, while observations in underlying state 2 are Normal with mean 54 and standard deviation 18. Observations in state 3, representing death, are exactly observed, and coded as 999 in the data. This model is specified as
The mean and standard deviation parameters are estimated starting from these
initial values. If multiple parameters are constrained to be equal using
See the A misclassification model, that is, a hidden Markov model where the outcomes
are misclassified observations of the underlying states, can either be
specified using a list of For example,
is equivalent to
|
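The three-state hidden Markov model described under hmodel (Normal outcomes in states 1 and 2, exactly observed death coded as 999) could be specified along these lines. This is a sketch that assumes the msm package is installed and uses its hmmNorm and hmmIdent constructors; the object name hm is illustrative.

```r
library(msm)

# One outcome model per underlying state, in state order.
hm <- list(
  hmmNorm(mean = 100, sd = 16),  # state 1: Normal, initial mean 100, sd 16
  hmmNorm(mean = 54,  sd = 18),  # state 2: Normal, initial mean 54, sd 18
  hmmIdent(999)                  # state 3: death, observed exactly as 999
)
length(hm)  # one element per underlying state
```

The list would be supplied as hmodel = hm, together with a qmatrix that has the same number of states.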
obstype |
A vector specifying the observation scheme for each row of
the data. This can be included in the data frame
If This is a generalisation of the
|
obstrue |
In misclassification models specified with In HMMs specified with HMMs where the true state is known to be within a specific set at specific
times can be defined with a combination of |
covariates |
A formula or a list of formulae representing the covariates on the transition intensities via a log-linear model. If a single formula is supplied, like
then these covariates are assumed to apply to all intensities. If a named list is supplied, then this defines a potentially different model for each named intensity. For example,
specifies an age effect on the state 1 - state 2 transition, additive age
and treatment effects on the state 2 - state 3 transition, but no covariates
on any other transitions that are allowed by the If covariates are time dependent, they are assumed to be constant in between
the times they are observed, and the transition probability between a pair
of times |
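The two ways of supplying covariates described above can be sketched in base R as follows. The variable names age and treatment come from the text; the object names, and the use of "1-2" and "2-3" as list names for the state 1 - state 2 and state 2 - state 3 transitions, are assumptions made for this illustration.

```r
# A single formula: the same covariates apply to every transition intensity.
covs_all <- ~ age + treatment

# A named list: a potentially different model for each named intensity.
# Transitions that are not named get no covariates.
covs_by_transition <- list(
  "1-2" = ~ age,              # age effect on the 1 -> 2 transition
  "2-3" = ~ age + treatment   # additive age and treatment effects on 2 -> 3
)

# Which covariates act on which transition
lapply(covs_by_transition, all.vars)
```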
covinits |
Initial values for log-linear effects of covariates on the transition intensities. This should be a named list with each element corresponding to a covariate. A single element contains the initial values for that covariate on each transition intensity, reading across the rows in order. For a pair of effects constrained to be equal, the initial value for the first of the two effects is used. For example, for a model with the above For factor covariates, name each level by concatenating the name of the
covariate with the level name, quoting if necessary. For example, for a
covariate
If not specified or wrongly specified, initial values are assumed to be zero. |
constraint |
A list of one numeric vector for each named covariate. The
vector indicates which covariate effects on intensities are constrained to
be equal. Take, for example, a model with five transition intensities and
two covariates. Specifying
constrains the effect of age to be equal for the first three intensities, and equal for the fourth and fifth. The effect of treatment is assumed to be different for each intensity. Any vector of increasing numbers can be used as indicators. The intensity parameters are assumed to be ordered by reading across the rows of the transition matrix, starting at the first row, ignoring the diagonals. Negative elements of the vector can be used to indicate that particular covariate effects are constrained to be equal to minus some other effects. For example:
constrains the second and third age effects to be equal, the first effect to be minus the second, and the fifth age effect to be minus the fourth. For example, it may be realistic that the effect of a covariate on the "reverse" transition rate from state 2 to state 1 is minus the effect on the "forward" transition rate, state 1 to state 2. Note that it is not possible to specify exactly which of the covariate effects are constrained to be positive and which negative. The maximum likelihood estimation chooses the combination of signs which has the higher likelihood. For categorical covariates, defined as factors, specify constraints as
follows:
where Make sure the
sets the first (baseline) level of unordered factors to zero, then the baseline level is ignored in this specification. To assume no covariate effect on a certain transition, use the
|
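The two constraint specifications described above, for a model with five transition intensities and covariates age and treatment, might look like this. The sketch uses base R only; the object names are illustrative, and the grouping helper at the end is not part of msm.

```r
# Age effect equal on intensities 1-3 and equal on intensities 4-5;
# a separate treatment effect on each of the five intensities.
constraint_equal <- list(
  age       = c(1, 1, 1, 2, 2),
  treatment = c(1, 2, 3, 4, 5)
)

# Negative indicators: effect 1 is minus effects 2 and 3 (which are equal),
# and effect 5 is minus effect 4.
constraint_signed <- list(age = c(-1, 1, 1, 2, -2))

# Illustration only: intensities that share one underlying parameter are
# those with the same absolute indicator value.
groups <- split(seq_along(constraint_signed$age), abs(constraint_signed$age))
groups
```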
misccovariates |
A formula representing the covariates on the
misclassification probabilities, analogously to This must be a single formula - lists are not supported, unlike
|
misccovinits |
Initial values for the covariates on the
misclassification probabilities, defined in the same way as |
miscconstraint |
A list of one vector for each named covariate on
misclassification probabilities. The vector indicates which covariate
effects on misclassification probabilities are constrained to be equal,
analogously to |
hcovariates |
List of formulae the same length as |
hcovinits |
Initial values for the hidden Markov model covariate
effects. A list of the same length as |
hconstraint |
A named list. Each element is a vector of constraints on the named hidden Markov model parameter. The vector has length equal to the number of times that class of parameter appears in the whole model. For example consider the three-state hidden Markov model described above,
with normally-distributed outcomes for states 1 and 2. To constrain the
outcome variance to be equal for states 1 and 2, and to also constrain the
effect of
Note this excludes initial state occupancy probabilities and covariate effects on those probabilities, which cannot be constrained. |
hranges |
Range constraints for hidden Markov model parameters.
Supplied as a named list, with each element corresponding to the named
hidden Markov model parameter. This element is itself a list with two
elements, vectors named "lower" and "upper". These vectors each have length
equal to the number of times that class of parameter appears in the whole
model, and give the corresponding minimum and maximum allowable values for
that parameter. Maximum likelihood estimation is performed with these
parameters constrained in these ranges (through a log or logit-type
transformation). Lower bounds of For example, in the three-state model above, to constrain the mean for state 1 to be between 0 and 6, and the mean of state 2 to be between 7 and 12, supply
These default to the natural ranges, e.g. the positive real line for
variance parameters, and [0,1] for probabilities. Therefore Initial values should be strictly within any ranges, and not on the range boundary, otherwise optimisation will fail with a "non-finite value" error. |
qconstraint |
A vector of indicators specifying which baseline transition intensities are equal. For example,
constrains the third and fourth intensities to be equal, in a model with
four allowed instantaneous transitions. When there are covariates on the
intensities and |
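For a model with four allowed instantaneous transitions, the indicator vector described above could be written as follows. A base R sketch with an illustrative grouping step that is not part of msm.

```r
# Intensities 1 and 2 are free; intensities 3 and 4 share one value.
qconstraint <- c(1, 2, 3, 3)

# Illustration only: positions that share an indicator are constrained equal.
split(seq_along(qconstraint), qconstraint)
```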
econstraint |
A similar vector of indicators specifying which baseline
misclassification probabilities are constrained to be equal. Only used if
the model is specified using |
initprobs |
Only used in hidden Markov models. Underlying state
occupancy probabilities at each subject's first observation. Can either be
a vector of If these are estimated (see |
est.initprobs |
Only used in hidden Markov models. If Note that the free parameters during this estimation exclude the state 1 occupancy probability, which is fixed at one minus the sum of the other probabilities. |
initcovariates |
Formula representing covariates on the initial state
occupancy probabilities, via multinomial logistic regression. The linear
effects of these covariates, observed at the individual's first observation
time, operate on the log ratio of the state |
initcovinits |
Initial values for the covariate effects
|
deathexact |
Vector of indices of absorbing states whose time of entry
is known exactly, but the individual is assumed to be in an unknown
transient state ("alive") at the previous instant. This is the usual
situation for times of death in chronic disease monitoring data. For
example, if you specify See the The Note that you do not always supply a |
death |
Old name for the deathexact argument. |
exacttimes |
By default, the transitions of the Markov process are
assumed to take place at unknown occasions in between the observation times.
If Note that the complete history of the multi-state process is known with this type of data. The models which msm fits have the strong assumption of constant (or piecewise-constant) transition rates. Knowing the exact transition times allows more realistic models to be fitted with other packages. For example parametric models with sojourn distributions more flexible than the exponential can be fitted with the flexsurv package, or semi-parametric models can be implemented with survival in conjunction with mstate. |
censor |
A state, or vector of states, which indicates censoring.
Censoring means that the observed state is known only to be one of a
particular set of states. For example, Note that in contrast to the usual terminology of survival analysis, here it
is the state which is considered to be censored, rather than the
event time. If at the end of a study, an individual has not died,
but their true state is known, then For hidden Markov models, censoring may indicate either a set of possible
observed states, or a set of (hidden) true states. The latter case is
specified by setting the relevant elements of Note in particular that general time-inhomogeneous Markov models with
piecewise constant transition intensities can be constructed using the
Not supported for multivariate hidden Markov models specified with
|
censor.states |
Specifies the underlying states which censored
observations can represent. If
means that observations coded 99 represent either state 2 or state 3, while observations coded 999 are really either state 3 or state 4. |
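The censoring scheme just described, with two censoring codes, corresponds to a pair of arguments like the following. A base R sketch; the lookup at the end is illustrative and not an msm function.

```r
# Codes in the state variable that indicate a censored observation.
censor <- c(99, 999)

# For each code, in the same order, the set of states it may represent.
censor.states <- list(
  c(2, 3),  # 99 : true state is either 2 or 3
  c(3, 4)   # 999: true state is either 3 or 4
)

# Illustration only: possible states behind an observation coded 999.
censor.states[[match(999, censor)]]
```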
pci |
Model for piecewise-constant intensities. Vector of cut points defining the times, since the start of the process, at which intensities change for all subjects. For example
specifies that the intensity changes at time points 5 and 10. This will
automatically construct a model with a categorical (factor) covariate called
Thus if To assume piecewise constant intensities for some transitions but not others
with Internally, this works by inserting censored observations in the data at times when the intensity changes but the state is not observed. If the supplied times are outside the range of the time variable in the
data, After fitting a time-inhomogeneous model,
This facility does not support interactions between time and other
covariates. Such models need to be specified "by hand", using a state
variable with censored observations inserted. Note that the Note that you do not need to use
|
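To see what cut points at times 5 and 10 imply, the sketch below builds a comparable time-period factor by hand in base R. The variable names times and period are hypothetical, and this is only an illustration of the idea: it does not reproduce how msm names or constructs its internal covariate.

```r
# Cut points at which the intensities change, as in the example above.
pci <- c(5, 10)

# Hypothetical observation times for one subject.
times <- c(0, 4.9, 5, 7, 10, 12)

# Assign each time to the period it falls in: before 5, from 5 up to 10,
# and from 10 onwards. Intensities are constant within each period.
period <- cut(times, breaks = c(-Inf, pci, Inf), right = FALSE)
table(period)
```

With two cut points there are three periods, so a factor of this kind contributes two contrasts per transition intensity relative to the first period.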
phase.states |
Indices of states which have a two-phase sojourn distribution. This defines a semi-Markov model, in which the hazard of an onward transition depends on the time spent in the state. This uses the technique described by Titman and Sharples (2009). A hidden Markov model is automatically constructed on an expanded state space, where the phases correspond to the hidden states. The "tau" proportionality constraint described in this paper is currently not supported. Covariates, constraints, Hidden Markov models can additionally be given phased states. The user
supplies an outcome distribution for each original state using
Output functions present the model as if it were a hidden Markov model on the expanded state space. For example, transition probabilities between states, covariate effects on transition rates, and prevalence counts are not aggregated over the hidden phases. Numerical estimation will be unstable when there is weak evidence for a two-phase sojourn distribution, that is, if the model is close to Markov. See This is an experimental feature, and some functions are not implemented. Please report any experiences of using this feature to the author! |
phase.inits |
Initial values for phase-type models. A list with one component for each "two-phased" state. Each component is itself a list of two elements. The first of these elements is a scalar defining the transition intensity from phase 1 to phase 2. The second element is a matrix, with one row for each potential destination state from the two-phased state, and two columns. The first column is the transition rate from phase 1 to the destination state, and the second column is the transition rate from phase 2 to the destination state. If there is only one destination state, then this may be supplied as a vector. In phase type models, the initial values for transition rates out of
non-phased states are taken from the |
cl |
Width of symmetric confidence intervals for maximum likelihood estimates, by default 0.95. |
fixedpars |
Vector of indices of parameters whose values will be fixed at their initial values during the optimisation. These are given in the order: transition intensities (reading across rows of the transition matrix), covariates on intensities (ordered by intensities within covariates), hidden Markov model parameters, including misclassification probabilities or parameters of HMM outcome distributions (ordered by parameters within states), hidden Markov model covariate parameters (ordered by covariates within parameters within states), initial state occupancy probabilities (excluding the first probability, which is fixed at one minus the sum of the others). If there are equality constraints on certain parameters, then
To fix all parameters, specify This can be useful for profiling likelihoods, and building complex models stage by stage. |
center |
If |
opt.method |
If "optim", "nlm" or "bobyqa", then the corresponding R
function will be used for maximum likelihood estimation.
If "fisher", then a specialised Fisher scoring method is used (Kalbfleisch
and Lawless, 1985) which can be faster than the generic methods, though less
robust. This is only available for Markov models with panel data
( |
hessian |
If If |
use.deriv |
If |
use.expm |
If |
analyticp |
By default, the likelihood for certain simpler 3, 4 and 5
state models is calculated using an analytic expression for the transition
probability (P) matrix. For all other models, matrix exponentiation is used
to obtain P. To revert to the original method of using the matrix
exponential for all models, specify |
na.action |
What to do with missing data: either |
... |
Optional arguments to the general-purpose optimisation routine,
It is often worthwhile to normalize the optimisation using
If 'false' convergence is reported and the standard errors cannot be
calculated due to a non-positive-definite Hessian, then consider tightening
the tolerance criteria for convergence. If the optimisation takes a long
time, intermediate steps can be printed using the For the Fisher scoring method, a |
Details
For full details about the methodology behind the msm package, refer to the PDF manual ‘msm-manual.pdf’ in the ‘doc’ subdirectory of the package. This includes a tutorial in the typical use of msm. The paper by Jackson (2011) in Journal of Statistical Software presents the material in this manual in a more concise form.
msm was designed for fitting continuous-time Markov models,
processes where transitions can occur at any time. These models are defined
by intensities, which govern both the time spent in the current state
and the probabilities of the next state. In discrete-time models,
transitions are known in advance to only occur at multiples of some time
unit, and the model is purely governed by the probability distributions of
the state at the next time point, conditionally on the state at the current
time. These can also be fitted in msm, assuming that there is a
continuous-time process underlying the data. Then the fitted transition
probability matrix over one time period, as returned by pmatrix.msm(...,t=1), is equivalent to the matrix that governs the
discrete-time model. However, these can be fitted more efficiently using
multinomial logistic regression, for example, using multinom from the R package nnet (Venables and Ripley, 2002).
For simple continuous-time multi-state Markov models, the likelihood is
calculated in terms of the transition intensity matrix Q. When the
data consist of observations of the Markov process at arbitrary times, the
exact transition times are not known. Then the likelihood is calculated
using the transition probability matrix P(t) = \exp(tQ), where \exp is the matrix exponential. If state i is observed at time t and state j is observed at time u,
then the contribution to the likelihood from this pair of observations is
the i,j element of P(u - t). See, for example, Kalbfleisch and
Lawless (1985), Kay (1986), or Gentleman et al. (1994).
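The likelihood contribution of a pair of observations can be illustrated numerically. The sketch below is base R only and is not how msm computes P(t): mat_exp is a hypothetical helper that approximates the matrix exponential by a scaled Taylor series, and the intensity values are the illustrative health - disease - death values from the qmatrix argument.

```r
# Hypothetical helper: matrix exponential by scaling and squaring with a
# truncated Taylor series. Adequate for small illustrative matrices only.
mat_exp <- function(A, terms = 30) {
  s <- max(0, ceiling(log2(max(1, norm(A, "I")))))  # number of halvings
  A <- A / 2^s
  result <- diag(nrow(A))
  term <- diag(nrow(A))
  for (k in seq_len(terms)) {
    term <- term %*% A / k        # A^k / k!
    result <- result + term
  }
  for (i in seq_len(s)) result <- result %*% result  # undo the scaling
  result
}

# Intensity matrix Q: off-diagonals as in the qmatrix example,
# diagonal set so that each row sums to zero.
Q <- rbind(c(0, 0.1, 0.01), c(0.1, 0, 0.2), c(0, 0, 0))
diag(Q) <- -rowSums(Q)

# State i = 1 observed at time t = 0, state j = 2 observed at time u = 2.
P <- mat_exp(Q * (2 - 0))   # P(u - t) = exp((u - t) Q)
contribution <- P[1, 2]     # the i,j element of P(u - t)
contribution
rowSums(P)                  # each row of P is a probability distribution
```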
For hidden Markov models, the likelihood for an individual with k
observations is calculated directly by summing over the unknown state at
each time, producing a product of k matrices. The calculation is a
generalisation of the method described by Satten and Longini (1996), and
also by Jackson and Sharples (2002), and Jackson et al. (2003).
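The idea of summing over the unknown state at each time can be sketched as a forward recursion for a simple misclassification model. This is an illustration in base R under stated assumptions, not the msm implementation: mat_exp is a hypothetical Taylor-series helper, and the matrices Q and E, the initial probabilities, and the observations are all made-up values.

```r
# Hypothetical helper: matrix exponential via scaled Taylor series.
mat_exp <- function(A, terms = 30) {
  s <- max(0, ceiling(log2(max(1, norm(A, "I")))))
  A <- A / 2^s
  result <- diag(nrow(A)); term <- diag(nrow(A))
  for (k in seq_len(terms)) { term <- term %*% A / k; result <- result + term }
  for (i in seq_len(s)) result <- result %*% result
  result
}

# Made-up three-state model: intensities Q, misclassification matrix E
# (rows = underlying state, columns = observed state), initial probabilities.
Q <- rbind(c(-0.11, 0.1, 0.01), c(0.1, -0.3, 0.2), c(0, 0, 0))
E <- rbind(c(0.9, 0.1, 0), c(0.1, 0.9, 0), c(0, 0, 1))
init <- c(1, 0, 0)

# Made-up data for one individual: k = 3 observations.
times <- c(0, 1, 2.5)
obs   <- c(1, 2, 2)

# alpha[r] = P(observations so far, underlying state r at the current time)
alpha <- init * E[, obs[1]]
for (k in 2:length(obs)) {
  P <- mat_exp(Q * (times[k] - times[k - 1]))    # transition probabilities
  alpha <- as.vector(alpha %*% P) * E[, obs[k]]  # move forward, then observe
}
lik <- sum(alpha)  # likelihood for this individual
lik
```

Each pass through the loop multiplies by one matrix, which is why the likelihood can be viewed as a product of k matrices rather than a sum over every possible hidden path.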
There must be enough information in the data on each state to estimate each transition rate, otherwise the likelihood will be flat and the maximum will not be found. It may be appropriate to reduce the number of states in the model, the number of allowed transitions, or the number of covariate effects, to ensure convergence. Hidden Markov models, and situations where the value of the process is only known at a series of snapshots, are particularly susceptible to non-identifiability, especially when combined with a complex transition matrix. Choosing an appropriate set of initial values for the optimisation can also be important. For flat likelihoods, 'informative' initial values will often be required. See the PDF manual for other tips.
Value
To obtain summary information from models fitted by the msm function, it is recommended to use extractor functions such as qmatrix.msm, pmatrix.msm, sojourn.msm, msm.form.qoutput. These provide estimates and confidence intervals for quantities such as transition probabilities for given covariate values.
For advanced use, it may be necessary to directly use information stored in the object returned by msm. This is documented in the help page msm.object.
Printing a msm object by typing the object's name at the command line implicitly invokes print.msm. This formats and prints the important information in the model fit, and also returns that information in an R object. This includes estimates and confidence intervals for the transition intensities and (log) hazard ratios for the corresponding covariates. When there is a hidden Markov model, the chief information in the hmodel component is also formatted and printed. This includes estimates and confidence intervals for each parameter.
Author(s)
C. H. Jackson chris.jackson@mrc-bsu.cam.ac.uk
References
Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software, 38(8): 1-29. URL http://www.jstatsoft.org/v38/i08/.
Kalbfleisch, J. and Lawless, J.F. (1985). The analysis of panel data under a Markov assumption. Journal of the American Statistical Association, 80(392): 863-871.
Kay, R. (1986). A Markov model for analysing cancer markers and disease states in survival studies. Biometrics, 42: 855-865.
Gentleman, R.C., Lawless, J.F., Lindsey, J.C. and Yan, P. (1994). Multi-state Markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine, 13(3): 805-821.
Satten, G.A. and Longini, I.M. (1996). Markov chains with measurement error: estimating the 'true' course of a marker of the progression of human immunodeficiency virus disease (with discussion). Applied Statistics, 45(3): 275-309.
Jackson, C.H. and Sharples, L.D. (2002). Hidden Markov models for the onset and progression of bronchiolitis obliterans syndrome in lung transplant recipients. Statistics in Medicine, 21(1): 113-128.
Jackson, C.H., Sharples, L.D., Thompson, S.G., Duffy, S.W. and Couto, E. (2003). Multi-state Markov models for disease progression with classification error. The Statistician, 52(2): 193-209.
Titman, A.C. and Sharples, L.D. (2009). Semi-Markov models with phase-type sojourn distributions. Biometrics, 66: 742-752.
Venables, W.N. and Ripley, B.D. (2002). Modern Applied Statistics with S, second edition. Springer.
See Also
simmulti.msm, plot.msm, summary.msm, qmatrix.msm, pmatrix.msm, sojourn.msm.
Examples
### Heart transplant data
### For further details and background to this example, see
### Jackson (2011) or the PDF manual in the doc directory.
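library(msm)
print(cav[1:10,])
### Allowed transitions and initial intensities; state 4 is absorbing
twoway4.q <- rbind(c(-0.5, 0.25, 0, 0.25),
                   c(0.166, -0.498, 0.166, 0.166),
                   c(0, 0.25, -0.5, 0.25),
                   c(0, 0, 0, 0))
### Counts of transitions between consecutive observed states
statetable.msm(state, PTNUM, data=cav)
### Crude initial values for the transition intensities
crudeinits.msm(state ~ years, PTNUM, data=cav, qmatrix=twoway4.q)
### Fit the model, with entry times into state 4 known exactly
cav.msm <- msm(state ~ years, subject=PTNUM, data=cav,
               qmatrix=twoway4.q, deathexact=4,
               control=list(trace=2, REPORT=1))
cav.msm
qmatrix.msm(cav.msm)        ### fitted transition intensity matrix
pmatrix.msm(cav.msm, t=10)  ### transition probabilities over t = 10
sojourn.msm(cav.msm)        ### sojourn times in the transient states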