R: Model object set up for non-homogeneous Markov models

model.nhm {nhm}

R Documentation

Model object set up for non-homogeneous Markov models

Description

Sets up a model object in preparation for fitting a non-homogeneous Markov or misclassification type hidden Markov multi-state model.

Usage

model.nhm(formula, data, subject, covariates = NULL, type, trans,
          nonh = NULL, covm = NULL, centre_time = NULL, emat = NULL,
          ecovm = NULL, firstobs = NULL, initp = NULL, initp_value = NULL,
          initcovm = NULL, splinelist = NULL, degrees = NULL, censor = NULL,
          censor.states = NULL, death = FALSE, death.states = NULL,
          intens = NULL)

Arguments

`formula`	A formula identifying the state and time variables within `data`. For instance, `state ~ time` implies that the variables are `state` and `time`, respectively.
`data`	Data frame containing the observed states, observation times, subject identifiers and covariates. Should include initial observation/recruitment times.
`subject`	Name of the subject identifier variable within the `data` data frame.
`covariates`	A character vector giving the variable names of the covariates to be used in the model.
`type`	Type of intensity model. `'bespoke'`: user-supplied intensity function. `'weibull'`: Weibull transition intensity functions with respect to time. `'gompertz'`: Gompertz/exponential growth intensity models. `'bspline'`: B-spline function of time.
`trans`	Square matrix of viable transitions with dimension equal to the number of states. Impossible transitions should be 0. Others should be labelled consecutively from 1. Labelling transitions with the same value assumes the parameter is shared.
`nonh`	Square matrix to indicate non-homogeneous transitions with dimension equal to the number of states. Impossible transitions or homogeneous transitions should be 0. Otherwise label consecutively from 1. Labelling the same value implies the same non-homogeneity. Not required if `type='bespoke'`. If omitted for any other type, a time-homogeneous model is fitted.
`covm`	Either a named list of nstate x nstate matrices, each indicating the covariate effects with respect to a particular covariate, OR an nstate x nstate x ncov array to indicate covariate effects, where ncov is the length of the supplied `covariates` vector. 0 implies no covariate effect. Otherwise label consecutively from 1. Labelling the same value implies a common covariate effect. Not required if `type='bespoke'`.
`centre_time`	Value by which to centre time for Gompertz models. By default the model is of the form `h(t) = exp(a + bt)`; centring by `c` reparametrizes this to `h(t) = exp(a + b(t - c))`. Centring can improve the convergence of optimization routines.
`emat`	Square matrix of viable misclassification errors. Must be supplied if the model has misclassification. Impossible errors should be 0. Others should be labelled consecutively. Labelling errors with the same value implies a common parameter on the logit scale.
`ecovm`	Either a named list of nstate x nstate matrices, each indicating the covariate effects on misclassification with respect to a particular covariate, OR an nstate x nstate x ncov array to indicate covariate effects on misclassification, where ncov is the length of the supplied `covariates` vector. 0 implies no covariate effect. Otherwise label consecutively from 1. Labelling the same value implies a common covariate effect.
`firstobs`	For misclassification models: form of the first observation for each subject in the data. `'exact'`: initial state not subject to misclassification (default). `'absent'`: no initial state; the first observation is ignored and the state occupied is based on the initial probabilities model. `'misc'`: initial state is subject to misclassification.
`initp`	For misclassification models: numerical vector of length nstate to define the model for the initial probabilities. The first entry should be zero. Entries should be numbered consecutively. Repeating the same number implies a shared parameter. If absent, the initial probabilities are taken from `initp_value`.
`initp_value`	For misclassification models where `firstobs="absent"` or `"misc"`: fixed values of the initial probabilities, used if `initp` is missing. Should be a numerical vector of length nstate. Ignored if `initp` is present. Default if absent is `c(1,0,...)`.
`initcovm`	For misclassification models: either a named list of vectors of length nstate, or an nstate x ncov matrix, to specify the covariate effects on the initial probabilities. 0 implies no covariate effect. Otherwise label consecutively from 1. Labelling the same value implies a common covariate effect.
`splinelist`	For bspline models only: list (of length equal to the number of nonhomogeneous transitions) of knot point locations including the boundary knots.
`degrees`	For bspline models only: optional vector (of length equal to number of nonhomogeneous transitions) of degrees of splines. Defaults to 3 if not specified.
`censor`	Vector of censor state indicators in the data. Note that censored observations can only occur as the last observation for a subject.
`censor.states`	List of vectors giving the states a subject may occupy when censored with the corresponding censor state indicator. Can be a vector if only one censor state marker is present.
`death`	Setting `TRUE` assumes exact death times are present in the data set.
`death.states`	Vector specifying which states have exact death times. Should only correspond to absorbing states.
`intens`	Optional supplied intensity function. See below for details.
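
As an illustration of the labelling conventions for `trans`, `nonh` and `covm` described above, the following base R sketch sets up the matrices for a hypothetical three-state illness-death model (1 = healthy, 2 = ill, 3 = dead). The covariate name `age`, the state interpretation and the helper variables are assumptions made for the example, not part of the package.

```r
# Hypothetical 3-state illness-death model: 1 = healthy, 2 = ill, 3 = dead.

# Viable transitions, labelled consecutively from 1; 0 marks impossible ones.
trans <- rbind(c(0, 1, 2),
               c(0, 0, 3),
               c(0, 0, 0))

# Only the two transitions into death are time-dependent here. They carry
# the same label, so they are assumed to share the same non-homogeneity.
nonh <- rbind(c(0, 0, 1),
              c(0, 0, 1),
              c(0, 0, 0))

# One covariate (assumed to be called "age") with a separate effect on each
# transition, supplied as a named list of nstate x nstate matrices.
covariates <- "age"
covm <- list(age = rbind(c(0, 1, 2),
                         c(0, 0, 3),
                         c(0, 0, 0)))

# Basic consistency checks on the labelling.
n_trans_labels <- length(unique(trans[trans > 0]))        # distinct baseline labels
n_nonh_labels  <- length(unique(nonh[nonh > 0]))          # distinct time-dependence labels
n_cov_labels   <- length(unique(covm$age[covm$age > 0]))  # distinct covariate-effect labels
labels_ok <- all(nonh[trans == 0] == 0) && all(covm$age[trans == 0] == 0)
```

These objects would then be supplied as the `trans`, `nonh`, `covariates` and `covm` arguments, together with `formula`, `data`, `subject` and `type`.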

Details

The function allows the model to be specified and creates the metadata needed to use `nhm` to fit it. Provided the model is of Weibull, Gompertz or B-spline type, the function automatically generates a function `intens`, which defines the generator matrix of the model and its first derivatives as a function of time `t`, covariates `z` and the underlying parameters `x`.
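
To make the idea of a time-dependent generator matrix concrete, here is a minimal sketch for a hypothetical three-state progressive model (1 -> 2 -> 3) with Gompertz intensities of the form `h(t) = exp(a + b(t - c))`. It is not the function that `model.nhm` generates; the helper name `gompertz_q` and all parameter values are invented for illustration. The sketch also checks numerically that centring time only shifts the intercept and leaves the intensities unchanged.

```r
# Generator matrix at time t for a hypothetical 3-state progressive model
# in which each transition has intensity exp(a + b * (t - centre)).
gompertz_q <- function(t, a, b, centre = 0) {
  rates <- exp(a + b * (t - centre))   # one intensity per transition
  q <- matrix(0, 3, 3)
  q[1, 2] <- rates[1]
  q[2, 3] <- rates[2]
  diag(q) <- -rowSums(q)               # rows of a generator matrix sum to zero
  q
}

a <- c(-1.0, -0.5)   # illustrative log-intensities at t = 0
b <- c(0.10, 0.05)   # illustrative slopes on the log scale

q_raw <- gompertz_q(t = 2, a = a, b = b)

# Centring at c is equivalent to shifting the intercept to a + b * c:
# exp(a + b * t) = exp((a + b * c) + b * (t - c)).
centre <- 5
q_centred <- gompertz_q(t = 2, a = a + b * centre, b = b, centre = centre)

same_model <- isTRUE(all.equal(q_raw, q_centred))
```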

Alternatively, `type='bespoke'` can be chosen, in which case the user must supply a function `intens`. This must have arguments `t`, `z`, `x` and return a list with a component `q`, the nstate x nstate generator matrix, and a component `dq`, the nstate x nstate x nparQ array of first derivatives of the generator matrix with respect to the parameters of the model, where nparQ is the number of parameters in the model for the intensities only (excluding parameters for `emat` or `initp`). Because unrestricted maximization is used, the parameters must be able to take any value in (-Inf, Inf). Note that a hard-coded function supplied via `type='bespoke'` can be substantially faster than the analogous automatically generated function, so for large models or datasets it may be advantageous to code the intensities directly.
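
The following is a minimal sketch of what such a user-supplied function might look like, assuming a hypothetical three-state progressive model (1 -> 2 -> 3) without covariates. The model, the parameter ordering and the finite-difference check are illustrative assumptions; only the argument names `t`, `z`, `x` and the returned components `q` and `dq` come from the description above.

```r
# Hypothetical bespoke intensity function:
#   q12(t) = exp(x[1] + x[3] * t)   (Gompertz-type, time-dependent)
#   q23(t) = exp(x[2])              (time-homogeneous)
# All parameters act on the log scale, so they are unrestricted.
intens <- function(t, z, x) {
  nstate <- 3
  npar <- 3
  q12 <- exp(x[1] + x[3] * t)
  q23 <- exp(x[2])

  q <- matrix(0, nstate, nstate)
  q[1, 2] <- q12
  q[2, 3] <- q23
  diag(q) <- -rowSums(q)

  # dq[, , k] holds the derivative of q with respect to x[k].
  dq <- array(0, c(nstate, nstate, npar))
  dq[1, 2, 1] <- q12;      dq[1, 1, 1] <- -q12
  dq[2, 3, 2] <- q23;      dq[2, 2, 2] <- -q23
  dq[1, 2, 3] <- t * q12;  dq[1, 1, 3] <- -t * q12

  list(q = q, dq = dq)
}

# Check the analytic derivatives against central finite differences.
x0 <- c(-1, -0.5, 0.1)
out <- intens(t = 2, z = NULL, x = x0)
eps <- 1e-6
num_dq <- sapply(seq_along(x0), function(k) {
  step <- replace(numeric(3), k, eps)
  (intens(2, NULL, x0 + step)$q - intens(2, NULL, x0 - step)$q) / (2 * eps)
}, simplify = "array")
max_err <- max(abs(num_dq - out$dq))
```

Comparing `dq` with a numerical derivative in this way is a cheap safeguard before fitting, since an error in a hand-coded derivative can be hard to diagnose from the optimizer's behaviour alone.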

For misclassification type models, the function also automatically creates functions `emat_nhm` and `initp_nhm`, which allow the misclassification probability matrix and the initial probability vector, together with their derivatives, to be calculated at given parameter and covariate values. In each case, a multinomial logistic regression is used for the covariate model. User specification of the misclassification probability function or initial probability vector is not currently possible.
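
As a rough illustration of how an `emat` labelling can be turned into misclassification probabilities through a multinomial logistic form, consider the sketch below. The exact parameterization used internally by `emat_nhm` is not given here, so the mapping is an assumption: a generic baseline-category logit without covariates, with correct classification as the reference category. The helper `misc_probs` and the parameter values are hypothetical.

```r
# Hypothetical 3-state model in which the transient states 1 and 2 can be
# mistaken for each other; state 3 (e.g. death) is observed without error.
emat <- rbind(c(0, 1, 0),
              c(2, 0, 0),
              c(0, 0, 0))

# Generic baseline-category (multinomial) logit: for each true state (row),
# correct classification is the reference category and each viable error
# has linear predictor par[label]. This is an assumed parameterization.
misc_probs <- function(emat, par) {
  e <- matrix(0, nrow(emat), ncol(emat))
  e[emat > 0] <- exp(par[emat[emat > 0]])
  denom <- 1 + rowSums(e)
  e <- e / denom              # recycling divides row i by denom[i]
  diag(e) <- 1 / denom        # probability of correct classification
  e
}

E <- misc_probs(emat, par = c(-2, -3))
row_totals <- rowSums(E)      # each row is a probability distribution
```

With a single viable error per row, as here, the multinomial form reduces to an ordinary logistic transform of each parameter.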

Value

Returns an object of class `nhm_model` containing the metadata needed to use `nhm` to fit the model.

Author(s)

Andrew Titman a.titman@lancaster.ac.uk