reglhmm {eglhmm} | R Documentation |
Simulate data from a hidden generalised linear Markov model.
Description
Takes a specification of a model and simulates data from
that model. The model may be specified in terms of the individual
components of that model (the default method). The components
include a data frame that provides the predictor variables,
and various parameters of the model. For the "eglhmm"
method the model is specified as a fitted model, an object of
class "eglhmm".
Usage
reglhmm(x,...)
## Default S3 method:
reglhmm(x, formula, response, cells=NULL, data=NULL, nobs=NULL,
distr=c("Gaussian","Poisson","Binomial","Dbd","Multinom"),
phi, Rho, sigma, size, ispd=NULL, ntop=NULL, zeta=NULL,
missFrac = 0, fep=NULL,
contrast=c("treatment","sum","helmert"),...)
## S3 method for class 'eglhmm'
reglhmm(x, missFrac = NULL, ...)
Arguments
x |
For the default method, the transition probability matrix of
the hidden Markov chain. For the |
formula |
The formula specifying the generalised linear model from which data
are to be simulated. Note that the predictor variables in
this formula must include a factor It is advisable to use a formula specified in the manner
|
response |
A character vector of length 2, specifying
the names of the responses. Ignored unless |
cells |
A character vector specifying the names of the factors which
determine the “cells” of the model. These factors must be
columns of the data frame |
data |
A data frame containing the predictor variables referred to by
|
nobs |
Integer scalar. The number of observations to be generated in
the setting in which the generalised linear model in question is
vacuous. Ignored if |
distr |
Character string specifying the distribution of the “emissions” from the model, i.e., of the observations. This distribution determines “emission probabilities”. |
phi |
A numeric vector specifying the coefficients of the linear
predictor of the generalised linear model. The length of
|
Rho |
A matrix, a list of two matrices, or a three-dimensional
array specifying the emission probabilities for a multinomial
distribution. Ignored unless |
sigma |
A numeric vector of length equal to the number of states.
Its |
size |
Integer scalar. The number of trials (sample size) from which
the number of “successes” is counted, in the context of
the binomial distribution. (I.e. the |
ispd |
An optional numeric vector specifying the initial state probability
distribution of the model. If |
ntop |
Integer scalar, strictly greater than 1. The maximum possible
value of the db distribution. See |
zeta |
Logical scalar. Should zero origin indexing be used?
I.e. should the range of values of the db distribution be taken to
be |
missFrac |
A non-negative scalar, less than 1. Data will be randomly set
equal to |
fep |
A list of length 1 or 2. The first entry of this
list is a logical scalar. If this is |
contrast |
A character string, one of “treatment”, “helmert” or “sum”,
specifying what contrast (for unordered factors) to use in
constructing the design matrix. (The contrast for ordered factors,
which has no relevance in this context, is left at its default
value of |
... |
Not used. |
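The effect of missFrac can be pictured with a short base R sketch. It does not call reglhmm() and is not the package's implementation. It assumes that the missing value marker is NA and that each observation is masked independently with probability missFrac; both points are assumptions of the sketch, and the vector y is made up for illustration.

```r
set.seed(1)

# Illustrative response vector (not produced by reglhmm()).
y <- rpois(1000, lambda = 4)

# Assumption: each value is masked independently with probability missFrac,
# and the missing value marker is NA.
missFrac <- 0.25
y[runif(length(y)) < missFrac] <- NA

# Observed proportion of missing values; it should be close to missFrac.
propMissing <- mean(is.na(y))
```

Under these assumptions the realised proportion of missing values varies around missFrac from one simulation to the next rather than equalling it exactly.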
Value
A data frame with the same columns as those of data
and an added column, whose name is determined from formula,
containing the simulated response.
Remark
Although this documentation refers to “generalised linear models”, the only such models currently (13/02/2024) available are the Gaussian model with the identity link, the Poisson model with the log link, and the Binomial model with the logit link. The Multinomial model, which is also available, is not exactly a generalised linear model; it might be thought of as an “extended” generalised linear model. Other models may be added at a future date.
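The three links named above relate the linear predictor to the mean of the response as follows. This is a base R sketch for illustration only; the values in eta are made up and nothing here calls the package.

```r
# Illustrative values of the linear predictor.
eta <- c(-1, 0, log(5))

# Identity link (Gaussian): the mean equals the linear predictor.
muGaussian <- eta

# Log link (Poisson): the mean is exp(eta), so it is always positive.
muPoisson <- exp(eta)

# Logit link (Binomial): the success probability is the inverse logit of eta.
pBinomial <- plogis(eta)
```

This is why a Poisson coefficient of log(5) corresponds to a mean of 5 on the scale of the response.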
Author(s)
Rolf Turner rolfturner@posteo.net
References
T. Rolf Turner, Murray A. Cameron, and Peter J. Thomson (1998). Hidden Markov chains in generalized linear models. Canadian Journal of Statistics 26, pp. 107–125, DOI: https://doi.org/10.2307/3315677.
Rolf Turner (2008). Direct maximization of the likelihood of a hidden Markov model. Computational Statistics and Data Analysis 52, pp. 4147–4160, DOI: https://doi.org/10.1016/j.csda.2008.01.029.
See Also
fitted.eglhmm()
bcov()
Examples
loc4 <- c("LngRf","BondiE","BondiOff","MlbrOff")
SCC4 <- SydColCount[SydColCount$locn %in% loc4,]
SCC4$locn <- factor(SCC4$locn) # Get rid of unused levels.
rownames(SCC4) <- 1:nrow(SCC4)
Tpm <- matrix(c(0.91,0.09,0.36,0.64),byrow=TRUE,ncol=2)
Phi <- c(0,log(5),-0.34,0.03,-0.32,0.14,-0.05,-0.14)
# The "state effects" are 1 and 5.
Dat <- SCC4[,1:3]
fmla <- y~0+state+locn+depth
cells <- c("locn","depth")
# The default method.
X <- reglhmm(Tpm,formula=fmla,cells=cells,data=Dat,distr="P",phi=Phi,
missFrac=0.75,contrast="sum")
# The "eglhmm" method.
fit <- eglhmm(y~locn+depth,data=SCC4,cells=cells,K=2,
verb=TRUE,distr="P")
Y <- reglhmm(fit)
# Vacuous generalised linear model.
Z <- reglhmm(Tpm,formula=y~0+state,nobs=300,distr="P",phi=log(c(2,7)))
# The "state effects" are 2 and 7.
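For intuition about the kind of data the vacuous-model example describes, the following base R sketch simulates a two-state hidden Markov chain with Poisson emissions whose means are 2 and 7. It is not the code used by reglhmm(). The helper simChain is hypothetical, and starting the chain from a uniform initial distribution is an assumption made for this sketch.

```r
set.seed(42)

Tpm    <- matrix(c(0.91, 0.09, 0.36, 0.64), byrow = TRUE, ncol = 2)
lambda <- c(2, 7)   # state-specific Poisson means, i.e. exp(log(c(2, 7)))
n      <- 300

# Hypothetical helper: simulate a path of hidden states from a transition
# probability matrix P. The initial state is drawn from init, which is
# uniform by default (an assumption of this sketch).
simChain <- function(n, P, init = rep(1 / nrow(P), nrow(P))) {
  K <- nrow(P)
  s <- integer(n)
  s[1] <- sample.int(K, 1, prob = init)
  for (t in seq_len(n)[-1]) {
    # The next state is drawn from the row of P for the current state.
    s[t] <- sample.int(K, 1, prob = P[s[t - 1], ])
  }
  s
}

state <- simChain(n, Tpm)

# Emissions: each observation is Poisson with the mean of its hidden state.
y   <- rpois(n, lambda[state])
sim <- data.frame(state = factor(state), y = y)
```

The state column is kept here only to make the structure of the simulation visible; in applications the states are hidden.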