R: Fit HMMs to multiple imputation data

MIfitHMM {momentuHMM}

R Documentation

Fit HMMs to multiple imputation data

Description

Fit a (multivariate) hidden Markov model to multiple imputation data. Multiple imputation is a method for accommodating missing data, temporal-irregularity, or location measurement error in hidden Markov models, where pooled parameter estimates reflect uncertainty attributable to observation error.

Usage

MIfitHMM(miData, ...)

## Default S3 method:
MIfitHMM(
  miData,
  nSims,
  ncores = 1,
  poolEstimates = TRUE,
  alpha = 0.95,
  na.rm = FALSE,
  nbStates,
  dist,
  Par0,
  beta0 = NULL,
  delta0 = NULL,
  estAngleMean = NULL,
  circularAngleMean = NULL,
  formula = ~1,
  formulaDelta = NULL,
  stationary = FALSE,
  mixtures = 1,
  formulaPi = NULL,
  nlmPar = NULL,
  fit = TRUE,
  useInitial = FALSE,
  DM = NULL,
  userBounds = NULL,
  workBounds = NULL,
  betaCons = NULL,
  betaRef = NULL,
  deltaCons = NULL,
  mvnCoords = NULL,
  stateNames = NULL,
  knownStates = NULL,
  fixPar = NULL,
  retryFits = 0,
  retrySD = NULL,
  optMethod = "nlm",
  control = list(),
  prior = NULL,
  modelName = NULL,
  covNames = NULL,
  spatialCovs = NULL,
  centers = NULL,
  centroids = NULL,
  angleCovs = NULL,
  altCoordNames = NULL,
  method = "IS",
  parIS = 1000,
  dfSim = Inf,
  grid.eps = 1,
  crit = 2.5,
  scaleSim = 1,
  quad.ask = FALSE,
  force.quad = TRUE,
  fullPost = TRUE,
  dfPostIS = Inf,
  scalePostIS = 1,
  thetaSamp = NULL,
  ...
)

## S3 method for class 'hierarchical'
MIfitHMM(
  miData,
  nSims,
  ncores = 1,
  poolEstimates = TRUE,
  alpha = 0.95,
  na.rm = FALSE,
  hierStates,
  hierDist,
  Par0,
  hierBeta = NULL,
  hierDelta = NULL,
  estAngleMean = NULL,
  circularAngleMean = NULL,
  hierFormula = NULL,
  hierFormulaDelta = NULL,
  mixtures = 1,
  formulaPi = NULL,
  nlmPar = NULL,
  fit = TRUE,
  useInitial = FALSE,
  DM = NULL,
  userBounds = NULL,
  workBounds = NULL,
  betaCons = NULL,
  deltaCons = NULL,
  mvnCoords = NULL,
  knownStates = NULL,
  fixPar = NULL,
  retryFits = 0,
  retrySD = NULL,
  optMethod = "nlm",
  control = list(),
  prior = NULL,
  modelName = NULL,
  covNames = NULL,
  spatialCovs = NULL,
  centers = NULL,
  centroids = NULL,
  angleCovs = NULL,
  altCoordNames = NULL,
  method = "IS",
  parIS = 1000,
  dfSim = Inf,
  grid.eps = 1,
  crit = 2.5,
  scaleSim = 1,
  quad.ask = FALSE,
  force.quad = TRUE,
  fullPost = TRUE,
  dfPostIS = Inf,
  scalePostIS = 1,
  thetaSamp = NULL,
  ...
)

Arguments

`miData`	A `crwData` object, a `crwHierData` object, a `crwSim` object, a `crwHierSim` object, a list of `momentuHMMData` objects, or a list of `momentuHierHMMData` objects.
`...`	further arguments passed to or from other methods
`nSims`	Number of imputations in which to fit the HMM using `fitHMM`. If `miData` is a list of `momentuHMMData` objects, `nSims` cannot exceed the length of `miData`.
`ncores`	Number of cores to use for parallel processing. Default: 1 (no parallel processing).
`poolEstimates`	Logical indicating whether or not to calculate pooled parameter estimates across the `nSims` imputations using `MIpool`. Default: `TRUE`.
`alpha`	Significance level for calculating confidence intervals of pooled estimates when `poolEstimates=TRUE` (see `MIpool`). Default: 0.95.
`na.rm`	Logical indicating whether or not to exclude model fits with `NA` parameter estimates or standard errors from pooling when `poolEstimates=TRUE` (see `MIpool`). Default: FALSE.
`nbStates`	Number of states of the HMM. See `fitHMM`.
`dist`	A named list indicating the probability distributions of the data streams. See `fitHMM`.
`Par0`	A named list containing vectors of initial state-dependent probability distribution parameters for each data stream specified in `dist`. See `fitHMM`. `Par0` may also be a list of length `nSims`, where each element is a named list containing vectors of initial state-dependent probability distribution parameters for each imputation. Note that if `useInitial=TRUE` then `Par0` is ignored after the first imputation.
`beta0`	Initial matrix of regression coefficients for the transition probabilities. See `fitHMM`. `beta0` may also be a list of length `nSims`, where each element is an initial matrix of regression coefficients for the transition probabilities for each imputation.
`delta0`	Initial values for the initial distribution of the HMM. See `fitHMM`. `delta0` may also be a list of length `nSims`, where each element is the initial values for the initial distribution of the HMM for each imputation.
`estAngleMean`	An optional named list indicating whether or not to estimate the angle mean for data streams with angular distributions ('vm' and 'wrpcauchy'). See `fitHMM`.
`circularAngleMean`	An optional named list indicating whether to use circular-linear or circular-circular regression on the mean of circular distributions ('vm' and 'wrpcauchy') for turning angles. See `fitHMM`.
`formula`	Regression formula for the transition probability covariates. See `fitHMM`.
`formulaDelta`	Regression formula for the initial distribution. See `fitHMM`.
`stationary`	`FALSE` if there are time-varying covariates in `formula` or any covariates in `formulaDelta`. If `TRUE`, the initial distribution is considered equal to the stationary distribution. See `fitHMM`.
`mixtures`	Number of mixtures for the state transition probabilities (i.e. discrete random effects sensu DeRuiter et al. 2017). Default: `mixtures=1`.
`formulaPi`	Regression formula for the mixture distribution probabilities. See `fitHMM`.
`nlmPar`	List of parameters to pass to the optimization function `nlm` (which should be either `print.level`, `gradtol`, `stepmax`, `steptol`, `iterlim`, or `hessian` – see `nlm`'s documentation for more detail). For `print.level`, the default value of 0 means that no printing occurs, a value of 1 means that the first and last iterations of the optimization are detailed, and a value of 2 means that each iteration of the optimization is detailed. Ignored unless `optMethod="nlm"`.
`fit`	`TRUE` if the HMM should be fitted to the data, `FALSE` otherwise. See `fitHMM`. If `fit=FALSE` and `miData` is a `crwData` object, then `MIfitHMM` returns a list containing a `momentuHMMData` object (if `nSims=1`) or, if `nSims>1`, a `crwSim` object.
`useInitial`	Logical indicating whether or not to use parameter estimates for the first model fit as initial values for all subsequent model fits. If `ncores>1` then the first model is fit on a single core and then used as the initial values for all subsequent model fits on each core (in this case, the progress of the initial model fit can be followed using the `print.level` option in `nlmPar`). Default: FALSE. Ignored if `nSims<2`.
`DM`	An optional named list indicating the design matrices to be used for the probability distribution parameters of each data stream. See `fitHMM`.
`userBounds`	An optional named list of 2-column matrices specifying bounds on the natural (i.e, real) scale of the probability distribution parameters for each data stream. See `fitHMM`.
`workBounds`	An optional named list of 2-column matrices specifying bounds on the working scale of the probability distribution, transition probability, and initial distribution parameters. See `fitHMM`.
`betaCons`	Matrix of the same dimension as `beta0` composed of integers identifying any equality constraints among the t.p.m. parameters. See `fitHMM`.
`betaRef`	Numeric vector of length `nbStates` indicating the reference elements for the t.p.m. multinomial logit link. See `fitHMM`.
`deltaCons`	Matrix of the same dimension as `delta0` composed of integers identifying any equality constraints among the initial distribution working scale parameters. Ignored unless a formula is provided in `formulaDelta`. See `fitHMM`.
`mvnCoords`	Character string indicating the name of location data that are to be modeled using a multivariate normal distribution. For example, if `mu="mvnorm2"` was included in `dist` and (mu.x, mu.y) are location data, then `mvnCoords="mu"` needs to be specified in order for these data to be properly treated as locations in functions such as `plot.momentuHMM`, `plot.miSum`, `plot.miHMM`, `plotSpatialCov`, and `MIpool`.
`stateNames`	Optional character vector of length nbStates indicating state names.
`knownStates`	Vector of values of the state process which are known prior to fitting the model (if any). See `fitHMM`. If `miData` is a list of `momentuHMMData` objects, then `knownStates` can alternatively be a list of vectors containing the known values for the state process for each element of `miData`.
`fixPar`	An optional list of vectors indicating parameters which are assumed known prior to fitting the model. See `fitHMM`.
`retryFits`	Non-negative integer indicating the number of times to attempt to iteratively fit the model using random perturbations of the current parameter estimates as the initial values for likelihood optimization. See `fitHMM`.
`retrySD`	An optional list of scalars or vectors indicating the standard deviation to use for normal perturbations of each working scale parameter when `retryFits>0`. See `fitHMM`.
`optMethod`	The optimization method to be used. Can be “nlm” (the default; see `nlm`), “Nelder-Mead” (see `optim`), or “SANN” (see `optim`).
`control`	A list of control parameters to be passed to `optim` (ignored unless `optMethod="Nelder-Mead"` or `optMethod="SANN"`).
`prior`	A function that returns the log-density of the working scale parameter prior distribution(s). See `fitHMM`.
`modelName`	An optional character string providing a name for the fitted model. If provided, `modelName` will be returned in `print.momentuHMM`, `AIC.momentuHMM`, `AICweights`, and other functions.
`covNames`	Names of any covariates in `miData$crwPredict` (if `miData` is a `crwData` object; otherwise `covNames` is ignored). See `prepData`. If `miData` is a `crwData` object, any covariate in `miData$crwPredict` that is used in `formula`, `formulaDelta`, `formulaPi`, or `DM` must be included in `covNames`.
`spatialCovs`	List of raster layer(s) for any spatial covariates. See `prepData`.
`centers`	2-column matrix providing the x-coordinates (column 1) and y-coordinates (column 2) for any activity centers (e.g., potential centers of attraction or repulsion) from which distance and angle covariates will be calculated based on realizations of the position process. See `prepData`. Ignored unless `miData` is a `crwData` object.
`centroids`	List where each element is a data frame containing the x-coordinates ('x'), y-coordinates ('y'), and times (with user-specified name, e.g., ‘time’) for centroids (i.e., dynamic activity centers where the coordinates can change over time) from which distance and angle covariates will be calculated based on the location data. See `prepData`. Ignored unless `miData` is a `crwData` object.
`angleCovs`	Character vector indicating the names of any circular-circular regression angular covariates in `miData$crwPredict` that need conversion from standard direction (in radians relative to the x-axis) to turning angle (relative to previous movement direction) See `prepData`. Ignored unless `miData` is a `crwData` or `crwHierData` object.
`altCoordNames`	Character string indicating an alternative name for the returned location data. See `prepData`. Ignored unless `miData` is a `crwData` or `crwHierData` object.
`method`	Method for obtaining weights for movement parameter samples. See `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`parIS`	Size of the parameter importance sample. See `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`dfSim`	Degrees of freedom for the t approximation to the parameter posterior. See 'df' argument in `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`grid.eps`	Grid size for `method="quadrature"`. See `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`crit`	Criterion for deciding "significance" of quadrature points (difference in log-likelihood). See `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`scaleSim`	Scale multiplier for the covariance matrix of the t approximation. See 'scale' argument in `crwSimulator`. Ignored unless `miData` is a `crwData` object.
`quad.ask`	Logical, for method='quadrature'. Whether or not the sampler should ask if quadrature sampling should take place. It is used to stop the sampling if the number of likelihood evaluations would be extreme. Default: FALSE. Ignored if `ncores>1`.
`force.quad`	A logical indicating whether or not to force the execution of the quadrature method for large parameter vectors. See `crwSimulator`. Default: TRUE. Ignored unless `miData` is a `crwData` object and method=``quadrature''.
`fullPost`	Logical indicating whether to draw parameter values as well to simulate full posterior. See `crwPostIS`. Ignored unless `miData` is a `crwData` object.
`dfPostIS`	Degrees of freedom for multivariate t distribution approximation to parameter posterior. See 'df' argument in `crwPostIS`. Ignored unless `miData` is a `crwData` object.
`scalePostIS`	Extra scaling factor for t distribution approximation. See 'scale' argument in `crwPostIS`. Ignored unless `miData` is a `crwData` object.
`thetaSamp`	If multiple parameter samples are available in `crwSimulator` objects, setting `thetaSamp=n` will use the nth sample. Defaults to the last. See `crwSimulator` and `crwPostIS`. Ignored unless `miData` is a `crwData` object.
`hierStates`	A hierarchical model structure `Node` for the states. See `fitHMM`.
`hierDist`	A hierarchical data structure `Node` for the data streams. See `fitHMM`.
`hierBeta`	A hierarchical data structure `Node` for the matrix of initial values for the regression coefficients of the transition probabilities at each level of the hierarchy ('beta'). See `fitHMM`.
`hierDelta`	A hierarchical data structure `Node` for the matrix of initial values for the regression coefficients of the initial distribution at each level of the hierarchy ('delta'). See `fitHMM`.
`hierFormula`	A hierarchical formula structure for the transition probability covariates for each level of the hierarchy. See `fitHMM`.
`hierFormulaDelta`	A hierarchical formula structure for the initial distribution covariates for each level of the hierarchy ('formulaDelta'). Default: `NULL` (no covariate effects and `fixPar$delta` is specified on the working scale). See `fitHMM`.

Details

miData can either be a crwData or crwHierData object (as returned by crawlWrap), a crwSim or crwHierSim object (as returned by MIfitHMM when fit=FALSE), or a list of momentuHMMData or momentuHierHMMData objects (e.g., each element of the list as returned by prepData).

If miData is a crwData (or crwHierData) object, MIfitHMM uses a combination of crwSimulator, crwPostIS, prepData, and fitHMM to draw nSims realizations of the position process and fit the specified HMM to each imputation of the data. The vast majority of MIfitHMM arguments are identical to the corresponding arguments from these functions.

If miData is a crwData or crwHierData object, nSims determines both the number of realizations of the position process to draw (using crwSimulator and crwPostIS) as well as the number of HMM fits.

If miData is a crwSim (or crwHierSim) object or a list of momentuHMMData (or momentuHierHMMData) object(s), the specified HMM will simply be fitted to each of the momentuHMMData (or momentuHierHMMData) objects and all arguments related to crwSimulator, crwPostIS, or prepData are ignored.

Value

If nSims>1, poolEstimates=TRUE, and fit=TRUE, a miHMM object, i.e., a list consisting of:

`miSum`	`miSum` object returned by `MIpool`.
`HMMfits`	List of length `nSims` comprised of `momentuHMM` objects.

If poolEstimates=FALSE and fit=TRUE, a list of length nSims consisting of momentuHMM objects is returned.

However, if fit=FALSE and miData is a crwData object, then MIfitHMM returns a crwSim object, i.e., a list containing miData (a list of momentuHMMData objects) and crwSimulator (a list of crwSimulator objects),and most other arguments related to fitHMM are ignored.

References

Hooten M.B., Johnson D.S., McClintock B.T., Morales J.M. 2017. Animal Movement: Statistical Models for Telemetry Data. CRC Press, Boca Raton.

McClintock B.T. 2017. Incorporating telemetry error into hidden Markov movement models using multiple imputation. Journal of Agricultural, Biological, and Environmental Statistics.

Examples


# Don't run because it takes too long on a single core
## Not run: 
# extract simulated obsData from example data
obsData <- miExample$obsData

# error ellipse model
err.model <- list(x= ~ ln.sd.x - 1, y =  ~ ln.sd.y - 1, rho =  ~ error.corr)

# create crwData object by fitting crwMLE models to obsData and predict locations 
# at default intervals for both individuals
crwOut <- crawlWrap(obsData=obsData,
         theta=c(4,0),fixPar=c(1,1,NA,NA),
         err.model=err.model)

# HMM specifications
nbStates <- 2
stepDist <- "gamma"
angleDist <- "vm"
mu0 <- c(20,70)
sigma0 <- c(10,30)
kappa0 <- c(1,1)
stepPar0 <- c(mu0,sigma0)
anglePar0 <- c(-pi/2,pi/2,kappa0)
formula <- ~cov1+cos(cov2)
nbCovs <- 2
beta0 <- matrix(c(rep(-1.5,nbStates*(nbStates-1)),rep(0,nbStates*(nbStates-1)*nbCovs)),
                nrow=nbCovs+1,byrow=TRUE)

# first fit HMM to best predicted position process
bestData<-prepData(crwOut,covNames=c("cov1","cov2"))
bestFit<-fitHMM(bestData,
                nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
                Par0=list(step=stepPar0,angle=anglePar0),beta0=beta0,
                formula=formula,estAngleMean=list(angle=TRUE))
            
print(bestFit)

# extract estimates from 'bestFit'
bPar0 <- getPar(bestFit)

# Fit nSims=5 imputations of the position process
miFits<-MIfitHMM(miData=crwOut,nSims=5,
                  nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
                  Par0=bPar0$Par,beta0=bPar0$beta,delta0=bPar0$delta,
                  formula=formula,estAngleMean=list(angle=TRUE),
                  covNames=c("cov1","cov2"))

# print pooled estimates
print(miFits)

## End(Not run)

[Package momentuHMM version 1.5.5 Index]