hindcast_pomdp {sarsop}    R Documentation
hindcast_pomdp
Description
Compare historical actions to what the POMDP recommendation would have been.
Usage
hindcast_pomdp(
  transition,
  observation,
  reward,
  discount,
  obs,
  action,
  state_prior = rep(1, dim(observation)[[1]])/dim(observation)[[1]],
  alpha = NULL,
  ...
)
Arguments
transition: Transition matrix, dimension n_s x n_s x n_a
observation: Observation matrix, dimension n_s x n_z x n_a
reward: Reward matrix, dimension n_s x n_a
discount: The discount factor
obs: A given sequence of observations
action: The corresponding sequence of actions
state_prior: Initial belief state; optional, defaults to uniform over states
alpha: The matrix of alpha vectors returned by sarsop()
...: Additional arguments to sarsop()
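As a point of reference, here is a minimal sketch of hand-built inputs with the dimensions documented above, assuming a toy problem with n_s = 2 states, n_z = 2 observations, and n_a = 2 actions (all numbers are illustrative, not from the package):

n_s <- 2; n_z <- 2; n_a <- 2

## transition: n_s x n_s x n_a; transition[i, j, k] = P(state i -> state j under action k)
transition <- array(NA, c(n_s, n_s, n_a))
transition[, , 1] <- rbind(c(0.9, 0.1),
                           c(0.2, 0.8))
transition[, , 2] <- rbind(c(0.5, 0.5),
                           c(0.5, 0.5))

## observation: n_s x n_z x n_a; observation[j, z, k] = P(observe z in state j under action k)
observation <- array(NA, c(n_s, n_z, n_a))
observation[, , 1] <- diag(n_s)               # fully informative observations
observation[, , 2] <- matrix(1/n_z, n_s, n_z) # uninformative observations

## reward: n_s x n_a; reward[i, k] = immediate reward for action k in state i
reward <- rbind(c(1, 0),
                c(0, 1))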
Value
A list containing: a data frame with columns for time, obs, action, and the optimal action; and an array giving the posterior belief distribution over states at each time t.
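Given a result sim as in the example below, the pieces could be inspected along these lines (the element names df and state_posterior are assumed here for illustration; check names(sim) on your installed version):

names(sim)           # list element names (assumed: "df", "state_posterior")
sim$df               # time, obs, action, and optimal action per step
sim$state_posterior  # posterior belief over states at each time t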
Examples
m <- fisheries_matrices()
## Takes > 5s
if(assert_has_appl()){
  ## Solve for the alpha vectors of the fisheries POMDP
  alpha <- sarsop(m$transition, m$observation, m$reward, 0.95, precision = 10)
  ## Compare a historical observation/action sequence to the optimal policy
  sim <- hindcast_pomdp(m$transition, m$observation, m$reward, 0.95,
                        obs = rnorm(21, 15, .1), action = rep(1, 21),
                        alpha = alpha)
}
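Continuing the example, one natural summary is the fraction of time steps where the historical action agreed with the POMDP recommendation; the element and column names used here (df, action, optimal) are assumed for illustration:

## Fraction of steps where the historical action matched the recommendation
## (element/column names assumed; see names(sim) and names(sim$df))
mean(sim$df$action == sim$df$optimal)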