R: sarsop

sarsop {sarsop}

R Documentation

sarsop

Description

sarsop wraps the tasks of writing the pomdpx file defining the problem, running the pomdsol (SARSOP) algorithm in C++, and then reading the resulting policy file back into R. The returned alpha vectors and alpha_action information is then transformed into a more generic, user-friendly representation as a matrix whose columns correspond to actions and rows to states. This function can thus be used at the heart of most pomdp applications.

Usage

sarsop(
  transition,
  observation,
  reward,
  discount,
  state_prior = rep(1, dim(observation)[[1]])/dim(observation)[[1]],
  verbose = TRUE,
  log_dir = tempdir(),
  log_data = NULL,
  cache = TRUE,
  ...
)

Arguments

`transition`	Transition matrix, dimension n_s x n_s x n_a
`observation`	Observation matrix, dimension n_s x n_z x n_a
`reward`	reward matrix, dimension n_s x n_a
`discount`	the discount factor
`state_prior`	initial belief state, optional, defaults to uniform over states
`verbose`	logical, should the function include a message with pomdp diagnostics (timings, final precision, end condition)
`log_dir`	pomdpx and policyx files will be saved here, along with a metadata file
`log_data`	a data.frame of additional columns to include in the log, such as model parameters. A unique id value for each run can be provided as one of the columns, otherwise, a globally unique id will be generated.
`cache`	should results from the log directory be cached? Default TRUE. Identical functional calls will quickly return previously cached alpha vectors from file rather than re-running.
`...`	additional arguments to `appl`.

Value

a matrix of alpha vectors. Column index indicates action associated with the alpha vector, (1:n_actions), rows indicate system state, x. Actions for which no alpha vector was found are included as all -Inf, since such actions are not optimal regardless of belief, and thus have no corresponding alpha vectors in alpha_action list.

Examples

 ## Takes > 5s
## Use example code to generate matrices for pomdp problem:
source(system.file("examples/fisheries-ex.R", package = "sarsop"))
alpha <- sarsop(transition, observation, reward, discount, precision = 10)
compute_policy(alpha, transition, observation, reward)

[Package sarsop version 0.6.15 Index]