solve_SARSOP {pomdp}		R Documentation

Solve a POMDP Problem using SARSOP

Description

This function uses the C++ implementation of the SARSOP algorithm by Kurniawati, Hsu and Lee (2008), interfaced through package sarsop, to solve infinite-horizon problems formulated as partially observable Markov decision processes (POMDPs). The result is an optimal or approximately optimal policy.

Usage

solve_SARSOP(
  model,
  horizon = Inf,
  discount = NULL,
  terminal_values = NULL,
  method = "sarsop",
  digits = 7,
  parameter = NULL,
  verbose = FALSE
)

Arguments

model

a POMDP problem specification created with POMDP(). Alternatively, a POMDP file or the URL for a POMDP file can be specified.

horizon

SARSOP only solves infinite-horizon problems, so the only supported value is Inf.

discount

discount factor in range [0, 1]. If NULL, then the discount factor specified in model will be used.

terminal_values

NULL. SARSOP does not use terminal values.

method

string; currently the only available method is "sarsop".

digits

precision used when writing POMDP files (see write_POMDP()).

parameter

a list with parameters passed on to the function sarsop::pomdpsol() in package sarsop.

verbose

logical; if TRUE, the output of the solver is shown in the R console.
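
As a sketch of how the parameter argument is used: options in the list are passed on to sarsop::pomdpsol(). The timeout option is taken from the example below; precision is assumed here to be another option of sarsop::pomdpsol() (check ?sarsop::pomdpsol for the options it actually accepts).

```r
# Hedged sketch: forward solver options to sarsop::pomdpsol().
# 'timeout' (seconds) appears in the Examples section below;
# 'precision' is assumed to be supported -- verify with ?sarsop::pomdpsol.
sol <- solve_SARSOP(model = Tiger,
                    parameter = list(timeout = 10, precision = 0.001))
```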

Value

The solver returns an object of class POMDP which is a list with the model specifications ('model'), the solution ('solution'), and the solver output ('solver_output').
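For illustration, the three list components named above can be accessed directly on the returned object:

```r
sol <- solve_SARSOP(model = Tiger)
sol$model          # the POMDP problem specification
sol$solution       # the computed solution (policy)
sol$solver_output  # raw output produced by the SARSOP solver
```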

Author(s)

Michael Hahsler

References

Carl Boettiger, Jeroen Ooms and Milad Memarzadeh (2020). sarsop: Approximate POMDP Planning Software. R package version 0.6.6. https://CRAN.R-project.org/package=sarsop

H. Kurniawati, D. Hsu, and W.S. Lee (2008). SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Proc. Robotics: Science and Systems.

See Also

Other policy: estimate_belief_for_nodes(), optimal_action(), plot_belief_space(), plot_policy_graph(), policy(), policy_graph(), projection(), reward(), solve_POMDP(), value_function()

Other solver: solve_MDP(), solve_POMDP()

Other POMDP: MDP2POMDP, POMDP(), accessors, actions(), add_policy(), plot_belief_space(), projection(), reachable_and_absorbing, regret(), sample_belief_space(), simulate_POMDP(), solve_POMDP(), transition_graph(), update_belief(), value_function(), write_POMDP()

Examples

## Not run: 
# Solving the simple infinite-horizon Tiger problem with SARSOP
# You need to install package "sarsop"
data("Tiger")
Tiger

sol <- solve_SARSOP(model = Tiger)
sol

# look at solver output
sol$solver_output

# policy: value function (alpha vectors) with the optimal action and
# observation-dependent transitions
policy(sol)

# value function
plot_value_function(sol, ylim = c(0,20))

# plot the policy graph
plot_policy_graph(sol)

# reward of the optimal policy
reward(sol)

# Solve a problem specified as a POMDP file. The timeout is set to 10 seconds.
sol <- solve_SARSOP("http://www.pomdp.org/examples/cheese.95.POMDP", parameter = list(timeout = 10))
sol

## End(Not run)


[Package pomdp version 1.2.3 Index]