policy_graph {pomdp}    R Documentation
POMDP Policy Graphs
Description
This function creates a POMDP policy graph for a converged POMDP solution or a policy tree for a finite-horizon solution. The graph is represented as an igraph object.
Usage
policy_graph(
  x,
  belief = NULL,
  show_belief = FALSE,
  state_col = NULL,
  simplify_observations = FALSE,
  remove_unreachable_nodes = FALSE,
  ...
)
Arguments
x
object of class POMDP containing a solved and converged POMDP problem.

belief
the initial belief is used to mark the initial belief state in the graph of a converged solution and to identify the root node in a policy graph for a finite-horizon solution. If NULL, the initial belief specified in the model is used.

show_belief
logical; show estimated belief proportions as a pie chart or color in each node?

state_col
colors used to represent the belief over the states in each node. Only used if show_belief is TRUE.

simplify_observations
combine parallel observation arcs into a single arc.

remove_unreachable_nodes
logical; remove nodes that are not reachable from the start state? Currently only implemented for policy trees for unconverged finite-horizon POMDPs.

...
further parameters are passed on.
Details
Each policy graph node is represented by an alpha vector specifying a hyperplane segment. The convex hull of the set of hyperplanes represents the value function. The policy specifies for each node an optimal action, which is printed together with the node ID inside the node. The arcs are labeled with observations. Infinite-horizon converged solutions result in a single policy graph. For finite-horizon solutions, a policy tree is produced. The levels of the tree and the first number in the node label represent the epochs.
The parameters show_belief, remove_unreachable_nodes, and simplify_observations are used by plot_policy_graph() (see there for details) to reduce clutter and make the visualization more readable. These options are disabled by default for policy_graph().
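For example, node beliefs and simplified observation arcs can also be requested directly from policy_graph(). This is a minimal sketch; it assumes a solved POMDP object sol as created in the Examples below:

## enable the simplifications used for plotting directly in policy_graph()
pg <- policy_graph(sol, show_belief = TRUE, simplify_observations = TRUE)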
Value
returns the policy graph as an igraph object.
See Also
Other policy: estimate_belief_for_nodes(), optimal_action(), plot_belief_space(), plot_policy_graph(), policy(), projection(), reward(), solve_POMDP(), solve_SARSOP(), value_function()
Examples
data("Tiger")
### Policy graphs for converged solutions
sol <- solve_POMDP(model = Tiger)
sol
policy_graph(sol)
## visualization
plot_policy_graph(sol)
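## The returned object is a regular igraph graph and can be inspected with
## igraph functions. This is a sketch; the exact attribute names stored by
## pomdp are an assumption and may differ between package versions.
library("igraph")
pg <- policy_graph(sol)
vertex_attr_names(pg)   # attributes stored for the nodes
edge_attr_names(pg)     # attributes stored for the observation arcs
V(pg)$label             # node labels: node id and optimal action (assumed)
E(pg)$label             # arc labels: observations (assumed)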
### Policy trees for finite-horizon solutions
sol <- solve_POMDP(model = Tiger, horizon = 4, method = "incprune")
policy_graph(sol)
plot_policy_graph(sol)
# Note: the first number in the node id is the epoch.
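# A sketch of the option discussed in Details for finite-horizon trees:
# unreachable nodes can be dropped to reduce clutter (only implemented for
# policy trees of unconverged finite-horizon POMDPs).
policy_graph(sol, remove_unreachable_nodes = TRUE)
plot_policy_graph(sol, remove_unreachable_nodes = TRUE)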