mdp_example_forest {MDPtoolbox}
Generates an MDP for a simple forest management problem
Description
Generates a simple MDP example of a forest management problem.
Usage
mdp_example_forest(S, r1, r2, p)
Arguments
S: (optional) number of states. S is an integer greater than 0. By default, S is set to 3.
r1: (optional) reward when the forest is in its oldest state and action Wait is performed. r1 is a real greater than 0. By default, r1 is set to 4.
r2: (optional) reward when the forest is in its oldest state and action Cut is performed. r2 is a real greater than 0. By default, r2 is set to 2.
p: (optional) probability of wildfire occurrence. p is a real in ]0, 1[, i.e. strictly between 0 and 1. By default, p is set to 0.1.
Details
mdp_example_forest generates a transition probability (SxSxA) array P and a reward (SxA) matrix R that model the following problem. A forest is managed by two actions: Wait and Cut. An action is decided each year, with the first objective being to maintain an old forest for wildlife and the second to make money by selling cut wood. Each year there is a probability p that a fire burns the forest.
Here is how the problem is modelled. Let 1, ..., S be the states of the forest, the Sth state being the oldest. Let Wait be action 1 and Cut action 2. After a fire, the forest is in the youngest state, that is, state 1.
The transition matrix P of the problem can then be defined as follows:
P(,,1) = \left[
\begin{array}{llllll}
p & 1-p & 0 & \ldots & \ldots & 0 \\
p & 0 & 1-p & 0 & \ldots & 0 \\
\vdots & \vdots & \ddots & \ddots & \ddots & \vdots \\
\vdots & \vdots & & \ddots & \ddots & 0 \\
\vdots & \vdots & & & 0 & 1-p \\
p & 0 & 0 & \ldots & 0 & 1-p \\
\end{array}
\right]
P(,,2) = \left[
\begin{array}{lllll}
1 & 0 & \ldots & \ldots & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
1 & 0 & \ldots & \ldots & 0 \\
\end{array}
\right]
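This structure can be checked numerically. The sketch below assumes the default parameters (S = 3, p = 0.1); the variable name fm is illustrative, not part of the package.

library(MDPtoolbox)

# Generate the default example and inspect the two transition matrices
fm <- mdp_example_forest()
fm$P[, , 1]   # Wait: probability p of burning back to state 1, 1-p of ageing
fm$P[, , 2]   # Cut: the forest always returns to state 1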
The reward matrix R is defined as follows:
R(,1) = \left[
\begin{array}{l}
0 \\
\vdots \\
\vdots \\
\vdots \\
0 \\
r1 \\
\end{array}
\right]
R(,2) = \left[
\begin{array}{l}
0 \\
1 \\
\vdots \\
\vdots \\
1 \\
r2 \\
\end{array}
\right]
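Likewise for the rewards. With the defaults (S = 3, r1 = 4, r2 = 2), the two columns of R follow directly from the matrices above (fm is again an illustrative name):

library(MDPtoolbox)

fm <- mdp_example_forest()
fm$R
#      [,1] [,2]
# [1,]    0    0   # Wait / Cut in the youngest state
# [2,]    0    1
# [3,]    4    2   # r1 for Wait, r2 for Cut in the oldest state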
Value
P: transition probability array. P is a [S,S,A] array.
R: reward matrix. R is a [S,A] matrix.
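The dimensions of the returned components can be verified directly; a minimal sketch with an illustrative S = 4:

library(MDPtoolbox)

fm <- mdp_example_forest(S = 4)
dim(fm$P)   # 4 4 2, i.e. [S, S, A] with A = 2 actions
dim(fm$R)   # 4 2,   i.e. [S, A]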
Examples
mdp_example_forest()
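A slightly fuller sketch, solving the generated MDP with mdp_policy_iteration (also from MDPtoolbox); the parameter values below are illustrative, not defaults of this function:

library(MDPtoolbox)

# Non-default forest: 5 states, rare fires
fm <- mdp_example_forest(S = 5, r1 = 10, r2 = 3, p = 0.05)
mdp_check(fm$P, fm$R)   # an empty string means the MDP description is valid
sol <- mdp_policy_iteration(fm$P, fm$R, discount = 0.9)
sol$policy              # optimal action per state (1 = Wait, 2 = Cut)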