# createMDP

Create Markov decision process model

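## Description

`MDP = createMDP(states,actions)` creates a Markov decision process model with the specified states and actions.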

## Examples

### Create MDP Model

Create an MDP model with eight states and two possible actions.

    MDP = createMDP(8,["up";"down"]);

Specify the state transitions and their associated rewards.

    % State 1 transition and reward
    MDP.T(1,2,1) = 1;
    MDP.R(1,2,1) = 3;
    MDP.T(1,3,2) = 1;
    MDP.R(1,3,2) = 1;

    % State 2 transition and reward
    MDP.T(2,4,1) = 1;
    MDP.R(2,4,1) = 2;
    MDP.T(2,5,2) = 1;
    MDP.R(2,5,2) = 1;

    % State 3 transition and reward
    MDP.T(3,5,1) = 1;
    MDP.R(3,5,1) = 2;
    MDP.T(3,6,2) = 1;
    MDP.R(3,6,2) = 4;

    % State 4 transition and reward
    MDP.T(4,7,1) = 1;
    MDP.R(4,7,1) = 3;
    MDP.T(4,8,2) = 1;
    MDP.R(4,8,2) = 2;

    % State 5 transition and reward
    MDP.T(5,7,1) = 1;
    MDP.R(5,7,1) = 1;
    MDP.T(5,8,2) = 1;
    MDP.R(5,8,2) = 9;

    % State 6 transition and reward
    MDP.T(6,7,1) = 1;
    MDP.R(6,7,1) = 5;
    MDP.T(6,8,2) = 1;
    MDP.R(6,8,2) = 1;

    % State 7 transition and reward
    MDP.T(7,7,1) = 1;
    MDP.R(7,7,1) = 0;
    MDP.T(7,7,2) = 1;
    MDP.R(7,7,2) = 0;

    % State 8 transition and reward
    MDP.T(8,8,1) = 1;
    MDP.R(8,8,1) = 0;
    MDP.T(8,8,2) = 1;
    MDP.R(8,8,2) = 0;

Specify the terminal states of the model.

    MDP.TerminalStates = ["s7";"s8"];

## Input Arguments

`states` - Model states

positive integer | string vector

Model states, specified as one of the following:

- Positive integer: Specify the number of model states. In this case, each state has a default name, such as `"s1"` for the first state.
- String vector: Specify the state names. In this case, the total number of states is equal to the length of the vector.

`actions` - Model actions

positive integer | string vector

Model actions, specified as one of the following:

- Positive integer: Specify the number of model actions. In this case, each action has a default name, such as `"a1"` for the first action.
- String vector: Specify the action names. In this case, the total number of actions is equal to the length of the vector.
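As a minimal sketch of the two ways to specify each argument, the following calls build one model from counts and another from names. The state and action names in the second call are hypothetical and chosen only for illustration. Running this requires Reinforcement Learning Toolbox.

```matlab
% Counts: states and actions receive default names ("s1", "a1", and so on).
mdpByCount = createMDP(3,2);

% Names (hypothetical): the vector lengths set the number of states and actions.
mdpByName = createMDP(["idle";"busy";"done"],["wait";"work"]);

% Inspect the resulting names.
disp(mdpByCount.States)
disp(mdpByName.Actions)
```

Naming states and actions up front tends to make later indexing into `T` and `R` easier to read, since the position of each name in the vector is its index.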

## Output Arguments

`MDP` - MDP model

`GenericMDP` object

MDP model, returned as a `GenericMDP` object with the following properties.

`CurrentState` - Name of the current state

string

Name of the current state, specified as a string.

`States` - State names

string vector

State names, specified as a string vector with length equal to the number of states.

`Actions` - Action names

string vector

Action names, specified as a string vector with length equal to the number of actions.

`T` - State transition matrix

3-D array

State transition matrix, specified as a 3-D array, which determines the possible movements of the agent in an environment. The state transition matrix `T` is a probability matrix that indicates how likely the agent is to move from the current state `s` to any possible next state `s'` by performing action `a`. `T` is an *S*-by-*S*-by-*A* array, where *S* is the number of states and *A* is the number of actions. It is given by:

$$T(s,s',a) = \text{probability}(s' \mid s,a).$$

The transition probabilities out of a nonterminal state `s` following a given action must sum to one. Therefore, all stochastic transitions out of a given state must be specified at the same time.

For example, to indicate that in state `1` following action `4` there is an equal probability of moving to states `2` or `3`, use the following:

    MDP.T(1,[2 3],4) = [0.5 0.5];

You can also specify that, following an action, there is some probability of remaining in the same state. For example:

    MDP.T(1,[1 2 3 4],1) = [0.25 0.25 0.25 0.25];
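The sum-to-one requirement can be checked by summing over the next-state dimension. The sketch below uses a plain array as a stand-in for `MDP.T`, so it runs without the toolbox. The sizes, the tolerance, and the variable names are assumptions made for illustration.

```matlab
% Stand-in for MDP.T with S = 4 states and A = 2 actions (sizes are assumptions).
S = 4;
A = 2;
T = zeros(S,S,A);

% From state 1, action 1 moves to states 1 through 4 with equal probability.
T(1,[1 2 3 4],1) = [0.25 0.25 0.25 0.25];
% From state 1, action 2 moves to state 2 or 3 with equal probability.
T(1,[2 3],2) = [0.5 0.5];

% Sum over the next-state dimension; the result is S-by-A.
outSums = squeeze(sum(T,2));

% A (state, action) pair is consistent if its probabilities sum to one,
% or if no transitions have been specified for it yet.
tol = 1e-12;
isConsistent = abs(outSums - 1) < tol | outSums == 0;
disp(isConsistent)
```

The same check should apply to a real model by replacing `T` with `MDP.T`.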

`R` - Reward transition matrix

3-D array

Reward transition matrix, specified as a 3-D array, which determines how much reward the agent receives after performing an action in the environment. `R` has the same shape and size as state transition matrix `T`. The reward for moving from state `s` to state `s'` by performing action `a` is given by:

$$r = R(s,s',a).$$
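Because `R` and `T` share the same indexing, the two can be combined element by element. The sketch below computes the expected immediate reward for each state and action, that is, the sum over `s'` of `T(s,s',a)` times `R(s,s',a)`. This quantity is not a property of the model; it is shown only to illustrate how the arrays line up. The arrays, sizes, and values are hypothetical stand-ins for `MDP.T` and `MDP.R`.

```matlab
% Hypothetical stand-ins for MDP.T and MDP.R with 3 states and 2 actions.
S = 3;
A = 2;
T = zeros(S,S,A);
R = zeros(S,S,A);

% State 1, action 1: move to state 2 or 3 with equal probability.
T(1,[2 3],1) = [0.5 0.5];
R(1,[2 3],1) = [4 2];

% State 1, action 2: move to state 3 with certainty.
T(1,3,2) = 1;
R(1,3,2) = 1;

% Reward for one specific transition: state 1 to state 2 under action 1.
r = R(1,2,1);

% Expected immediate reward per (state, action) pair; the result is S-by-A.
expectedR = squeeze(sum(T.*R,2));
disp(expectedR)
```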

`TerminalStates` - Terminal state names

string vector

Terminal state names in the model, specified as a string vector of state names.
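Terminal states are given by name, while `T` and `R` are indexed by number. A minimal sketch of mapping one to the other follows. It uses plain string arrays in place of `MDP.States` and `MDP.TerminalStates`, and it assumes the default `"s1"` through `"s8"` naming for an eight-state model.

```matlab
% Stand-in for MDP.States, assuming default names "s1" through "s8".
states = "s" + (1:8)';

% Stand-in for MDP.TerminalStates.
terminalStates = ["s7";"s8"];

% Logical mask over the states, then the matching numeric indices.
isTerminal = ismember(states,terminalStates);
terminalIdx = find(isTerminal);
disp(terminalIdx')
```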

## Version History

**Introduced in R2019a**
