Reinforcement Learning Toolbox

Design and train policies using reinforcement learning

Reinforcement Learning Toolbox provides an app, functions, and a Simulink block for training policies using reinforcement learning algorithms, including DQN, PPO, SAC, and DDPG. You can use these policies to implement controllers and decision-making algorithms for complex applications such as resource allocation, robotics, and autonomous systems.

The toolbox lets you represent policies and value functions using deep neural networks or lookup tables and train them through interactions with environments modeled in MATLAB or Simulink. You can evaluate the single- or multi-agent reinforcement learning algorithms provided in the toolbox or develop your own. You can experiment with hyperparameter settings, monitor training progress, and simulate trained agents either interactively through the app or programmatically. To improve training performance, you can run simulations in parallel on multiple CPUs, GPUs, computer clusters, and the cloud (with Parallel Computing Toolbox and MATLAB Parallel Server).

Through the ONNX™ model format, existing policies can be imported from deep learning frameworks such as TensorFlow™ Keras and PyTorch (with Deep Learning Toolbox). You can generate optimized C, C++, and CUDA® code to deploy trained policies on microcontrollers and GPUs. The toolbox includes reference examples to help you get started.

What Is Reinforcement Learning Toolbox?

Reinforcement Learning Agents

Create model-free and model-based reinforcement learning agents using popular algorithms such as DQN, PPO, and SAC. Alternatively, develop your own custom algorithms with provided templates. Use the RL Agent block to bring your agents into Simulink.

Reinforcement Learning, Part 3: Policies and Learning Algorithms (17:51)

Reinforcement Learning Designer App

Interactively design, train, and simulate reinforcement learning agents. Export trained agents to MATLAB for further use and deployment.

Reward Signals

Create reward signals that measure how successful the agent is at achieving its goal. Automatically generate reward functions from control specifications defined in Model Predictive Control Toolbox or Simulink Design Optimization.
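As a rough illustration of what such a reward signal can look like, the sketch below scores one time step of a setpoint-tracking task. It is written in plain Python for neutrality and is not the toolbox API; the function name `step_reward`, its parameters, and the specific weights are all assumptions chosen for the example.

```python
def step_reward(error, effort, tolerance=0.05, effort_weight=0.01, bonus=1.0):
    """Hypothetical per-step reward for a setpoint-tracking task.

    error  : difference between the target and the measured output
    effort : magnitude of the control action applied this step
    """
    # Penalize squared tracking error and, more lightly, squared control effort.
    reward = -(error ** 2) - effort_weight * (effort ** 2)
    # Add a fixed bonus while the output stays inside the tolerance band.
    if abs(error) <= tolerance:
        reward += bonus
    return reward
```

The quadratic penalties mirror the kind of cost terms found in control specifications, while the tolerance bonus gives the agent a clear signal once the goal region is reached. The relative weights are a design choice and usually need tuning per application.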

Policy Representation

Get started quickly by using neural network architectures suggested by the toolbox. Alternatively, explore lookup tables, or define neural network policies manually with Deep Learning Toolbox layers and the Deep Network Designer app.

Reinforcement Learning Training

Train agents through interactions with an environment or using existing data. Explore single- and multi-agent training. Log and view training data, and monitor progress as you go.
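To make "training through interactions with an environment" concrete, here is a minimal sketch of tabular Q-learning, where the value function is stored in a lookup table. It uses plain Python rather than the toolbox functions, and the five-cell corridor environment, the function names, and the hyperparameter values are all assumptions made up for this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Hypothetical environment: move along the corridor, reward 1 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def greedy_action(q_row, rng):
    """Pick an action index with the highest value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([i for i, v in enumerate(q_row) if v == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run epsilon-greedy Q-learning and return the learned lookup table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # one row per state, one column per action
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q[state], rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state in the table."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]
```

Calling `greedy_policy(train())` gives one action per cell; after training, the action for each non-goal cell should be "move right". The goal cell is terminal, so its table row is never updated and its entry carries no meaning.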

An Introduction to Multi-Agent Reinforcement Learning (14:43)

Distributed Computing

Speed up training using multicore computers, cloud resources, or compute clusters with Parallel Computing Toolbox and MATLAB Parallel Server. Leverage GPUs to accelerate operations such as gradient computation and prediction.

Screenshot of a Simulink model for a quadruped robot.

Environment Modeling

Model environments that interact seamlessly with the reinforcement learning agents using MATLAB and Simulink. Interface with third-party modeling tools.
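An environment typically exposes two operations to an agent: a reset that returns an initial observation, and a step that applies an action and returns the next observation, a reward, and a termination flag. The sketch below shows that contract in plain Python for a one-dimensional positioning task. The class `SetpointEnv`, its dynamics, and its parameters are hypothetical and do not represent the toolbox interface.

```python
class SetpointEnv:
    """Hypothetical 1-D environment: drive position x toward a target."""

    def __init__(self, target=1.0, dt=0.1, max_steps=100, tolerance=0.01):
        self.target = target
        self.dt = dt
        self.max_steps = max_steps
        self.tolerance = tolerance
        self.x = 0.0
        self.steps = 0

    def reset(self):
        """Return to the initial state and give back the first observation."""
        self.x = 0.0
        self.steps = 0
        return self.target - self.x  # observation is the tracking error

    def step(self, action):
        """Apply a velocity command and advance the simulation by one step."""
        u = max(-1.0, min(1.0, action))  # clip to the allowed action range
        self.x += self.dt * u
        self.steps += 1
        error = self.target - self.x
        reward = -(error ** 2)           # closer to the target is better
        done = abs(error) <= self.tolerance or self.steps >= self.max_steps
        return error, reward, done
```

A simple way to exercise the interface is to drive it with a fixed rule instead of a learned policy, for example a proportional controller:

```python
env = SetpointEnv()
obs, done = env.reset(), False
while not done:
    obs, reward, done = env.step(5.0 * obs)
```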

Getting Started with Reinforcement Learning (9:30)

Code Generation and Deployment

Automatically generate C/C++ and CUDA code from trained policies for deployment to embedded devices. Use MATLAB Compiler and MATLAB Production Server to deploy trained policies to production systems as standalone applications, C/C++ shared libraries, and more.

Reference Examples

Design controllers and decision-making algorithms for robotics, automated driving, calibration, scheduling, and other applications. Consult our reference examples to get started quickly.

“5G is a critical infrastructure that we must protect from adversarial attacks. Reinforcement Learning Toolbox allows us to quickly assess 5G vulnerabilities and identify mitigation methods.”
