Reinforcement Learning Toolbox - When does the algorithm train?

I am currently using the RL Toolbox with a DQN agent built into a long-running process simulation.
The maximum step count is currently 8000 steps per episode.
Unfortunately, the documentation seems a little ambiguous to me, so here is my question:
Does the train function of the RL Toolbox train the agent at the end of an episode, or during the episode once the step count exceeds the minibatch size (as in the baseline algorithms)?
Thank you in advance.

Accepted Answer

Emmanouil Tzorakoleftherakis
The implementation is based on the algorithm listed here.
The weights are updated at each time step.
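To illustrate what a per-time-step update schedule looks like, here is a minimal Python sketch. It is not the toolbox's implementation: the function name `run_schedule`, its parameters, and the ring-buffer replay memory are hypothetical, and the rule that updates begin only once the buffer holds at least one minibatch is an assumption taken from the question, not something confirmed for the toolbox. The gradient step itself is replaced by a counter.

```python
import random


def run_schedule(num_episodes, steps_per_episode, minibatch_size,
                 buffer_capacity=10_000, seed=0):
    """Count how many weight updates would happen in each episode.

    Hypothetical illustration only: one update is attempted at every
    time step, and (by assumption) it is skipped while the replay
    buffer holds fewer than `minibatch_size` experiences.
    """
    rng = random.Random(seed)
    buffer = []          # replay memory, used as a ring buffer
    write_index = 0      # next slot to overwrite once the buffer is full
    updates_per_episode = []

    for episode in range(num_episodes):
        updates = 0
        for step in range(steps_per_episode):
            # Placeholder experience; a real agent would store
            # (observation, action, reward, next_observation, done).
            experience = (episode, step, rng.random())
            if len(buffer) < buffer_capacity:
                buffer.append(experience)
            else:
                buffer[write_index] = experience
                write_index = (write_index + 1) % buffer_capacity

            # Per-step update: sample a minibatch as soon as enough
            # experiences are available (assumed start condition).
            if len(buffer) >= minibatch_size:
                minibatch = rng.sample(buffer, minibatch_size)
                assert len(minibatch) == minibatch_size
                updates += 1  # a gradient step on the critic would go here

        updates_per_episode.append(updates)

    return updates_per_episode


print(run_schedule(num_episodes=2, steps_per_episode=10, minibatch_size=4))
# prints [7, 10]
```

Under these assumptions, the first episode performs no updates for its first three steps (the buffer is still filling) and one update per step afterwards; every later episode updates on every step. An end-of-episode schedule would instead yield a single update (or one batch of updates) per episode.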
  1 Comment
Hans-Joachim Steinort on 26 Sep 2019
"For each training time step" - that was the line I was looking for (though looking into the source code led me to the same conclusion).
After double-checking the baseline algorithms, I found that they do it the same way.
Thank you for your time!


Release

R2019a
