PPO Agent - Initialization of actor and critic networks
According to the documentation, whenever a PPO agent is initialized in MATLAB, the parameters of both the actor and the critic networks are set randomly. However, this is not the only possible choice: other initialization schemes exist (e.g. orthogonal initialization), and they can sometimes improve the agent's subsequent performance.
- Is there a reason why random initialization was chosen as the default method here?
- Is it possible to easily specify a different initialization method in the Reinforcement Learning Toolbox, without starting from scratch?
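One workaround, sketched below under the assumption of a recent Reinforcement Learning Toolbox release, is to bypass the default network construction: build the actor and critic layer arrays yourself and request orthogonal weights through the `WeightsInitializer` option of `fullyConnectedLayer` (the Deep Learning Toolbox documents `"orthogonal"` as one of its supported initializers), then pass the resulting approximators to `rlPPOAgent`. The observation/action specs and layer sizes here are illustrative placeholders, not from the original question.

```matlab
% Sketch only -- exact signatures may differ across toolbox releases.
% Hypothetical environment specs: 4 observations, 2 discrete actions.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 1]);

% Critic: state-value network with orthogonal weight initialization
criticNet = [
    featureInputLayer(4)
    fullyConnectedLayer(64, WeightsInitializer="orthogonal")
    reluLayer
    fullyConnectedLayer(1,  WeightsInitializer="orthogonal")];
critic = rlValueFunction(criticNet, obsInfo);

% Actor: discrete stochastic policy network, same initializer
actorNet = [
    featureInputLayer(4)
    fullyConnectedLayer(64, WeightsInitializer="orthogonal")
    reluLayer
    fullyConnectedLayer(2,  WeightsInitializer="orthogonal")
    softmaxLayer];
actor = rlDiscreteCategoricalActor(actorNet, obsInfo, actInfo);

% Build the PPO agent from the pre-initialized approximators,
% instead of letting rlPPOAgent create default random networks.
agent = rlPPOAgent(actor, critic);
```

A function handle can also be supplied to `WeightsInitializer` for fully custom schemes, so this approach is not limited to the built-in initializers.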