Please help. The agent from the example worked fine for the first-order system, but for a second-order system the training refuses to converge. I have run multiple simulations, up to 50,000. I don't know whether the reward function that worked for the first-order system simply cannot work for second-order systems. Thank you very much.
How do I Adjust Reinforcement Learning Parameters for a Water Tank System with Second-Order Dynamics Using TD3 Agents
I am trying to adjust the reinforcement learning (RL) parameters for a reward function generated from a Model Verification block, for a water tank system represented by a second-order transfer function, using Twin Delayed Deep Deterministic Policy Gradient (TD3) agents. I have already adjusted the weights, the reward function methods, and the learning rate, but I am still facing issues, and I am looking for a structured approach to fine-tuning the model and the reinforcement learning parameters.
4 Comments
Sam Chak
on 14 Oct 2024
It is unclear what you are instructing the TD3 control agent to do. Although the reinforcement learning controller functions similarly to "Aladdin's magic lamp," you still need to state your wishes clearly and explicitly.
The following is an optimal PID controller that drives the plant to settle exactly at 20 seconds without overshoot. It is likely that no other controllers can perform better than this one (perhaps only on par with it) because the performance objectives are clearly defined.
Did you instruct the TD3 agents to search for the optimal PID gain values based on structured observations of error, integral error, and derivative error, or did you allow them to explore freely and produce random control actions?
s = tf('s');
% Plant
Gp = 1/(24.4*s^2 + 12.2*s + 1)
stepinfo(Gp)
% PID controller
kp = 1.5293491321269;
ki = 0.145848289518598;
kd = 0.937225744461437;
Tf = 1.71410992083058;
Gc = pid(kp, ki, kd, Tf)
% Closed-loop system
Gcl = feedback(Gc*Gp, 1)
S = stepinfo(Gcl)
% Plot results
step(Gp), hold on
step(Gcl), grid on
ylim([0, 1.2])
legend('Plant response', 'Closed-loop system', 'location', 'east')
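One way to make the "wishes" explicit for an RL agent is to build the structured observations (error, integral of error, derivative of error) and a reward that penalizes tracking error and overshoot. The following is only a minimal sketch under stated assumptions: the sample time Ts, the 60-second horizon, the unit step reference, and the weights wErr and wOver are hypothetical values chosen for illustration and do not come from the original post. It reuses the PID closed loop above purely to generate a trajectory on which the signals can be inspected.

```matlab
% Sketch only: Ts, the horizon, and the reward weights are assumptions.
s   = tf('s');
Gp  = 1/(24.4*s^2 + 12.2*s + 1);            % plant from the comment above
Gc  = pid(1.5293491321269, 0.145848289518598, ...
          0.937225744461437, 1.71410992083058);
Gcl = feedback(Gc*Gp, 1);                   % closed loop from the comment above

Ts = 0.5;                                   % assumed agent sample time (s)
r  = 1;                                     % unit step reference
[y, t] = step(Gcl, 0:Ts:60);                % output sampled on a fixed grid

% Structured observations: error, integral of error, derivative of error
e    = r - y;
eInt = cumtrapz(t, e);                      % trapezoidal running integral
eDot = gradient(e, Ts);                     % finite-difference derivative
obs  = [e, eInt, eDot];                     % one row per time step

% Hypothetical per-step reward: squared error plus a heavier overshoot penalty
wErr  = 1;
wOver = 10;
overshoot = max(y - r, 0);                  % nonzero only when y exceeds r
reward = -(wErr*e.^2 + wOver*overshoot.^2);
episodeReturn = sum(reward);
```

Comparing episodeReturn for the PID trajectory against the return the agent achieves gives a rough baseline for whether the reward actually expresses the intended objective.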