Reinforcement Learning algorithm to tune PID parameters of a system
- Hello, I am trying to tune a PID controller for a second-order system model using Reinforcement Learning, following the MATLAB example "Tune PI Controller using Reinforcement Learning" ( https://it.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html ).
- I created the system model with a PID controller, "System_PID.slx", and simulated it with satisfactory results.
- I also created the system model with an RL Agent block, "rl_PID_Tune_mod.slx".
- I am running the training code for 100 iterations.
- The 'Agent' loaded into the workspace after training also performs satisfactorily when used as a standalone controller.
- The values of Kp, Ki and Kd are extracted from the trained agent, but the proportional gain (Kp) is very small, nearly zero. As a result, the extracted Kp, Ki and Kd do not give satisfactory results,
- and the closed-loop system becomes unstable.
- So, where is the problem in the configuration of the code given below?
- Any help is highly appreciated.
% Simulate the trained agent against the environment
simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experiences = sim(env,agent,simOpts);
% Read the controller gains from the actor's learnable parameters
actor = getActor(agent);
parameters = getLearnableParameters(actor);
Ki = abs(parameters{1}(1))
Kp = abs(parameters{1}(2))
Kd = abs(parameters{1}(3))
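For parameters{1} to be interpretable as the gains, the actor must be a single linear layer driven by the integrated error, the error and the error derivative; the linked PI example uses a custom fullyConnectedPILayer for exactly this reason. A minimal sketch of such a three-input actor layer array (layer names and options here are illustrative assumptions, not the shipped example code):
numObs = 3;   % assumed observation: [integral of error; error; error derivative]
actorNet = [
    featureInputLayer(numObs, 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(1, 'Name', 'Action', ...
        'BiasLearnRateFactor', 0, 'BiasInitializer', 'zeros')];  % u = W*obs with the bias pinned at zero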
Answers (1)
I didn't check everything, but I noticed that you reused the same quadratic cost function (the reward, in RL terms) from the 'rlwatertankPIDTune.slx' example for your second-order plant. The water-level control system in a tank is a first-order system, and in that example the cost has the form $J = \int_0^T \left(q\,e^2(t) + r\,u^2(t)\right)\,dt$, where the weighting factors $q$ and $r$ penalize the tracking error and the control effort, and a PI controller is sufficient.
Since you want to tune a PID controller for a second-order system using RL, perhaps it is more appropriate to define the cost as $J = \int_0^T \left(q_1\,e^2(t) + q_2\,\dot{e}^2(t) + r\,u^2(t)\right)\,dt$, where $q_1$, $q_2$ and $r$ are weighting factors and $\dot{e}(t)$ is the derivative of the tracking error. One challenge is that you need to estimate $\dot{e}(t)$, because only the output of the second-order system is measurable.
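In RL terms, this cost becomes a negative reward accumulated over the episode. A rough sketch of the corresponding per-step reward, where the function name and the weights q1, q2, r are placeholders to be tuned rather than values from the example:
% Sketch of a per-step reward implementing the quadratic cost above.
% q1, q2 and r are placeholder weights; de is the estimated error derivative.
function rwd = pidTuningReward(e, de, u)
    q1 = 1;      % weight on squared tracking error (assumed)
    q2 = 0.1;    % weight on squared error derivative (assumed)
    r  = 0.01;   % weight on squared control effort (assumed)
    rwd = -(q1*e^2 + q2*de^2 + r*u^2);   % maximizing the reward minimizes the cost
end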
Assuming the second-order plant is $G(s) = \dfrac{1}{s^2 + s}$:
a = 1;       % tf numerator
b = [1 1 0]; % tf denominator
% [A, B, C, D] = tf2ss(a, b)
sys = ss(tf(a, b)) % state-space
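One simple way to approximate $\dot{e}(t)$ from the measured error, instead of a pure (non-realizable) differentiator, is a filtered derivative; a minimal sketch, where the filter coefficient Nf is an assumption to be tuned:
% Sketch: approximate de/dt with the realizable filtered differentiator Nf*s/(s + Nf).
Nf    = 20;                  % derivative filter coefficient (assumed)
Dfilt = tf([Nf 0], [1 Nf]);  % de/dt is approximately Dfilt applied to e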
In continuous time and in the absence of Gaussian noise, the PID controller gains may be tuned, for example, as follows:
Kp = 0.75;
Ki = 0;
Kd = 0.5;
N = 3;                         % derivative filter coefficient
Gpid = pid(Kp, Ki, Kd, 1/N)    % Tf = 1/N is the derivative filter time constant
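To see whether the gains extracted from the trained agent are usable, you could compare the closed-loop step response of this baseline PID with one built from the extracted gains; a minimal sketch, assuming the plant sys defined above, unity negative feedback, and placeholder names Kp_rl, Ki_rl, Kd_rl for the gains read from the actor:
% Sketch: compare the baseline PID with a PID built from the RL-extracted gains.
GpidRL = pid(Kp_rl, Ki_rl, Kd_rl, 1/N);          % gains read from the trained actor (placeholders)
step(feedback(Gpid*sys, 1), feedback(GpidRL*sys, 1), 10)
legend('baseline PID', 'RL-extracted PID')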