Reinforcement Learning algorithm to tune PID parameters of a system
- Hello, I am trying to tune a PID controller for a second-order system model using Reinforcement Learning, following the MATLAB example "Tune PI Controller using Reinforcement Learning" ( https://it.mathworks.com/help/reinforcement-learning/ug/tune-pi-controller-using-td3.html ).
- I created the system model with a PID controller, "System_PID.slx", and simulated it with satisfactory results.
- I also created the system model with an RL Agent block, "rl_PID_Tune_mod.slx".
- I am running the training code for 100 iterations.
- The 'Agent' loaded into the workspace after training also performs satisfactorily when used as a standalone controller.
- The values of Kp, Ki and Kd are extracted from the trained agent, but the proportional gain (Kp) is very small, nearly zero. As a result, the extracted Kp, Ki and Kd do not give satisfactory results,
- and the closed-loop system becomes unstable.
- So, where is the problem in the configuration of the code given below?
- Any help is highly appreciated.
% Simulate the trained agent against the environment
simOpts = rlSimulationOptions('MaxSteps',maxsteps);
experiences = sim(env,agent,simOpts);
% Read the controller gains from the actor's learnable parameters
actor = getActor(agent);
parameters = getLearnableParameters(actor);
Ki = abs(parameters{1}(1))
Kp = abs(parameters{1}(2))
Kd = abs(parameters{1}(3))
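For parameters{1} to be interpretable as the gains, the actor must be a single linear layer driven by the integrated error, the error and the error derivative; the linked PI example uses a custom fullyConnectedPILayer for exactly this reason. A minimal sketch of such a three-input actor layer array (layer names and options here are illustrative assumptions, not the shipped example code):
numObs = 3;   % assumed observation: [integral of error; error; error derivative]
actorNet = [
    featureInputLayer(numObs, 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(1, 'Name', 'Action', ...
        'BiasLearnRateFactor', 0, 'BiasInitializer', 'zeros')];  % u = W*obs with the bias pinned at zero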
Answers (1)
I didn't check everything, but I noticed that you reused the same quadratic cost function (the reward, in RL terms) from the 'rlwatertankPIDTune.slx' example for your second-order plant. The water-level control system in a tank is a first-order system, and in that example the cost has the form $J = \int_0^T \left(q\,e^2(t) + r\,u^2(t)\right)\,dt$, where the weighting factors $q$ and $r$ penalize the tracking error and the control effort, and a PI controller is sufficient.
Since you want to tune a PID controller for a second-order system using RL, perhaps it is more appropriate to define the cost as $J = \int_0^T \left(q_1\,e^2(t) + q_2\,\dot{e}^2(t) + r\,u^2(t)\right)\,dt$, where $q_1$, $q_2$ and $r$ are weighting factors and $\dot{e}(t)$ is the derivative of the tracking error. One challenge is that you need to estimate $\dot{e}(t)$, because only the output of the second-order system is measurable.
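In RL terms, this cost becomes a negative reward accumulated over the episode. A rough sketch of the corresponding per-step reward, where the function name and the weights q1, q2, r are placeholders to be tuned rather than values from the example:
% Sketch of a per-step reward implementing the quadratic cost above.
% q1, q2 and r are placeholder weights; de is the estimated error derivative.
function rwd = pidTuningReward(e, de, u)
    q1 = 1;      % weight on squared tracking error (assumed)
    q2 = 0.1;    % weight on squared error derivative (assumed)
    r  = 0.01;   % weight on squared control effort (assumed)
    rwd = -(q1*e^2 + q2*de^2 + r*u^2);   % maximizing the reward minimizes the cost
end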
Assuming the second-order plant is $G(s) = \dfrac{1}{s^2 + s}$:
a = 1;       % tf numerator
b = [1 1 0]; % tf denominator
% [A, B, C, D] = tf2ss(a, b)
sys = ss(tf(a, b)) % state-space
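One simple way to approximate $\dot{e}(t)$ from the measured error, instead of a pure (non-realizable) differentiator, is a filtered derivative; a minimal sketch, where the filter coefficient Nf is an assumption to be tuned:
% Sketch: approximate de/dt with the realizable filtered differentiator Nf*s/(s + Nf).
Nf    = 20;                  % derivative filter coefficient (assumed)
Dfilt = tf([Nf 0], [1 Nf]);  % de/dt is approximately Dfilt applied to e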
In continuous time and in the absence of Gaussian noise, the PID controller gains may be tuned, for example, as follows:
Kp = 0.75;
Ki = 0;
Kd = 0.5;
N = 3;                         % derivative filter coefficient
Gpid = pid(Kp, Ki, Kd, 1/N)    % Tf = 1/N is the derivative filter time constant
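To see whether the gains extracted from the trained agent are usable, you could compare the closed-loop step response of this baseline PID with one built from the extracted gains; a minimal sketch, assuming the plant sys defined above, unity negative feedback, and placeholder names Kp_rl, Ki_rl, Kd_rl for the gains read from the actor:
% Sketch: compare the baseline PID with a PID built from the RL-extracted gains.
GpidRL = pid(Kp_rl, Ki_rl, Kd_rl, 1/N);          % gains read from the trained actor (placeholders)
step(feedback(Gpid*sys, 1), feedback(GpidRL*sys, 1), 10)
legend('baseline PID', 'RL-extracted PID')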