MATLAB Answers

What is the best activation function to get action between 0 and 1 in DDPG network?

I am using a DDPG network to run a control algorithm whose inputs (the actions of the RL agent, 23 in total) vary between 0 and 1. I am defining this using rlNumericSpec:
actInfo = rlNumericSpec([numAct 1],'LowerLimit',0,'UpperLimit', 1);
Then I am using tanhLayer in the actor network (similar to bipedal robot example) and then using
actorOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-4, 'GradientThreshold',1,'L2RegularizationFactor',1e-5);
actor = rlRepresentation(actorNetwork,env.getObservationInfo,env.getActionInfo, 'Observation',{'observation'}, 'Action',{'ActorTanh1'},actorOptions);
But I feel that the model is only taking the extreme options, i.e. mostly 0 and 1.
Will it be better to use a sigmoid function to get better action estimates?



Accepted Answer

Emmanouil Tzorakoleftherakis
With DDPG, a common thing to do in the final three layers of the actor is to use a fully connected layer, a tanh layer and a scaling layer. Tanh will get the output of that layer between -1 and 1, and then you can use the scaling layer to scale/shift the values as needed based on the specifications of the actuator in your problem.
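As a rough sketch of those last three layers for the [0, 1] range in the question: a Scale of 0.5 and a Bias of 0.5 map the tanh range [-1, 1] onto [0, 1]. The layer names 'ActorFCOut' and 'ActorScaling1' are made up for illustration, and the rest of the actor network is omitted.

```matlab
numAct = 23;   % number of actions, as stated in the question

% Final three layers of the actor only; earlier layers are not shown.
finalLayers = [
    fullyConnectedLayer(numAct, 'Name', 'ActorFCOut')    % hypothetical name
    tanhLayer('Name', 'ActorTanh1')                      % output in [-1, 1]
    scalingLayer('Name', 'ActorScaling1', ...            % hypothetical name
                 'Scale', 0.5, 'Bias', 0.5)              % 0.5*x + 0.5 -> [0, 1]
    ];
```

With this layout, the 'Action' name passed to rlRepresentation would be the scaling layer ('ActorScaling1' here) rather than 'ActorTanh1'.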
It seems the problem here is due to noise that is being added during training with DDPG to allow sufficient exploration (for example see step 1 here). The default noise options have a pretty high variance, so when this is added to the output of the tanh layer, it ends up outside the [0, 1] range and is being clipped. This is why you are only getting the two extremes.
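To illustrate the clipping effect numerically, here is a small stand-alone sketch. It uses plain Gaussian noise as a stand-in for the agent's actual noise model, and the two standard deviations and the actor output of 0.5 are assumed values chosen only for the demonstration.

```matlab
rng(0);                               % fixed seed so the run is repeatable
n = 1e5;                              % number of simulated action samples
actorOut = 0.5 * ones(n, 1);          % hypothetical actor output, mid-range

sigmaHigh = 1.0;                      % assumed "large" noise standard deviation
sigmaLow  = 0.1;                      % assumed "small" noise standard deviation

% Add noise, then clip to the [0, 1] action limits.
noisyHigh = min(max(actorOut + sigmaHigh * randn(n, 1), 0), 1);
noisyLow  = min(max(actorOut + sigmaLow  * randn(n, 1), 0), 1);

% Fraction of samples that end up exactly at one of the limits.
fracHigh = mean(noisyHigh == 0 | noisyHigh == 1);
fracLow  = mean(noisyLow  == 0 | noisyLow  == 1);

fprintf('Saturated with sigma = %.1f: %.1f%%\n', sigmaHigh, 100 * fracHigh);
fprintf('Saturated with sigma = %.1f: %.1f%%\n', sigmaLow,  100 * fracLow);
```

With the larger value most samples land on 0 or 1; with the smaller one almost none do.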
Try adjusting the DDPG noise options, and particularly the variance (make it smaller, e.g. <=0.1). Also, see here for some best practices when choosing noise parameters.
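A minimal sketch of that change, assuming the NoiseOptions.Variance property name from the toolbox release current at the time of this thread (newer releases may expose this setting under a different name, so check the documentation for your version):

```matlab
% Start from the default DDPG agent options and shrink the exploration noise.
agentOptions = rlDDPGAgentOptions;
agentOptions.NoiseOptions.Variance = 0.1;   % value suggested above (<= 0.1)

% The options are then passed in when the agent is created, for example:
% agent = rlDDPGAgent(actor, critic, agentOptions);   % actor/critic defined elsewhere
```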
Hope that helps


Sayak Mukherjee on 15 Oct 2020
I am not getting the expected results with play; the action is still negative in some cases.
Is this a version issue?
Emmanouil Tzorakoleftherakis
No, I think this is a bug. I would contact technical support at this point to put you in touch with the development team and see if they can provide a workaround. You will most likely need to provide a reproduction model as well.


