Deep reinforcement learning for multi-agents

Question

1 vote

I trained three agents with the multi-agent deep reinforcement learning toolbox. The reward curves are shown in the picture. Why do the agents' rewards first increase toward the desired performance, then decrease and converge to an unfavorable situation? I expected the rewards to keep increasing, and the agents to reach the desired goal, as the episodes progressed. According to the picture, from episode 700 onward the agents converge to an undesired situation and their states no longer change.

Thank you.

Answer 1

Emmanouil Tzorakoleftherakis on 22 Nov 2020

Edited: Emmanouil Tzorakoleftherakis on 22 Nov 2020

1 vote

Hello,

The policies you get from RL training depend on how much time the agents spend exploring. If the agents converge to a non-ideal solution, as they do here, you may want to change the agent options to increase exploration.
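As a rough illustration of why the exploration settings matter, here is a small Python sketch (not toolbox code). The question does not say which agent type is used, so everything here is an assumption: it supposes an epsilon-greedy agent whose exploration rate is multiplied by `(1 - decay)` at every training step until it reaches a floor. The names `steps_until_min` and `epsilon_at` are made up for this example.

```python
import math


def steps_until_min(eps0, eps_min, decay):
    """Training steps needed for eps <- eps * (1 - decay) to fall to eps_min.

    Assumed schedule (hypothetical, not taken from the question):
    multiplicative decay applied once per training step.
    """
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must be in (0, 1)")
    if not 0.0 < eps_min <= eps0:
        raise ValueError("need 0 < eps_min <= eps0")
    # Solve eps0 * (1 - decay)**n <= eps_min for the smallest integer n.
    return math.ceil(math.log(eps_min / eps0) / math.log(1.0 - decay))


def epsilon_at(step, eps0, eps_min, decay):
    """Exploration rate after `step` training steps, clipped at eps_min."""
    return max(eps_min, eps0 * (1.0 - decay) ** step)


if __name__ == "__main__":
    # Illustrative values only: same start and floor, two decay rates.
    for decay in (0.005, 0.0005):
        n = steps_until_min(1.0, 0.01, decay)
        print(f"decay={decay}: exploration reaches its floor after {n} steps")
```

With a schedule like this, a ten times smaller decay rate keeps the agents exploring for roughly ten times as many steps. If exploration runs out after only a few episodes, the agents may settle on whatever policy they have at that point, which could be one cause of the plateau after episode 700.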

Hope that helps

1 Comment

beni hadi on 25 Nov 2020

Thank you for your help.
