Status: Published
Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning
Wang, Siying1; Chen, Wenyu1; Hu, Jian2; Hu, Siyue3; Huang, Liwei1,4
2022-08-02
Source Publication: Mathematics
ISSN: 2227-7390
Volume: 10, Issue: 15, Pages: 2728
Abstract

Leveraging global state information to enhance policy optimization is a common approach in multi-agent reinforcement learning (MARL). Even with this additional state information, agents still suffer from insufficient exploration during training. Moreover, training on batches sampled from the replay buffer induces policy overfitting, i.e., multi-agent proximal policy optimization (MAPPO) may not perform as well as independent PPO (IPPO), even with the additional information available to its centralized critic. In this paper, we propose a novel noise-injection method to regularize the agents’ policies and mitigate overfitting. We analyze the cause of policy overfitting in actor–critic MARL and design two specific patterns of noise injection, which apply random Gaussian noise to the advantage function, to stabilize training and improve performance. Experimental results on the Matrix Game and StarCraft II show that our method achieves higher training efficiency and superior performance, and ablation studies indicate that it maintains higher entropy in the agents’ policies during training, which leads to more exploration.
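The abstract does not spell out the two noise-injection patterns, so the following is only a minimal sketch of the general idea: perturbing advantage estimates with zero-mean Gaussian noise before they enter a PPO-style clipped objective. The function names `noisy_advantages` and `ppo_clip_loss`, the noise scale `sigma`, and the clipping range `clip_eps` are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def noisy_advantages(advantages, sigma=0.1, rng=None):
    """Add zero-mean Gaussian noise to advantage estimates.

    Hypothetical helper: `sigma` is an assumed noise scale,
    not a value reported by the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    adv = np.asarray(advantages, dtype=np.float64)
    noise = rng.normal(loc=0.0, scale=sigma, size=adv.shape)
    return adv + noise


def ppo_clip_loss(ratio, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized)."""
    ratio = np.asarray(ratio, dtype=np.float64)
    adv = np.asarray(advantages, dtype=np.float64)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    return -np.mean(np.minimum(unclipped, clipped))


# Usage: perturb the advantages of a sampled batch, then compute the loss.
rng = np.random.default_rng(0)
batch_adv = np.array([1.0, -2.0, 0.5])
batch_ratio = np.array([1.1, 0.9, 1.0])
loss = ppo_clip_loss(batch_ratio, noisy_advantages(batch_adv, sigma=0.1, rng=rng))
```

Because the noise is zero-mean, the perturbed advantages are unbiased estimates of the originals. The intuition in this sketch is that the added variance discourages the policy from fitting any single batch too closely.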

Keywords: Advantage Function; Exploration; Multi-agent Reinforcement Learning; Noise Injection; Proximal Policy Optimization
DOI: 10.3390/math10152728
Indexed By: SCIE
Language: English
WOS Research Area: Mathematics
WOS Subject: Mathematics
WOS ID: WOS:000839905800001
Publisher: MDPI, ST ALBAN-ANLAGE 66, CH-4052 BASEL, SWITZERLAND
Scopus ID: 2-s2.0-85136796852
Document Type: Journal article
Collection: THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Corresponding Author: Hu, Jian
Affiliation:
1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
2. Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, 106, Taiwan
3. Department of Computer Science & Information Engineering, National Taiwan University, Taipei, 106, Taiwan
4. The State Key Laboratory of IoTSC, University of Macau, Taipa, 999078, Macao
Recommended Citation
GB/T 7714: Wang, Siying, Chen, Wenyu, Hu, Jian, et al. Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning[J]. Mathematics, 2022, 10(15), 2728.
APA: Wang, Siying., Chen, Wenyu., Hu, Jian., Hu, Siyue., & Huang, Liwei (2022). Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning. Mathematics, 10(15), 2728.
MLA: Wang, Siying, et al. "Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning". Mathematics 10.15 (2022): 2728.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.