Survey on Deep Reinforcement Learning Methods Based on Sample Efficiency Optimization
Author:
Affiliation:

Clc Number:

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Deep reinforcement learning combines the representation ability of deep learning with the decision-making ability of reinforcement learning, which has induced great research interest due to its remarkable effect in complex control tasks. This study classifies the model-free deep reinforcement learning methods into Q-value function methods and policy gradient methods by considering whether the Bellman equation is used, and introduces the two kinds of methods from the aspects of model structure, optimization process, and evaluation, respectively. Toward the low sample efficiency problem in deep reinforcement learning, this study illustrates that the over- estimation problem in Q-value function methods and the unbiased sampling constraint in policy gradient methods are the main factors that affect the sample efficiency according to model structure. Then, from the perspectives of enhancing the exploration efficiency and improving the sample exploitation rate, this study summarizes various feasible optimization methods according to the recent research hotspots and trends, analyzes advantages together with existing problems of related methods, and compares them according to the scope of application and optimization effect. Finally, this study proposes to enhance the generality of optimization methods, explore migration of optimization mechanisms between the two kinds of methods, and improve theoretical completeness as future research directions.

    Reference
    Related
    Cited by
Get Citation

张峻伟,吕帅,张正昊,于佳玉,龚晓宇.基于样本效率优化的深度强化学习方法综述.软件学报,2022,33(11):4217-4238

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:November 11,2020
  • Revised:January 18,2021
  • Adopted:
  • Online: August 02,2021
  • Published: November 06,2022
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063