HUANG Zhi-Gang
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
LIU Quan
School of Computer Science and Technology, Soochow University, Suzhou 215006, China; Jiangsu Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou 215006, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education (Jilin University), Changchun 130012, China; Collaborative Innovation Center of Novel Software Technology and Industrialization (Nanjing), Nanjing 210093, China
ZHANG Li-Hua
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
CAO Jia-Qing
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
ZHU Fei
School of Computer Science and Technology, Soochow University, Suzhou 215006, China; Jiangsu Key Laboratory for Computer Information Processing Technology (Soochow University), Suzhou 215006, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education (Jilin University), Changchun 130012, China; Collaborative Innovation Center of Novel Software Technology and Industrialization (Nanjing), Nanjing 210093, China

Deep hierarchical reinforcement learning (DHRL) is an important research field within deep reinforcement learning (DRL). It targets problems that classic DRL struggles to solve: sparse rewards, sequential decision-making, and weak transfer ability. Following a hierarchical approach, DHRL decomposes complex problems and builds a multi-layered structure for DRL policies. Through temporal abstraction, it combines lower-level actions to learn semantically meaningful higher-level actions. In recent years, DHRL has achieved breakthroughs in many domains and shown strong performance. It has been applied in real-world settings to visual navigation, natural language processing, recommendation systems, and video description generation. This study first introduces the theoretical basis of hierarchical reinforcement learning (HRL). Second, it describes the key technologies of DHRL, including hierarchical abstraction techniques and common experimental environments. Third, taking the option-based deep hierarchical reinforcement learning framework (O-DHRL) and the subgoal-based deep hierarchical reinforcement learning framework (G-DHRL) as the main objects of study, it analyzes and compares in detail the research status and development trends of the various algorithms. In addition, a number of real-world DHRL applications are discussed. Finally, the study summarizes DHRL and discusses its prospects.
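Huang ZG, Liu Q, Zhang LH, Cao JQ, Zhu F. Research and development on deep hierarchical reinforcement learning. Ruan Jian Xue Bao/Journal of Software, 2023, 34(2): 733-760 (in Chinese).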