Survey of Meta-reinforcement Learning Research

doi:10.13328/j.cnki.jos.007011

2024, Vol. 35, No. 4, pp. 1618-1650

Authors:
CHEN Yi-Yu (陈奕宇): Department of Computer Science and Technology, Nanjing University, Nanjing 210043, China; State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210043, China
HUO Jing (霍静): Department of Computer Science and Technology, Nanjing University, Nanjing 210043, China; State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210043, China
DING Tian-Yu (丁天雨): Applied Sciences Group, Microsoft, Redmond, WA 98034, USA
GAO Yang (高阳): Department of Computer Science and Technology, Nanjing University, Nanjing 210043, China; State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210043, China

Corresponding author: GAO Yang, E-mail: gaoy@nju.edu.cn
Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0113303); National Natural Science Foundation of China (62192783, 62276128)

Abstract:

In recent years, deep reinforcement learning (DRL) has achieved remarkable success in many sequential decision-making tasks. However, this success currently depends heavily on massive amounts of training data and computing resources; poor sample efficiency and limited policy generality are key factors restricting its further development. Meta-reinforcement learning (Meta-RL) aims to adapt to a wider range of tasks with fewer samples, and research on it is expected to alleviate these limitations and advance the field of reinforcement learning. Organized by the research objects and application scenarios of existing Meta-RL work, this survey comprehensively reviews research progress in the field. First, it gives a basic introduction to the background of deep reinforcement learning and meta-learning. Then, it formally defines meta-reinforcement learning, summarizes common scenario settings, and presents existing research progress from the perspective of the scope of applicability of the research results. Finally, it analyzes the research challenges and development prospects of the field.

Keywords: meta-reinforcement learning; reinforcement learning; deep reinforcement learning; meta-learning

Cite this article

CHEN Yi-Yu, HUO Jing, DING Tian-Yu, GAO Yang. Survey of Meta-reinforcement Learning Research. Journal of Software, 2024, 35(4): 1618-1650

History

Received: 2023-05-14
Revised: 2023-07-07
Published online: 2023-09-11
Published: 2024-04-06
