Survey of Meta-reinforcement Learning Research (元强化学习研究综述)

doi:10.13328/j.cnki.jos.007011

Journal of Software (软件学报), 2024, Volume 35, Issue 4, pp. 1618-1650


Authors: Chen Yiyu (陈奕宇), Huo Jing (霍静), Ding Tianyu (丁天雨), Gao Yang (高阳)
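Corresponding author: Gao Yang (高阳), E-mail: gaoy@nju.edu.cn
Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0113303); National Natural Science Foundation of China (62192783, 62276128)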

Survey of Meta-reinforcement Learning Research

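Abstract (translated from the Chinese original):

In recent years, deep reinforcement learning (DRL) has achieved remarkable success in many sequential decision-making tasks. At present, however, that success depends heavily on massive amounts of training data and computing resources, and poor sample efficiency and policy generality are key factors limiting its further development. Meta-reinforcement learning (Meta-RL) aims to adapt to a wider range of tasks with fewer samples, and research on it is expected to ease these limitations and advance the field of reinforcement learning. Organized around the research objects and applicable scenarios of Meta-RL work, this paper comprehensively reviews research progress in the field. First, it gives a basic introduction to the background of deep reinforcement learning and meta-learning. Then, it formally defines Meta-RL, summarizes common scenario settings, and presents existing research progress from the perspective of the applicable scope of the research results. Finally, it analyzes the research challenges and development prospects of the field.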

Abstract:

In recent years, deep reinforcement learning (DRL) has achieved remarkable success in many sequential decision-making tasks. However, this success currently relies heavily on massive amounts of training data and computing resources, and poor sample efficiency and policy generalization are key factors restricting the further development of DRL. Meta-reinforcement learning (Meta-RL) aims to adapt to a wider range of tasks with fewer samples, and research on it is expected to alleviate these limitations and promote the development of reinforcement learning. Organized by the research objects and application scenarios of existing work, this study comprehensively reviews the research progress in the field of meta-reinforcement learning. First, a basic introduction is given to the background of deep reinforcement learning and meta-learning. Then, meta-reinforcement learning is formally defined, common problem settings are summarized, and current research progress is presented from the perspective of the applicable scope of the research results. Finally, the research challenges and potential future directions of the field are discussed.

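Cite this article

Chen Yiyu (陈奕宇), Huo Jing (霍静), Ding Tianyu (丁天雨), Gao Yang (高阳). Survey of Meta-reinforcement Learning Research (元强化学习研究综述). Journal of Software (软件学报), 2024, 35(4): 1618-1650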

History

Received: 2023-05-14
Last revised: 2023-07-07
Published online: 2023-09-11
Publication date: 2024-04-06
