Explainable Reinforcement Learning: Basic Problems Exploration and Method Survey

About the authors:

LIU Xiao (1991-), male, Ph.D. candidate; his research interests include explainable reinforcement learning, machine learning, and computer vision. LIU Shu-Yang (1999-), male, master's student; his research interests include reinforcement learning and explainable reinforcement learning. ZHUANG Yun-Kai (1990-), male, Ph.D. candidate; his research interests include multi-agent systems, reinforcement learning, and game theory. GAO Yang (1972-), male, Ph.D., professor, doctoral supervisor, and CCF distinguished member; his research interests include reinforcement learning, multi-agent systems, computer vision, and big data analytics.

Corresponding author:

GAO Yang, gaoy@nju.edu.cn


Fund project:

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100900)



Abstract:

Reinforcement learning is a technique that discovers optimal behavior policies through trial and error, and it has become a general method for solving environment-interaction problems. However, as a class of machine learning algorithms, reinforcement learning faces a difficulty common to the whole field: it is hard for humans to understand. This lack of explainability limits the application of reinforcement learning in safety-sensitive domains such as medical treatment and driving, and it leaves problems such as environment simulation and task generalization without universally applicable solutions. To overcome this weakness, extensive research on explainable reinforcement learning (XRL) has emerged; nevertheless, the academic community still lacks a consistent understanding of XRL. This study therefore explores the basic problems of XRL and surveys existing work. Specifically, it first discusses the parent problem, explainable artificial intelligence, and summarizes its existing definitions. Next, it constructs a theoretical framework for explainability that covers the problems XRL shares with explainable artificial intelligence: distinguishing intelligent algorithms from mechanical algorithms, defining what an explanation means, discussing the factors that affect explainability, and classifying the intuitiveness of explanations. Then, based on the characteristics of reinforcement learning itself, the study defines three problems unique to XRL, namely environment interpretation, task interpretation, and policy interpretation. After that, it systematically categorizes existing methods and reviews the latest progress in XRL. Finally, it outlines potential research directions in the field.
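As a concrete illustration of the policy-interpretation problem named above: one well-known family of approaches distills a trained policy into an intrinsically interpretable surrogate, such as a shallow decision tree. The Python sketch below is illustrative only and is not taken from the paper; the black-box policy, the 2-feature state space, and all names in it are hypothetical.

    # Minimal sketch of policy distillation: query a black-box policy for
    # (state, action) pairs, then fit a shallow decision tree whose explicit
    # rules act as a human-readable surrogate. All names here (toy_policy,
    # the 2-feature state space) are hypothetical, not from the paper.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(seed=0)

    def toy_policy(state: np.ndarray) -> int:
        """Hypothetical black-box policy over a 2-feature state."""
        return int(state[0] + 0.5 * state[1] > 0.0)

    # Sample states and record the black-box policy's action for each.
    states = rng.uniform(-1.0, 1.0, size=(1000, 2))
    actions = np.array([toy_policy(s) for s in states])

    # A shallow tree keeps the surrogate small enough to read as if-then rules.
    surrogate = DecisionTreeClassifier(max_depth=3).fit(states, actions)
    print(export_text(surrogate, feature_names=["x0", "x1"]))

The printed if-then rules approximate the policy's behavior; the fidelity of such a surrogate is typically checked by comparing its actions with the original policy's actions on held-out states.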

Cite this article:

Liu X, Liu SY, Zhuang YK, Gao Y. Explainable reinforcement learning: Basic problems exploration and method survey. Ruan Jian Xue Bao/Journal of Software, 2023, 34(5): 2300-2316 (in Chinese).

History:
  • Received: 2021-02-23
  • Revised: 2021-07-16
  • Published online: 2021-10-20
  • Published: 2023-05-06