An Empirical Study of Deep Learning Compiler Bugs: Current Status and Evolution Analysis
Corresponding author:

Chen Junjie, E-mail: junjiechen@tju.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (62322208, 12411530122)


Toward Understanding the Current Status and Evolution of Deep Learning Compiler Bugs
    Abstract:

    Deep learning (DL) compilers are widely used to optimize and deploy deep learning models. Like traditional compilers, DL compilers contain bugs. A buggy DL compiler can fail to compile a model, produce incorrect compilation results, and in some cases lead to disastrous consequences. To understand the characteristics of DL compiler bugs, existing work analyzed 603 early bugs in DL compilers. In recent years, DL compilers have evolved rapidly, introducing many new features and deprecating old ones. Meanwhile, several bug detection tools for DL compilers have been developed. It is therefore necessary to examine whether the conclusions of previous studies on DL compiler bugs still hold. In addition, the relationships among bug symptoms, root causes, and locations have not been explored in depth, and the characteristics of bug-revealing regression tests and bug-fixing patches have not been studied. To analyze the characteristics of current DL compiler bugs and how their distribution has evolved over time, this study collects 613 recently fixed bugs from three mainstream DL compilers (i.e., TVM of Apache, Glow of Facebook, and AKG of Huawei) and manually labels their root causes, symptoms, and locations. Based on the labeling results, this study explores the distribution characteristics of the bugs from multiple perspectives and compares them with those reported in existing work. It also investigates the bug-revealing regression tests and the bug-fixing patches. In total, the study summarizes 12 major findings that characterize the current status and evolution of DL compiler bugs and provide a series of actionable suggestions for detecting, localizing, and repairing them. Finally, to validate the usefulness of these findings, a testing tool named CfgFuzz, based on optimization configurations, is developed. CfgFuzz performs combinatorial testing on compilation configuration options and detects 8 TVM bugs, 7 of which have been confirmed or fixed by developers.
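The idea of combinatorial testing over compilation configuration options can be illustrated with a minimal sketch. It is not CfgFuzz itself, whose design the abstract does not describe. The sketch assumes an exhaustive Cartesian product over a small option space and a differential check against a baseline configuration. The option names (`opt_level`, `fuse_ops`, `fold_constants`) and the `fake_compile_and_run` stand-in with its seeded bugs are hypothetical and do not correspond to real TVM flags or APIs.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Config = Dict[str, object]


def enumerate_configs(options: Dict[str, Sequence[object]]) -> Iterator[Config]:
    """Yield every combination of option values (full Cartesian product)."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def find_suspicious_configs(
    options: Dict[str, Sequence[object]],
    compile_and_run: Callable[[Config], float],
    baseline: Config,
    tolerance: float = 1e-6,
) -> List[Tuple[Config, str]]:
    """Flag configurations that crash or disagree with the baseline output."""
    expected = compile_and_run(baseline)
    reports: List[Tuple[Config, str]] = []
    for config in enumerate_configs(options):
        try:
            actual = compile_and_run(config)
        except Exception as exc:  # a crash is one kind of bug symptom
            reports.append((config, f"crash: {exc}"))
            continue
        if abs(actual - expected) > tolerance:  # wrong result is another
            reports.append((config, f"mismatch: {actual} != {expected}"))
    return reports


# Hypothetical option space; these are NOT real TVM option names.
OPTIONS: Dict[str, Sequence[object]] = {
    "opt_level": [0, 1, 2, 3],
    "fuse_ops": [False, True],
    "fold_constants": [False, True],
}


def fake_compile_and_run(config: Config) -> float:
    """Stand-in for compiling and executing a model, with two seeded bugs."""
    if config["opt_level"] == 3 and config["fuse_ops"]:
        raise RuntimeError("internal error during fusion")
    if config["fold_constants"] and config["opt_level"] >= 2:
        return 41.0  # seeded wrong result
    return 42.0


if __name__ == "__main__":
    baseline = {"opt_level": 0, "fuse_ops": False, "fold_constants": False}
    for config, reason in find_suspicious_configs(
        OPTIONS, fake_compile_and_run, baseline
    ):
        print(config, reason)
```

In practice the stand-in would be replaced by a call into the compiler under test, and a larger option space would likely need pairwise or sampled combinations rather than the full product.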

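Cite this article

Shen Qingchao, Tian Jiashuo, Chen Junjie, Chen Xiang, Chen Qingyan, Wang Zan. Toward Understanding the Current Status and Evolution of Deep Learning Compiler Bugs. Journal of Software, 2025, 36(7): 1-19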

History
  • Received: 2024-08-24
  • Last revised: 2024-10-15
  • Published online: 2024-12-10