An Empirical Study of Deep Learning Compiler Bugs: Current Status and Evolution
Corresponding author: 陈俊洁 (Junjie Chen), E-mail: junjiechen@tju.edu.cn

CLC number (Chinese Library Classification): TP311

Funding: National Natural Science Foundation of China (62322208, 12411530122)


Toward Understanding the Current Status and Evolution of Deep Learning Compiler Bugs
    Abstract:

    Deep learning (DL) compilers are widely used to optimize and deploy DL models. Like traditional compilers, DL compilers contain bugs, and a buggy DL compiler can fail to compile a model, produce incorrect compilation results, or even lead to catastrophic consequences. To understand the characteristics of DL compiler bugs, prior work studied 603 early DL compiler bugs. Since then, DL compilers have evolved rapidly, introducing many new features and deprecating old ones, and several bug detection tools for DL compilers have been developed. It is therefore necessary to examine whether the earlier findings on DL compiler bugs still hold. In addition, the relationships among bug symptoms, root causes, and locations have not been explored in depth, and the characteristics of the regression test cases that trigger bugs and of the patches that fix them have not been studied. To analyze the characteristics of current DL compiler bugs and how their distribution has evolved over time, this paper collects 613 recently fixed bugs from three mainstream DL compilers (Apache TVM, Facebook Glow, and Huawei AKG) and manually labels the root cause, symptom, and location of each bug. Based on the labeled results, this paper examines the distribution of bugs from multiple angles and compares it with the findings of prior work. It also studies the regression test cases that trigger the bugs and the patches that fix them. In total, this paper summarizes 12 major findings that give a comprehensive picture of DL compiler bugs and their evolution, and it provides a series of actionable suggestions for detecting, localizing, and repairing DL compiler bugs. Finally, to evaluate the usefulness of these findings, this paper presents CfgFuzz, a proof-of-concept TVM fuzzing tool based on optimization configurations. CfgFuzz performs combinatorial testing on compilation configuration options and detected 8 TVM bugs, 7 of which have been confirmed or fixed by developers.
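
    The idea of combinatorial testing over compilation configuration options can be illustrated with a minimal sketch. The abstract does not describe how CfgFuzz is implemented, so everything below is an assumption made for illustration: the option names are hypothetical and are not TVM flags, `toy_compile` is a stand-in for a real compiler with a deliberately seeded bug, and the greedy pairwise selection is one common combinatorial strategy rather than the one CfgFuzz necessarily uses.

```python
from itertools import combinations, product

# Hypothetical optimization options (illustrative names, not real TVM flags).
OPTIONS = {
    "opt_level": [0, 1, 2, 3],
    "fuse_ops": [False, True],
    "fold_constants": [False, True],
}


def pairwise_configs(options):
    """Greedily select full configurations until every pair of
    (option, value) assignments appears in at least one of them."""
    names = sorted(options)
    all_configs = [
        dict(zip(names, values))
        for values in product(*(options[n] for n in names))
    ]
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in options[a]
        for vb in options[b]
    }

    def covers(cfg, pair):
        (a, va), (b, vb) = pair
        return cfg[a] == va and cfg[b] == vb

    chosen = []
    while uncovered:
        # Pick the configuration that covers the most still-uncovered pairs.
        best = max(all_configs, key=lambda c: sum(covers(c, p) for p in uncovered))
        chosen.append(best)
        uncovered = {p for p in uncovered if not covers(best, p)}
    return chosen


def toy_compile(config):
    """Stand-in for a compiler (assumption): returns a callable 'compiled model'.
    A miscompilation is seeded for opt_level == 3 combined with fuse_ops."""
    def run(x):
        y = x * 2 + 1
        if config["opt_level"] == 3 and config["fuse_ops"]:
            y += 1  # seeded bug: wrong result under this option combination
        return y
    return run


def find_inconsistent_configs(configs, compile_fn, test_input, reference_config):
    """Differential check: report configurations whose output differs from
    the output obtained under the reference configuration."""
    expected = compile_fn(reference_config)(test_input)
    return [cfg for cfg in configs if compile_fn(cfg)(test_input) != expected]


if __name__ == "__main__":
    reference = {"opt_level": 0, "fuse_ops": False, "fold_constants": False}
    configs = pairwise_configs(OPTIONS)
    for cfg in find_inconsistent_configs(configs, toy_compile, 5, reference):
        print("inconsistent result under", cfg)
```

    Pairwise selection keeps the number of compilations well below the full cross product while still exercising every two-option interaction, which is why the seeded bug (which needs two options set together) is guaranteed to be reached in this toy setting.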

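Cite this article

沈庆超, 田家硕, 陈俊洁, 陈翔, 陈庆燕, 王赞. Toward Understanding the Current Status and Evolution of Deep Learning Compiler Bugs. Journal of Software (软件学报), 2025, 36(7):0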

History
  • Received: 2024-08-24
  • Last revised: 2024-10-15
  • Published online: 2024-12-10