Abstract: Deep learning compilers (DL compilers) are widely used to optimize and deploy deep learning models. Like traditional compilers, DL compilers contain bugs. A buggy DL compiler can cause compilation failures, produce incorrect compilation results, and in some cases lead to disastrous consequences. To understand the characteristics of DL compiler bugs, existing work has analyzed 603 early bugs in DL compilers. In recent years, however, DL compilers have evolved rapidly, introducing many new features and abandoning some old ones, and several bug detection approaches for DL compilers have been developed. It is therefore necessary to examine whether earlier conclusions about DL compiler bugs still hold. In addition, the relationships among the symptoms, root causes, and locations of bugs have not been explored in depth, and the characteristics of bug-revealing tests and bug-fixing patches have not been studied. To analyze how the characteristics and distribution of DL compiler bugs have evolved over time, this study collects 613 recently fixed bugs from three mainstream DL compilers (i.e., TVM of Apache, Glow of Facebook, and AKG of Huawei) and manually labels their root causes, symptoms, and locations. Based on the labeling results, this study explores the distribution of bugs along multiple dimensions and compares it with the distributions reported in existing work. It further investigates the characteristics of bug-revealing regression tests and bug-fixing patches. In total, this study summarizes 12 major findings that give a comprehensive picture of the current state and evolution of DL compiler bugs, and it provides a series of feasible suggestions for the detection, localization, and repair of DL compiler bugs.
Finally, to verify the usefulness of these findings, a testing tool based on optimization configuration, CfgFuzz, is developed. CfgFuzz performs combinatorial testing on compilation configuration options and detects 8 TVM bugs, 7 of which have been confirmed or fixed by developers.
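To make the idea of combinatorial testing over compilation configuration options concrete, the following is a minimal sketch and not CfgFuzz itself. It enumerates every combination of a small option space and flags configurations whose output differs from a baseline. The option names, the `toy_compile_and_run` stand-in with its seeded bug, and the differential oracle are all assumptions made for illustration; the abstract does not specify how CfgFuzz selects combinations or decides that a result is a bug, and no real TVM API is used here.

```python
from itertools import product

# Hypothetical option space; these names are illustrative, not real TVM flags.
OPTION_SPACE = {
    "opt_level": [0, 1, 2, 3],
    "target": ["cpu", "gpu"],
    "fold_constants": [False, True],
}


def enumerate_configs(space):
    """Yield every combination of option values as a dict (full Cartesian product)."""
    names = sorted(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def toy_compile_and_run(config, x):
    """Stand-in for compiling and running a model under `config` (assumption)."""
    result = 2 * x + 1
    # Seeded bug: one specific combination of options yields a wrong result.
    if (config["opt_level"] == 3 and config["fold_constants"]
            and config["target"] == "gpu"):
        result += 1
    return result


def find_inconsistent_configs(space, x, compile_and_run):
    """Differential check: return configs whose output differs from the baseline."""
    configs = list(enumerate_configs(space))
    baseline = compile_and_run(configs[0], x)
    return [c for c in configs[1:] if compile_and_run(c, x) != baseline]


suspicious = find_inconsistent_configs(OPTION_SPACE, 5, toy_compile_and_run)
for config in suspicious:
    print("inconsistent result under:", config)
```

Exhaustive enumeration is only practical for small option spaces. For larger ones, a sampling strategy such as pairwise coverage would be a natural substitute, though whether CfgFuzz uses one is not stated here.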