AutoConfig: Automatic Configuration Mechanism for Deep Learning Compilation Optimization

doi:10.13328/j.cnki.jos.007102


2024, Vol. 35, No. 6, pp. 2668-2686.


Authors:

ZHANG Hong-Bin
University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
ZHOU Xu-Lin
University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
XING Ming-Jie
Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
WU Yan-Jun
Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
ZHAO Chen
Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

About the authors:
ZHANG Hong-Bin (1997-), male, Ph.D. candidate, CCF student member. Main research area: compiler technology.
ZHOU Xu-Lin (2001-), male, master's student, CCF student member. Main research area: compiler technology.
XING Ming-Jie (1980-), male, senior engineer, CCF professional member. Main research area: compiler technology.
WU Yan-Jun (1979-), male, Ph.D., doctoral supervisor, CCF distinguished member. Main research areas: operating systems, system security.
ZHAO Chen (1967-), male, Ph.D., doctoral supervisor, CCF senior member. Main research areas: compiler technology, operating systems, network software.
Corresponding author: WU Yan-Jun, E-mail: yanjun@iscas.ac.cn
Fund Project: National Key Research and Development Program of China (2022YFB4401402)


Abstract:

With the rapid development of deep learning models and hardware architectures, deep learning compilers have been widely adopted. At present, compilation optimization and tuning of deep learning models rely mainly on manual tuning based on high-performance operator libraries and on search-based auto-tuning. However, given the variety of target operators and the need to support multiple hardware platforms, high-performance operator libraries often have to be reimplemented for each architecture. In addition, existing auto-tuning schemes suffer from high search overhead and a lack of interpretability. To address these problems, this study proposes AutoConfig, an automatic configuration mechanism for deep learning compilation optimization. For a given deep learning workload and a specific hardware platform, AutoConfig builds interpretable analysis models of the optimization algorithms, analyzes them by combining static information extraction with dynamic overhead measurement, and, based on the results, uses configurable code generation to select algorithms and tune parameters automatically. The key innovation of AutoConfig is to combine the optimization analysis model with a configurable code generation strategy, which preserves the performance speedup while reducing repeated development effort and simplifying the tuning process. Furthermore, this study integrates AutoConfig into the deep learning compiler Buddy Compiler, builds analysis models for several optimization algorithms for matrix multiplication and convolution, and evaluates the automatically configured code generation strategy on multiple SIMD hardware platforms. The experimental results verify that AutoConfig is effective at parameter configuration and algorithm selection within the code generation strategy. Compared with manually or automatically optimized code, the code generated by AutoConfig achieves comparable execution performance without the repeated implementation overhead of manual tuning or the search overhead of auto-tuning.

Key words: deep learning compiler; compilation optimization; code generation; automatic configuration mechanism
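
The abstract describes a flow in which static information and measured overheads feed an interpretable analysis model, whose result then drives algorithm selection and parameter configuration. The Python sketch below illustrates that general idea only; it is not the analysis model from the paper and does not use Buddy Compiler. All names (`Hardware`, `MatMul`, `measure_overheads`, `configure`), the two candidate algorithms, the cost formulas, and the overhead values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Hardware:
    """Static platform facts (hypothetical fields, not from the paper)."""
    vector_lanes: int  # elements per SIMD register
    l1_bytes: int      # L1 data cache capacity in bytes


@dataclass(frozen=True)
class MatMul:
    """Static workload facts for C[M, N] += A[M, K] * B[K, N]."""
    m: int
    n: int
    k: int
    elem_bytes: int = 4


def measure_overheads() -> Dict[str, float]:
    """Stand-in for dynamic overhead measurement.

    A real system would time micro-kernels on the target machine.
    Fixed, made-up values keep this sketch deterministic.
    """
    return {"scalar_fma": 1.0, "vector_fma": 1.0, "loop": 0.5, "cache_miss": 20.0}


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def cost_scalar(op: MatMul, oh: Dict[str, float]) -> float:
    # One scalar multiply-add per (i, j, k) triple.
    return op.m * op.n * op.k * oh["scalar_fma"]


def cost_vector_tiled(op: MatMul, hw: Hardware, oh: Dict[str, float],
                      tile_n: int) -> float:
    # N is split into strips of width tile_n; each strip is processed
    # with ceil(tile_n / lanes) vector multiply-adds per (i, k) pair.
    strips = ceil_div(op.n, tile_n)
    vec_per_strip = ceil_div(tile_n, hw.vector_lanes)
    iterations = op.m * op.k * strips
    cost = iterations * (vec_per_strip * oh["vector_fma"] + oh["loop"])
    # Assumed working set: a K x tile_n strip of B plus one strip of C.
    working_set = (op.k * tile_n + tile_n) * op.elem_bytes
    if working_set > hw.l1_bytes:
        cost += iterations * oh["cache_miss"]  # crude spill penalty
    return cost


def configure(op: MatMul, hw: Hardware) -> Tuple[str, Dict[str, int], float]:
    """Return (algorithm, parameters, estimated cost) with the lowest cost."""
    oh = measure_overheads()
    candidates = [("scalar", {}, cost_scalar(op, oh))]
    for factor in (1, 2, 4, 8):
        tile_n = factor * hw.vector_lanes
        candidates.append(
            ("vector_tiled", {"tile_n": tile_n},
             cost_vector_tiled(op, hw, oh, tile_n))
        )
    return min(candidates, key=lambda c: c[2])
```

Because every candidate is scored by a closed-form expression, the reason for each choice can be read directly from the cost terms, and no search over compiled variants is needed. Under these assumed numbers, calling `configure(MatMul(64, 64, 64), Hardware(8, 32768))` favors the widest tile, while shrinking `l1_bytes` to 4096 pushes the choice to the narrowest tile, and a matrix-vector shape such as `MatMul(4, 1, 4)` falls back to the scalar variant.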

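Cite this article

ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen. AutoConfig: Automatic Configuration Mechanism for Deep Learning Compilation Optimization. Journal of Software, 2024, 35(6): 2668-2686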


History

Received: 2023-09-11
Last revised: 2023-10-30
Published online: 2024-01-05
Published: 2024-06-06
