Ultra-low Loss Quantization Method for Deep Neural Network Compression
CSTR:
Author:
Affiliation:

Author biographies:

GONG Cheng (1993-), male, Ph.D. candidate, CCF student member. His research interests include neural network compression, high-performance embedded systems, heterogeneous computing, and artificial intelligence.
LIU Fang-Xin (1996-), male, M.S. His research interests include neural network compression, heterogeneous computing, and artificial intelligence.
LU Ye (1986-), male, Ph.D., associate professor, CCF professional member. His research interests include neural network compression, high-performance embedded systems, heterogeneous computing, and artificial intelligence.
CHEN Xin-Wei (1984-), male, Ph.D., associate professor. His research interests include robot control technology, industrial vision systems, and mobile robot systems.
DAI Su-Rong (1997-), female, M.S. candidate, CCF student member. Her research interests include neural network compression, machine learning, and heterogeneous computing.
LI Tao (1977-), male, Ph.D., professor, doctoral supervisor, CCF distinguished member. His research interests include heterogeneous computing, machine learning, and the Internet of Things.

Corresponding author:

LU Ye, E-mail: luye@nankai.edu.cn

CLC number:

TP181

Fund projects:

National Key Research and Development Program of China (2018YFB2100300); National Natural Science Foundation of China (62002175, 61872200); Natural Science Foundation of Tianjin (19JCZDJC31600, 19JCQNJC00600); Open Project of the State Key Laboratory of Computer Architecture (Institute of Computing Technology, Chinese Academy of Sciences) (CARCHB202016, CARCH201905); Industry-University-Research Innovation Fund of Chinese Universities (2020HYA01003); Open Fund of the Fujian Provincial University Engineering Research Center for Industrial Robot Application (Minjiang University) (MJUKF-IRA1902)


    Abstract:

Deep neural network (DNN) quantization is an efficient model compression method that represents the parameters and intermediate results of a model with a small bit width. The data bit width directly affects memory footprint, computing efficiency, and energy consumption. Previous research on model quantization lacks effective quantitative analysis, which makes the quantization loss of these methods hard to predict. This study proposes an ultra-low loss quantization (μL2Q) method for DNN compression, which reveals the inherent relationship between quantization bit width and quantization loss, thereby guiding the selection of the quantization bit width and reducing the quantization loss. First, the original data are mapped to data with a standard normal distribution; then, the optimal quantization parameters are searched within equal-width quantization intervals under the target bit width. Finally, μL2Q is integrated into the DNN training process and embedded into the popular deep learning frameworks Caffe and Keras to support the design and training of end-to-end compressed models. Experimental results show that, compared with state-of-the-art quantization methods under the same bit width, μL2Q preserves higher model accuracy, improving accuracy by 1.94%, 3.73%, and 8.24% on typical neural network models. Salient object detection experiments further verify that μL2Q is competent for more complex computer vision tasks.
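
The two-step recipe in the abstract (map the data to a standard normal distribution, then search equal-width quantization intervals for the parameters that minimize the loss) can be sketched in a few lines of NumPy. The snippet below is only an illustration of that idea: the function names, the symmetric placement of the quantization levels, and the grid search over the interval width are assumptions made here, not the authors' released Caffe/Keras implementation.

# Illustrative sketch of standardize-then-uniformly-quantize; names and the
# grid-search procedure are assumptions, not the paper's published code.
import numpy as np

def ul2q_quantize(weights, bits, alpha):
    # Map the data to an (approximately) standard normal distribution.
    mean, std = weights.mean(), weights.std()
    w = (weights - mean) / std
    levels = 2 ** bits
    # Symmetric uniform quantization: `levels` values spaced `alpha` apart around zero.
    q = np.clip(np.round(w / alpha + (levels - 1) / 2.0), 0, levels - 1)
    w_hat = (q - (levels - 1) / 2.0) * alpha
    # Map the quantized values back to the original scale.
    return w_hat * std + mean

def search_alpha(weights, bits, grid=np.linspace(0.05, 2.0, 200)):
    # Search the interval width that minimizes the L2 quantization loss.
    losses = [np.mean((weights - ul2q_quantize(weights, bits, a)) ** 2)
              for a in grid]
    return grid[int(np.argmin(losses))]

if __name__ == "__main__":
    w = np.random.randn(10000).astype(np.float32)
    alpha = search_alpha(w, bits=2)
    w_q = ul2q_quantize(w, bits=2, alpha=alpha)
    print("best interval width:", alpha)
    print("L2 quantization loss:", np.mean((w - w_q) ** 2))

With bits=2, for example, the standardized weights are rounded onto four uniformly spaced levels and the interval width with the smallest mean squared error is kept; the same search can be repeated for any target bit width.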

Cite this article:

GONG Cheng, LU Ye, DAI Su-Rong, LIU Fang-Xin, CHEN Xin-Wei, LI Tao. Ultra-low loss quantization method for deep neural network compression. Journal of Software, 2021, 32(8): 2391-2407.

History
  • Received: 2020-07-21
  • Revised: 2020-09-07
  • Accepted:
  • Published online: 2021-02-07
  • Published: 2021-08-06