Survey of Deep Neural Network Model Compression

Author biographies:

LEI Jie (1991-), male, a native of Xiantao, Hubei, is a Ph.D. candidate; his research interests include computer vision and deep learning. WANG Xing-Lu (1996-), male, is an undergraduate student; his research interests include computer vision and deep learning. GAO Xin (1992-), male, is a master's student; his research interests include computer vision and deep learning. SONG Ming-Li (1976-), male, Ph.D., is a professor, Ph.D. supervisor, and CCF professional member; his research interests include computer vision and deep learning. SONG Jie (1991-), male, is a Ph.D. candidate; his research interests include computer vision and deep learning.

Corresponding author:

SONG Ming-Li, E-mail: brooksong@zju.edu.cn

Fund project:

National Natural Science Foundation of China (61572428, U1509206)


    Abstract:

    Deep neural networks have continually surpassed traditional methods on a variety of computer vision tasks and have gradually become a focus of research. Powerful as they are, deep neural networks carry a huge number of weights, consuming considerable storage and computation time, which makes them hard to deploy on resource-constrained hardware platforms such as mobile devices. The number of weights reflects model complexity to some extent, but recent research shows that not all weights contribute to performance: some are redundant in expression and may even degrade performance. This survey systematically classifies the recent achievements of domestic and foreign researchers on deep model compression into methods based on network pruning, network distillation, and network decomposition. Furthermore, the compression results of these methods are compared on several public deep neural network models. Finally, possible directions and challenges for future research are discussed.
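    To make the pruning family concrete, the following is a minimal sketch of magnitude-based weight pruning in the spirit of Han et al. [22], in which the smallest-magnitude weights of a trained layer are zeroed out. This is an illustrative example rather than code from any surveyed paper; the function name and NumPy setup are assumptions.

```python
# A minimal sketch of magnitude-based weight pruning (cf. Han et al. [22]).
# Hypothetical illustration, not code from the surveyed papers.
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that a `sparsity`
    fraction (e.g. 0.9) of the weights become zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))                  # a dense layer's weight matrix
w_pruned = magnitude_prune(w, sparsity=0.9)
print(1 - np.count_nonzero(w_pruned) / w.size)   # roughly 0.9 of entries removed
```

    In practice, [22] alternates pruning with retraining so that the remaining weights can compensate for the removed connections.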
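    Network distillation transfers the knowledge of a large teacher network to a small student by matching softened output distributions (Hinton et al. [33]). The sketch below computes the distillation loss for a single example; the temperature and T^2 scaling follow [33], but the code itself is only an assumed illustration with invented logit values.

```python
# A minimal sketch of the distillation loss of Hinton et al. [33].
# Single-example, NumPy-only illustration; all names are our own.
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=4.0):
    """Cross-entropy between teacher and student outputs softened by
    temperature T, scaled by T^2 as suggested in [33]."""
    p = softmax(teacher_logits, T)   # soft targets from the teacher
    q = softmax(student_logits, T)   # softened student predictions
    return -(T ** 2) * np.sum(p * np.log(q + 1e-12))

teacher = np.array([8.0, 2.0, 0.5])  # logits of the large teacher model
student = np.array([5.0, 2.5, 1.0])  # logits of the small student model
print(distillation_loss(teacher, student))
```

    In full training this term is usually combined with the ordinary cross-entropy against the ground-truth labels.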
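    Network decomposition exploits low-rank structure in weight matrices (cf. [42,45]): a single m x n layer is replaced by two thinner layers of rank k, cutting parameters from m*n to k(m+n). Below is a hedged NumPy sketch using truncated SVD; real methods fine-tune the network after factorizing.

```python
# A minimal sketch of low-rank weight decomposition via truncated SVD
# (cf. [42,45]). Illustrative only; not any surveyed paper's implementation.
import numpy as np

def low_rank_factorize(W, k):
    """Return A (m x k) and B (k x n) with A @ B approximating W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]             # absorb singular values into A
    B = Vt[:k, :]
    return A, B

W = np.random.default_rng(0).normal(size=(1024, 1024))
A, B = low_rank_factorize(W, k=64)
print(W.size, A.size + B.size)       # 1048576 vs. 131072 parameters
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(err)  # large here: a random matrix has no low-rank structure,
            # whereas trained weights are far more compressible
```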

    References
    [1] Le Cun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015,521(7553):436-444.[doi:10.1038/nature14539]
    [2] Plamondon R, Srihari SN. Online and off-line handwriting recognition:A comprehensive survey. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2000,22(1):63-84.[doi:10.1109/34.824821]
    [3] Wan J, Wang D, Hoi SCH, Wu P, Zhu J, Zhang Y, Li J. Deep learning for content-based image retrieval:A comprehensive study. In:Proc. of the 22nd ACM Int'l Conf. on Multimedia (MM). Orlando:ACM Press, 2014. 157-166.[doi:10.1145/2647868.2654948]
    [4] Girshick R. Fast R-CNN. In:Proc. of the IEEE Int'l Conf. on Computer Vision (ICCV). Santiago:IEEE, 2015. 1440-1448.[doi:10.1109/iccv.2015.169]
    [5] Wang N, Yeung DY. Learning a deep compact image representation for visual tracking. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Tahoe:IEEE, 2013. 809-817.
    [6] Severyn A, Moschitti A. Learning to rank short text pairs with convolutional deep neural networks. In:Proc. of the 38th Int'l ACM SIGIR Conf. on Research and Development in Information Retrieval. Santiago:ACM Press, 2015. 373-382.[doi:10.1145/2766462.2767738]
    [7] Ngiam J, Coates A, Lahiri A, Prochnow B, Ng AY. On optimization methods for deep learning. In:Proc. of the 28th Int'l Conf. on Machine Learning (ICML). Bellevue:ACM Press, 2011. 265-272.
    [8] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Tahoe:IEEE, 2012. 1097-1105.
    [9] Sercu T, Puhrsch C, Kingsbury B, Le Cun Y. Very deep multilingual convolutional neural networks for LVCSR. In:Proc. of the Int'l Conf. on Acoustics, Speech and Signal Processing (ICASSP). Shanghai:IEEE, 2016. 4955-4959.[doi:10.1109/icassp.2016.7472620]
    [10] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Las Vegas:IEEE, 2016. 770-778.[doi:10.1109/cvpr.2016.90]
    [11] Setiono R, Liu H. Neural-Network feature selector. IEEE Trans. on Neural Networks, 1997,8(3):654-662.[doi:10.1109/72.572104]
    [12] Hanson SJ, Pratt LY. Comparing biases for minimal network construction with back-propagation. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Denver:IEEE, 1989. 177-185.
    [13] Whitley D, Starkweather T, Bogart C. Genetic algorithms and neural networks:Optimizing connections and connectivity. Parallel Computing, 1990,14(3):347-361.[doi:10.1016/0167-8191(90)90086-o]
    [14] Oberman SF, Flynn MJ. Design issues in division and other floating-point operations. IEEE Trans. on Computers, 1997,46(2):154-161.[doi:10.1109/12.565590]
    [15] Anwar S, Sung WY. Coarse pruning of convolutional neural networks with random masks. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 134-145.
    [16] Le Cun Y, Denker JS, Solla SA. Optimal brain damage. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Denver:IEEE, 1989. 598-605.
    [17] Rosenblueth E. Point estimates for probability moments. Proc. of the National Academy of Sciences, 1975,72(10):3812-3814.[doi:10.1073/pnas.72.10.3812]
    [18] Hassibi B, Stork DG, Wolff GJ. Optimal brain surgeon and general network pruning. In:Proc. of the Int'l Conf. on Neural Networks (ICNN). San Francisco:IEEE, 1993. 293-299.[doi:10.1109/icnn.1993.298572]
    [19] Hassibi B, Stork DG. Second order derivatives for network pruning:Optimal brain surgeon. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Denver:IEEE, 1993. 164-171.
    [20] Srinivas S, Babu RV. Data-Free parameter pruning for deep neural networks. In:Proc. of the 26th British Machine Vision Conf. (BMVC). Swansea:IEEE, 2015. 120-129.[doi:10.5244/c.29.31]
    [21] Han S, Mao H, Dally WJ. Deep compression:Compressing deep neural networks with pruning, trained quantization and Huffman coding. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). San Juan:IEEE, 2016. 233-242.
    [22] Han S, Pool J, Tran J, Dally WJ. Learning both weights and connections for efficient neural network. In:Proc. of the Advances in Neural Information Processing Systems. Montreal:IEEE, 2015. 1135-1143.
    [23] Anwar S, Hwang K, Sung W. Structured pruning of deep convolutional neural networks. ACM Journal on Emerging Technologies in Computing Systems (JETC), 2017,13(3):Article No.32.[doi:10.1145/3005348]
    [24] Li H, Kadav A, Durdanovic I, Samet H, Graf HP. Pruning filters for efficient ConvNets. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 34-42.
    [25] Polyak A, Wolf L. Channel-Level acceleration of deep face representations. IEEE Access, 2015,3:2163-2175.[doi:10.1109/access.2015.2494536]
    [26] Figurnov M, Ibraimova A, Vetrov DP, Kohli P. PerforatedCNNs:Acceleration through elimination of redundant convolutions. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Barcelona:IEEE, 2016. 947-955.
    [27] Hu H, Peng R, Tai YW, Tang CK. Network trimming:A data-driven neuron pruning approach towards efficient deep architectures. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 214-222.
    [28] Molchanov P, Tyree S, Karras T, Aila T, Kautz J. Pruning convolutional neural networks for resource efficient transfer learning. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 324-332.
    [29] Rueda FM, Grzeszick R, Fink GA. Neuron pruning for compressing deep networks using maxout architectures. In:Proc. of the German Conf. on Pattern Recognition (GCPR). Saarbrücken:Springer-Verlag, 2017. 110-120.[doi:10.1007/978-3-319-66709-6_15]
    [30] Zhou ZH. Rule extraction:Using neural networks or for neural networks? Journal of Computer Science and Technology, 2004, 19(2):249-253.[doi:10.1007/BF02944803]
    [31] Zhou ZH, Jiang Y. NeC4.5:Neural ensemble based C4.5. IEEE Trans. on Knowledge and Data Engineering, 2004,16(6):770-773.
    [32] Buciluǎ C, Caruana R, Niculescu-Mizil A. Model compression. In:Proc. of the 12th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. Philadelphia:ACM Press, 2006. 535-541.[doi:10.1145/1150402.1150464]
    [33] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Montreal:IEEE, 2014. 2644-2652.
    [34] Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans. on Knowledge and Data Engineering, 2010,22(10):1345-1359.[doi:10.1109/TKDE.2009.191]
    [35] Lowd D, Domingos P. Naive Bayes models for probability estimation. In:Proc. of the 22nd Int'l Conf. on Machine Learning (ICML). Bonn:ACM Press, 2005. 529-536.[doi:10.1145/1102351.1102418]
    [36] Ba J, Caruana R. Do deep nets really need to be deep? In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Montreal:IEEE, 2014. 2654-2662.
    [37] Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y. FitNets:Hints for thin deep nets. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 124-133.
    [38] Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Boston:IEEE, 2015. 1-9.[doi:10.1109/cvpr.2015.7298594]
    [39] Chen T, Goodfellow I, Shlens J. Net2Net:Accelerating learning via knowledge transfer. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). San Juan:IEEE, 2016. 27-35.
    [40] Li Z, Hoiem D. Learning without forgetting. In:Proc. of the European Conf. on Computer Vision (ECCV). Amsterdam:Springer Int'l Publishing, 2016. 614-629.[doi:10.1007/978-3-319-46493-0_37]
    [41] He ZF, Yang M, Liu HD. Joint learning of multi-label classification and label correlations. Ruan Jian Xue Bao/Journal of Software, 2014,25(9):1967-1981(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4634.htm[doi:10.13328/j.cnki.jos.004634]
    [42] Golub GH, Reinsch C. Singular value decomposition and least squares solutions. Numerische Mathematik, 1970,14(5):403-420.[doi:10.1007/BF02163027]
    [43] Zhang M, Ge WH. Overlap bicluster algorithm based on probability. Computer Engineering and Design, 2012,33(9):3579-3583(in Chinese with English abstract).[doi:10.16208/j.issn1000-7024.2012.09.046]
    [44] Jaderberg M, Vedaldi A, Zisserman A. Speeding up convolutional neural networks with low rank expansions. In:Proc. of the 26th British Machine Vision Conf. (BMVC). Swansea:IEEE, 2015. 100-109.[doi:10.5244/c.28.88]
    [45] Denton EL, Zaremba W, Bruna J, Le Cun Y, Fergus R. Exploiting linear structure within convolutional networks for efficient evaluation. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Montreal:IEEE, 2014. 1269-1277.
    [46] Liu B, Wang M, Foroosh H, Tappen M, Penksy M. Sparse convolutional neural networks. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Boston:IEEE, 2015. 806-814.[doi:10.1109/cvpr.2015.7298681]
    [47] Courbariaux M, Bengio Y, David JP. BinaryConnect:Training deep neural networks with binary weights during propagations. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). Montreal:IEEE, 2015. 3123-3131.
    [48] Gong Y, Liu L, Yang M, Bourdev L. Compressing deep convolutional networks using vector quantization. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). Toronto:IEEE, 2015. 102-110.
    [49] Lee H, Battle A, Raina R, Ng AY. Efficient sparse coding algorithms. In:Proc. of the Advances in Neural Information Processing Systems (NIPS). IEEE, 2007. 789-801.
    [50] Mairal J, Bach F, Ponce J, Sapiro G. Online dictionary learning for sparse coding. In:Proc. of the 26th Annual Int'l Conf. on Machine Learning (ICML). Montreal:ACM Press, 2009. 689-696.[doi:10.1145/1553374.1553463]
    [51] Zhou A, Yao A, Guo Y, Xu L, Chen Y. Incremental network quantization:Towards lossless CNNs with low-precision weights. In:Proc. of the Int'l Conf. on Learning Representations (ICLR). IEEE, 2017. 154-162.
    [52] Monmasson E, Cirstea MN. FPGA design methodology for industrial control systems-A review. IEEE Trans. on Industrial Electronics, 2007,54(4):1824-1842.[doi:10.1109/tie.2007.898281]
    [53] Gupta S, Agrawal A, Gopalakrishnan K, Narayanan P. Deep learning with limited numerical precision. In:Proc. of the Int'l Conf. on Machine Learning (ICML). Lille:ACM Press, 2015. 1737-1746.
    [54] Antipov G, Berrani SA, Dugelay JL. Minimalistic CNN-based ensemble model for gender prediction from face images. Pattern Recognition Letters, 2016,70:59-65.[doi:10.1016/j.patrec.2015.11.011]
    Appendix: Chinese references:
    [41] 何志芬,杨明,刘会东.多标记分类和标记相关性的联合学习.软件学报,2014,25(9):1967-1981. http://www.jos.org.cn/1000-9825/4634.htm[doi:10.13328/j.cnki.jos.004634]
    [43] 张敏,戈文航.基于概率计算的重叠双聚类算法.计算机工程与设计,2012,33(9):3579-3583.[doi:10.16208/j.issn1000-7024.2012.09.046]
Cite this article:

Lei J, Gao X, Song J, Wang XL, Song ML. Survey of deep neural network model compression. Ruan Jian Xue Bao/Journal of Software, 2018,29(2):251-266 (in Chinese with English abstract).

History
  • Received: 2017-05-02
  • Revised: 2017-07-24
  • Published online: 2017-11-29