[关键词]
[摘要]
随着训练可用数据量的增长与计算平台处理能力的增强,基于深度学习的智能模型能够完成越来越复杂的任务,其在计算机视觉、自然语言处理等人工智能领域已经取得重大的突破.然而,这些深度模型具有庞大的参数规模,与此相伴的可畏的计算开销与内存需求使其在计算能力受限平台(例如移动嵌入式设备)的部署中遇到了巨大的困难与挑战.因此,如何在不影响深度学习模型性能的情况下进行模型压缩与加速成为研究热点.首先对国内外学者提出的经典深度学习模型压缩与加速方法进行分析,从参数剪枝、参数量化、紧凑网络、知识蒸馏、低秩分解、参数共享和混合方式这7个方面分类总结;其次,总结对比几种主流技术的代表性方法在多个公开模型上的压缩与加速效果;最后,对于模型压缩与加速领域的未来研究方向加以展望.
[Key word]
[Abstract]
With the development of the amount of data available for training and the processing power of new computing platform, the intelligent model based on deep learning can accomplish more and more complex tasks, and it has made major breakthroughs in the field of AI such as computer vision and natural language processing. However, the large number of parameters of these deep models bring awesome computational overhead and memory requirements, which makes the big models must face great difficulties and challenges in the deployment of computing-capable platforms (such as mobile embedded devices). Therefore, model compression and acceleration without affecting the performance have become a research hotspot. This study first analyzes the classical deep learning model compression and acceleration methods proposed by domestic and international scholars, and summarize seven aspects:Parameter pruning, parameter quantization, compact network, knowledge distillation, low-rank decomposition, parameter sharing, and hybrid methods. Secondly, the compression and acceleration performance of several mainstream representative methods is compared on multiple public models. Finally, the future research directions in the field of model compression and acceleration are discussed.
[中图分类号]
[基金项目]
国家自然科学基金(61872180,61872176);江苏省“双创计划”;江苏省“六大人才高峰”高层次人才项目(B类);蚂蚁金服科研基金;中央高校基本科研业务费专项资金(14380069)