[Keywords]
[Abstract]
Source code summaries help software developers understand code quickly and help maintainers complete maintenance tasks faster. However, writing summaries by hand is costly and inefficient, so researchers have sought to generate source code summaries automatically. In recent years, neural network-based code summarization has become the mainstream approach to automatic source code summarization and a research hotspot in software engineering. This paper first describes the concept of code summarization and the definition of automatic code summarization, reviews the development of automatic code summarization techniques, and introduces the methods and metrics used to evaluate the quality of generated summaries. It then analyzes the general structure, workflow, and main challenges of neural code summarization algorithms. Next, it presents a classification of representative algorithms and analyzes the design principles, characteristics, and limitations of each category. Finally, it discusses future trends and research directions for neural code summarization techniques.
[CLC number]
TP311
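[Funding]
National Key Research and Development Program of China (2019YFB1705902); National Natural Science Foundation of China (61972013, 61932007); Ministry of Education Industry-University Cooperative Education Program (201901195001)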