Research Progress of Code Naturalness and Its Application
Author biographies:

CHEN Zhezhe (1997-), female, bachelor's degree. Her research interests include intelligent software engineering and mining software repositories.
LIU Zhongxin (1994-), male, CCF professional member. His research interests include intelligent software engineering and automatic generation of software documentation.
YAN Meng (1989-), male, Ph.D., research fellow, doctoral supervisor, CCF professional member. His research interests include intelligent software engineering, mining software repositories, and software maintenance and evolution.
XU Zhou (1990-), male, Ph.D., assistant research fellow, CCF professional member. His research interests include mining software repositories and software defect prediction.
XIA Xin (1986-), male, Ph.D., lecturer, doctoral supervisor, CCF professional member. His research interests include mining software repositories and empirical software engineering.
LEI Yan (1985-), male, Ph.D., associate professor, CCF professional member. His research interests include software fault localization and automated program repair.

Corresponding author:

YAN Meng, E-mail: mengy@cqu.edu.cn

Funding:

National Natural Science Foundation of China (62002034); Fundamental Research Funds for the Central Universities (2020CDCGRJ072, 2020CDJQYA021, 2021CDJKYJH032); National Defense Basic Scientific Research Program (WDZC20205500308); China Postdoctoral Science Foundation (2020M673137); Natural Science Foundation of Chongqing (cstc2020jcyj-bshX0114)



Abstract:

    The study of code naturalness is a common research hotspot in the fields of natural language processing and software engineering. It aims to solve various software engineering tasks by building code naturalness models based on natural language processing techniques. In recent years, as the amount of source code and data in open-source software communities has continued to grow, more and more researchers have focused on the information contained in source code, and a series of research results have been achieved. At the same time, however, code naturalness research faces many challenges in code corpus construction, model building, and task application. In view of this, this paper reviews and summarizes recent progress in code naturalness research and its applications in terms of code corpus construction, model construction, and task application. The main contents include: (1) introducing the basic concept of code naturalness and an overview of its research; (2) summarizing the corpora currently used in code naturalness research, and classifying and summarizing the modeling methods for code naturalness; (3) summarizing the experimental validation methods and evaluation metrics for code naturalness models; (4) summarizing and categorizing the current applications of code naturalness; (5) identifying the key open issues of code naturalness techniques; (6) discussing prospects for the future development of code naturalness techniques.
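The naturalness hypothesis is typically operationalized with an n-gram language model: a model trained on a corpus of code assigns low cross-entropy to token sequences it finds predictable, and that score is what downstream tasks (defect detection, completion, syntax-error localization) consume. The sketch below is a minimal illustrative bigram example, not the survey's own method; the toy token corpus and add-one smoothing are assumptions chosen purely for demonstration.

```python
from collections import Counter
from math import log2

def train_bigram(tokens):
    # Count unigram and bigram occurrences over a training token stream.
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def cross_entropy(tokens, unigrams, bigrams, vocab_size, alpha=1.0):
    # Average -log2 P(token | previous token) with add-alpha smoothing.
    # Lower values mean the sequence looks more "natural" to the model.
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * vocab_size)
        total += -log2(p)
    return total / (len(tokens) - 1)

# Toy "corpus": the token stream of one tiny snippet (illustrative only).
corpus = "if ( x > 0 ) { return x ; } else { return 0 ; }".split()
uni, bi = train_bigram(corpus)
vocab = len(uni)

# A token order the model has seen scores lower (more natural)
# than the same tokens scrambled.
print(cross_entropy("( x > 0 )".split(), uni, bi, vocab))
print(cross_entropy(") 0 > x (".split(), uni, bi, vocab))
```

In practice the surveyed work trains far larger n-gram or neural models on millions of files, but the scoring principle is the same: per-token cross-entropy against a corpus-trained model.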

Cite this article:

Chen ZZ, Yan M, Xia X, Liu ZX, Xu Z, Lei Y. Research progress of code naturalness and its application. Ruan Jian Xue Bao/Journal of Software, 2022, 33(8): 3015-3034 (in Chinese with English abstract).

History
  • Received: 2021-01-29
  • Revised: 2021-04-14
  • Published online: 2021-05-21
  • Issue date: 2022-08-06