Research Progress of Pre-trained Models in Software Engineering
Authors: 宫丽娜, 周易人, 乔羽, 姜淑娟, 魏明强, 黄志球
Funding: National Natural Science Foundation of China (62202223); Natural Science Foundation of Jiangsu Province (BK20220881); Open Fund of the Key Laboratory of Safety-Critical Software Development and Verification Technology (Nanjing University of Aeronautics and Astronautics), Ministry of Industry and Information Technology (NJ2022027)
    Abstract:

    In recent years, deep learning has achieved excellent performance on software engineering (SE) tasks. As is well known, such performance depends on large-scale training sets, and collecting and labeling large-scale training sets consume substantial resources and costs, which limits the wide application of deep learning techniques in practical tasks. With the release of pre-trained models (PTMs) in the field of deep learning, introducing PTMs into SE tasks has attracted wide attention from SE researchers worldwide and has brought a qualitative leap in performance, ushering intelligent software engineering into a new era. However, no existing study has distilled the successes, failures, and opportunities of pre-trained models in SE. To clarify the work in this cross-disciplinary field (pre-trained models for software engineering, PTM4SE), this study systematically reviews the current research on intelligent software engineering based on pre-trained models. Specifically, it first presents a framework for PTM-based intelligent software engineering methods, then analyzes and discusses the pre-trained model techniques commonly used in SE, introduces in detail the downstream SE tasks that employ pre-trained models, and compares and analyzes the performance of pre-trained model techniques on these tasks. It then describes the SE datasets commonly used for training and fine-tuning PTMs. Finally, it discusses the challenges and opportunities of applying PTMs in SE. The collated PTMs and commonly used SE datasets are published at https://github.com/OpenSELab/PTM4SE.
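    To make the pre-train-then-fine-tune framework referred to in the abstract concrete, the following is a minimal sketch, assuming the Hugging Face Transformers library and the public microsoft/codebert-base checkpoint (CodeBERT): the pre-trained model is reused as-is and only fine-tuned on a small labeled downstream SE task (here a toy binary code-classification task). The two-example dataset, label meanings, and hyperparameters are illustrative assumptions, not anything prescribed by the surveyed work.

```python
# Minimal pre-train/fine-tune sketch: load a pre-trained code model (CodeBERT)
# and fine-tune it on a toy labeled SE dataset (1 = potentially defective).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=2)   # new, randomly initialized task head

# Two illustrative snippets standing in for a real downstream SE dataset.
snippets = ["int div(int a, int b) { return a / b; }",
            "int add(int a, int b) { return a + b; }"]
labels = torch.tensor([1, 0])

enc = tokenizer(snippets, padding=True, truncation=True,
                max_length=256, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):                              # a few epochs typically suffice for fine-tuning
    optimizer.zero_grad()
    out = model(**enc, labels=labels)           # cross-entropy loss computed internally
    out.loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    pred = model(**enc).logits.argmax(dim=-1)
print(pred.tolist())                            # predicted labels for the two snippets
```

    The same recipe transfers to other downstream SE tasks mentioned in the abstract (e.g., defect prediction or issue classification) by swapping in the corresponding labeled dataset and task head, which is what makes PTMs attractive when large-scale labeled training sets are costly to collect.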

Cite this article:

宫丽娜, 周易人, 乔羽, 姜淑娟, 魏明强, 黄志球. Research progress of pre-trained models in software engineering. Ruan Jian Xue Bao/Journal of Software, 2025, 36(1): 1–26 (in Chinese with English abstract).

History
  • Received: 2023-02-06
  • Revised: 2023-06-21
  • Published online: 2024-06-18
  • Published in print: 2025-01-06