LIU Zhong-Xin
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
TANG Zhi-Jie
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
XIA Xin
Software Engineering Application Technology Lab, Huawei Technologies Co. Ltd., Hangzhou 310007, China
LI Shan-Ping
College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Code change is a key activity in software evolution, and its quality has a large impact on software quality. Modeling and representing code changes is the basis of many software engineering tasks, such as just-in-time defect prediction and recovery of software product traceability. Representation learning technologies for code changes have attracted extensive attention in recent years and have been applied to diverse applications. These technologies aim to represent the semantic information in code changes as low-dimensional, dense, real-valued vectors, that is, to learn distributed representations of code changes. Compared with conventional methods that rely on manually designed code change features, such technologies offer the advantages of automatic learning, end-to-end training, and accurate representation. However, the field still faces challenges, such as the difficulty of utilizing structural information and the absence of benchmark datasets. This study surveys and summarizes recent progress in the research and application of representation learning technologies for code changes. It consists of four parts. (1) It presents the general framework of representation learning for code changes and its application. (2) It reviews the currently available representation learning technologies for code changes and summarizes their respective advantages and disadvantages. (3) It summarizes and classifies the downstream applications of such technologies. (4) It discusses the challenges and potential opportunities facing representation learning technologies for code changes and suggests directions for their future development.
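LIU Zhong-Xin, TANG Zhi-Jie, XIA Xin, LI Shan-Ping. Research progress on representation learning of code changes and its application. Journal of Software, 2023, 34(12): 5501-5526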