Abstract: As a crucial part of automated code review, code refinement is of great significance for improving development efficiency and code quality. Since large language models (LLMs) have substantially outperformed traditional small-scale pre-trained models across software engineering tasks, this study investigates how these two classes of models compare on automatic code refinement, so as to assess the overall advantages of LLMs. We evaluate four mainstream LLMs and four representative small-scale pre-trained models using traditional code quality metrics (e.g., BLEU, CodeBLEU, and edit progress), and find that on the pre-review code refinement subtask the refinement quality of LLMs is inferior to that of small-scale pre-trained models. Because the existing code quality metrics cannot explain this phenomenon, this study proposes Unidiff-based code refinement metrics that quantify the change operations performed during refinement, in order to account for the inferior performance and reveal the models' tendencies when making changes: (1) pre-review code refinement is intrinsically difficult, all models have very low accuracy in performing correct change operations, and LLMs are more "aggressive" than small-scale pre-trained models, that is, they tend to perform more code change operations, which degrades their performance; (2) compared with small-scale pre-trained models, LLMs perform more ADD and MODIFY change operations, and their ADD operations insert more code lines on average, further evidencing this "aggressive" behavior. To mitigate the weakness of LLMs on the pre-review refinement task, this study introduces LLM-Voter, a method that combines LLMs with ensemble learning and comprises two sub-schemes, Inference-based and Confidence-based, to integrate the strengths of different base models and improve refinement quality. On this basis, a refinement determination mechanism is further introduced to enhance the stability and reliability of the model's decisions. Experimental results demonstrate that the Confidence-based LLM-Voter significantly improves the exact match (EM) score and achieves refinement quality superior to all base models, thereby effectively mitigating the disadvantage of LLMs.
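To make the Unidiff-based metrics concrete, the sketch below is an illustration rather than the paper's exact definitions: it counts ADD, DELETE, and MODIFY change operations between the code before refinement and a model's output, along with the average number of lines inserted per ADD operation, using Python's standard difflib to derive line-level edit opcodes.

```python
# Illustrative sketch (not the paper's exact metric definitions) of counting
# change operations between pre-refinement code and a model's refined output.
# difflib opcode tags are mapped as: 'insert' -> ADD, 'delete' -> DELETE,
# 'replace' -> MODIFY; 'equal' spans are unchanged code and are skipped.
import difflib
from collections import Counter

def change_operations(before: str, after: str):
    """Count ADD/DELETE/MODIFY operations and average lines per ADD."""
    a, b = before.splitlines(), after.splitlines()
    ops = Counter()
    added_lines = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=a, b=b).get_opcodes():
        if tag == "insert":
            ops["ADD"] += 1
            added_lines.append(j2 - j1)  # lines inserted by this ADD
        elif tag == "delete":
            ops["DELETE"] += 1
        elif tag == "replace":
            ops["MODIFY"] += 1
    avg_add = sum(added_lines) / len(added_lines) if added_lines else 0.0
    return ops, avg_add

before = "def add(a, b):\n    return a + b\n"
after = ("def add(a, b):\n    # handle None inputs\n"
         "    if a is None or b is None:\n        raise ValueError\n"
         "    return a + b\n")
ops, avg_add = change_operations(before, after)
print(ops, f"avg lines per ADD = {avg_add:.1f}")  # ADD: 1, avg 3.0
```

Under these metrics, a more "aggressive" model simply produces diffs with more (and larger) ADD and MODIFY opcodes against the same input.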
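Similarly, the following hypothetical sketch illustrates the Confidence-based voting idea. The scoring and voting rules here are assumptions made for illustration, since the abstract does not specify them: each base model contributes a candidate refinement with a confidence score, identical candidates pool their confidence, and a refinement-determination threshold decides whether to emit a change at all or keep the original code.

```python
# Hypothetical sketch of Confidence-based LLM-Voter. The confidence score
# (e.g., mean token log-probability mapped to 0..1) and the pooling rule
# below are illustrative assumptions, not the paper's specification.
from dataclasses import dataclass

@dataclass
class Candidate:
    model: str         # base model name
    refinement: str    # proposed refined code
    confidence: float  # higher = more confident (assumed 0..1 scale)

def llm_voter(original: str, candidates: list[Candidate],
              threshold: float = 0.5) -> str:
    """Return the winning refinement, or the original code when no
    candidate clears the refinement-determination threshold."""
    # Identical proposals pool their confidence (a simple voting rule).
    pooled: dict[str, float] = {}
    for c in candidates:
        pooled[c.refinement] = pooled.get(c.refinement, 0.0) + c.confidence
    best, score = max(pooled.items(), key=lambda kv: kv[1])
    return best if score >= threshold else original

winner = llm_voter(
    original="return a+b",
    candidates=[
        Candidate("model-A", "return a + b", 0.42),
        Candidate("model-B", "return a + b", 0.35),
        Candidate("model-C", "return sum((a, b))", 0.51),
    ],
)
print(winner)  # "return a + b": two agreeing models outvote one
```

The threshold acts as the refinement determination step: when no pooled candidate is confident enough, the ensemble abstains and returns the input unchanged, which counteracts the "aggressive" over-editing tendency described above.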