Large Language Model-Based Decomposition of Long Methods

doi:10.13328/j.cnki.jos.007329


Volume 36, Issue 6, 2025, pp. 2501-2514. DOI:10.13328/j.cnki.jos.007329


Author:
XU Zi-Mao
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
JIANG Yan-Jie
School of Computer Science, Peking University, Beijing 100091, China
ZHANG Yu-Xia
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
LIU Hui
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

CLC Number: TP311


Abstract:

Long methods, like other code smells, prevent software applications from reaching their optimal readability, reusability, and maintainability. Consequently, the automated detection and decomposition of long methods have been widely studied. Although existing approaches have greatly facilitated decomposition, their solutions often differ substantially from the optimal ones. To address this, this study investigates the automatable portion of a publicly available dataset of real-world long methods. Based on the findings, it proposes Lsplitter, a new approach based on large language models (LLMs) for automatically decomposing long methods. For a given long method, Lsplitter decomposes it into a series of shorter methods according to heuristic rules and LLMs. However, LLMs often split out similar methods. Lsplitter therefore applies a location-based algorithm to the LLM decomposition results, merging physically contiguous and highly similar methods into a longer method. Finally, the candidate results are ranked. Experiments are conducted on 2,849 long methods from real Java projects. The results show that Lsplitter improves the hit rate by 142% compared with traditional methods combined with a modularity matrix, and by 7.6% compared with methods based purely on LLMs.

Keywords: long method; refactoring; LLM; decomposition
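
The location-based merging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the threshold, or the data representation, so everything below is an assumption made for illustration: the `Fragment` type, Jaccard similarity over identifier tokens, and the default threshold of 0.5 are hypothetical and are not taken from Lsplitter.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A candidate extracted method: inclusive line range plus its source text."""
    start: int
    end: int
    text: str


def tokens(text: str) -> set[str]:
    # Identifier-level tokens; an assumed stand-in for the real similarity features.
    return set(re.findall(r"[A-Za-z_]\w*", text))


def jaccard(a: set[str], b: set[str]) -> float:
    # Jaccard similarity of two token sets (assumed metric, not from the paper).
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def merge_adjacent(fragments: list[Fragment], threshold: float = 0.5) -> list[Fragment]:
    """Merge fragments that are physically contiguous and highly similar."""
    merged: list[Fragment] = []
    for frag in sorted(fragments, key=lambda f: f.start):
        if merged:
            prev = merged[-1]
            contiguous = frag.start == prev.end + 1
            similar = jaccard(tokens(prev.text), tokens(frag.text)) >= threshold
            if contiguous and similar:
                # Replace the previous fragment with the combined, longer one.
                merged[-1] = Fragment(prev.start, frag.end, prev.text + "\n" + frag.text)
                continue
        merged.append(frag)
    return merged


parts = [
    Fragment(1, 3, "total = price * qty\ntotal = total + tax"),
    Fragment(4, 5, "total = total - tax\nprice = total"),
    Fragment(9, 12, "log.info(message)\nsend(report)"),
]
result = merge_adjacent(parts)
```

In this example the first two fragments are contiguous and share most of their identifiers, so they are merged into one fragment spanning lines 1 to 5, while the third fragment is left alone because it is neither adjacent nor similar. A stricter threshold would keep all three fragments separate.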

Citation:

XU Zi-Mao, JIANG Yan-Jie, ZHANG Yu-Xia, LIU Hui. Large Language Model-Based Decomposition of Long Methods. Journal of Software, 2025, 36(6): 2501-2514


History

Received: August 26, 2024
Revised: October 14, 2024
Online: December 10, 2024

Copyright: Institute of Software, Chinese Academy of Sciences