Automated Approach for Decomposing and Refactoring God Header Files

CLC number: TP311

Funding: National Key Research and Development Program of China (2023YFB4503803)

    Abstract:

    Many code files grow oversized and take on excessive responsibilities as software evolves, which severely harms software maintainability and comprehensibility. Developers often need to refactor such files by decomposing one large code file into several smaller ones. Existing studies mainly target class file decomposition and are not fully applicable to complex header files, because header file decomposition faces unique challenges: it must account for the build dependencies of the entire software project in order to reduce compilation cost, and it must ensure that the resulting sub-files are free of cyclic dependencies. To address these challenges, this study proposes HeaderSplit, an automated approach for decomposing and refactoring complex header files. HeaderSplit first constructs a code element graph for the complex header file that captures multiple types of code relationships, including co-usage relationships that reflect project build dependencies. It then applies node coarsening and a multi-view graph clustering algorithm to identify clusters of closely related code elements, and uses a heuristic cyclic dependency repair algorithm to produce a feasible decomposition plan. Once the user confirms the plan, HeaderSplit automatically performs the refactoring: it generates the new sub-header files and updates the include statements in all code files that directly or indirectly include the original header file. HeaderSplit is evaluated on both synthetic and real complex header files, with the following results. 1) HeaderSplit improves accuracy by 11.5% over existing methods and is more stable across software projects. 2) The sub-files produced by HeaderSplit have higher modularity and no cyclic dependencies, indicating better architectural design. 3) Decomposing complex header files with HeaderSplit can reduce the recompilation cost over their evolution history by 15%–60%. 4) HeaderSplit performs automated refactoring efficiently, completing the decomposition and refactoring of a header file in a software project with millions of lines of code within 5 minutes, which shows high practical value.
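The requirement that sub-files be free of cyclic dependencies can be illustrated with a small sketch. The abstract does not describe HeaderSplit's heuristic, so the code below is not the authors' algorithm: it swaps in a much simpler, well-known strategy (find strongly connected components of the cluster-level dependency graph with Tarjan's algorithm and merge every component that contains more than one cluster). The element names, the dependency list, and the functions `cluster_graph`, `strongly_connected_components`, and `merge_cycles` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

def cluster_graph(deps, cluster_of):
    """Lift element-level dependencies (src uses dst) to cluster-level edges."""
    graph = defaultdict(set)
    for c in set(cluster_of.values()):
        graph[c]  # make sure every cluster appears as a node
    for src, dst in deps:
        a, b = cluster_of[src], cluster_of[dst]
        if a != b:
            graph[a].add(b)
    return graph

def strongly_connected_components(graph):
    """Tarjan's algorithm (recursive); returns a list of sets of nodes."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in list(graph):
        if v not in index:
            visit(v)
    return sccs

def merge_cycles(deps, cluster_of):
    """Merge clusters that depend on each other cyclically (illustrative only)."""
    graph = cluster_graph(deps, cluster_of)
    merged = dict(cluster_of)
    for comp in strongly_connected_components(graph):
        if len(comp) > 1:
            target = min(comp)  # keep the smallest cluster id
            for elem, c in cluster_of.items():
                if c in comp:
                    merged[elem] = target
    return merged

# Hypothetical code elements of one header and a candidate clustering.
deps = [("Rect", "Point"), ("area", "Rect"), ("Logger", "Point")]
cluster_of = {"Point": 0, "area": 0, "Rect": 1, "Logger": 2}

merged = merge_cycles(deps, cluster_of)
print(merged)  # {'Point': 0, 'area': 0, 'Rect': 0, 'Logger': 2}
```

In this toy input, clusters 0 and 1 include each other (`Rect` uses `Point`, `area` uses `Rect`), so they are merged, while cluster 2 only depends on cluster 0 and stays separate. Merging whole components is the bluntest possible fix and can undo much of the decomposition; a repair heuristic that moves individual elements between clusters would preserve more of the split.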

Cite this article

王玥, 孙嘉旋, 邹艳珍, 李宇轩, 常文辉, 谢冰. Automated Approach for Decomposing and Refactoring God Header Files. Journal of Software (软件学报),,():1-19

History
  • Received: 2025-04-03
  • Last revised: 2025-06-05
  • Published online: 2026-01-07