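Fine-grained Code Changes Tracking Approach for Code Review
Authors: Wang Min (王敏), Pan Xinglu (潘兴禄), Zou Yanzhen (邹艳珍), Xie Bing (谢冰)
About the authors:

Wang Min (1994-), male, Ph.D.; research interests: software engineering, software reuse, and code review. Pan Xinglu (1997-), male, master's student, CCF student member; research interests: software engineering and software reuse. Zou Yanzhen (1976-), female, Ph.D., associate professor, CCF professional member; research interests: software engineering, software reuse, knowledge graphs, and intelligent software development. Xie Bing (1970-), male, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: software engineering, formal methods, software reuse, and intelligent software development.

Corresponding author:

Zou Yanzhen, E-mail: zouyz@pku.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (61972006)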


    Abstract:

    Code review is an important mechanism in the distributed, parallel development of modern software. Letting reviewers quickly inspect how a piece of source code has evolved helps them understand why the current change was made and whether it is necessary, which improves both the efficiency and the quality of the review. Existing studies provide commit history tracking methods and corresponding tools, but they cannot extract further review-relevant auxiliary information from the historical data. This study therefore proposes C2Tracker, a fine-grained code change tracking approach for code review. Given a fine-grained code change at the method (function) level, C2Tracker automatically traces the historical commits that modified that code. On top of these commits, it mines the code elements that were frequently changed together with the code under review, as well as the related change fragments, to help reviewers understand the current change and make decisions. The approach is evaluated on ten well-known open-source projects. The results show that C2Tracker reaches an accuracy of 97% in tracing historical commits, 95% in mining frequently co-changed code elements, and 97% in tracing related code change fragments. Compared with existing review practice, C2Tracker considerably improves review efficiency and quality in the specific cases studied, and in the vast majority of review cases reviewers rated it as providing "obvious help" or "great help".
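    The two steps named in the abstract, tracing the commits that touched a method and then mining the elements that frequently changed alongside it, can be illustrated with a minimal Python sketch. This is not the C2Tracker implementation: the `Snapshot` representation (method name mapped to body text), the function names, and the `min_confidence` threshold are all assumptions made for illustration, and the sketch ignores renames, moves, and the change-fragment matching that the paper reports.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical representation: one snapshot per commit, method name -> body text.
Snapshot = Dict[str, str]


def changed_methods(prev: Snapshot, curr: Snapshot) -> Set[str]:
    """Return methods that were added, removed, or edited between two snapshots."""
    names = prev.keys() | curr.keys()
    return {name for name in names if prev.get(name) != curr.get(name)}


def trace_history(
    snapshots: List[Tuple[str, Snapshot]], target: str
) -> List[Tuple[str, Set[str]]]:
    """Return (commit id, changed methods) for each commit that modified `target`.

    Snapshots must be ordered oldest to newest. As a simplification, every
    method in the first snapshot counts as changed (added) by that commit.
    """
    touched = []
    prev: Snapshot = {}
    for commit_id, curr in snapshots:
        changed = changed_methods(prev, curr)
        if target in changed:
            touched.append((commit_id, changed))
        prev = curr
    return touched


def co_changed(
    touched: List[Tuple[str, Set[str]]], target: str, min_confidence: float = 0.5
) -> List[Tuple[str, float]]:
    """Rank elements by the fraction of `target`'s commits in which they also changed."""
    if not touched:
        return []
    counts: Counter = Counter()
    for _commit_id, changed in touched:
        counts.update(changed - {target})
    total = len(touched)
    ranked = [
        (name, count / total)
        for name, count in counts.items()
        if count / total >= min_confidence
    ]
    # Highest confidence first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Toy history with made-up method names and bodies.
history = [
    ("c1", {"parse": "v1", "next": "v1"}),
    ("c2", {"parse": "v2", "next": "v2", "visit": "v1"}),
    ("c3", {"parse": "v2", "next": "v3", "visit": "v1"}),
    ("c4", {"parse": "v3", "next": "v4", "visit": "v1"}),
]
touched = trace_history(history, "parse")
commit_ids = [commit_id for commit_id, _ in touched]
suggestions = co_changed(touched, "parse", min_confidence=0.5)
```

    In this toy history, `parse` is modified in three commits and `next` changes in all three of them, so `next` is the element a reviewer would be pointed to when `parse` changes again. The confidence here is a plain co-occurrence ratio; the paper's own mining and ranking criteria may differ.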

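Cite this article

Wang M, Pan XL, Zou YZ, Xie B. Fine-grained code changes tracking approach for code review. Ruan Jian Xue Bao/Journal of Software, 2023, 34(10): 4705-4723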

History
  • Received: 2021-09-28
  • Last revised: 2022-01-28
  • Published online: 2023-04-04
  • Publication date: 2023-10-06