God Class Detection Approach Based on Graph Model and Isolation Forest
Author: Liu Yi, Wu Yijian, Peng Xin, Yan Yadong

CLC Number:

TP311

Fund Project:

National Key Research and Development Program of China (2016YFB1000801)

    Abstract:

    A God class is a class that carries too many tasks and responsibilities. It typically contains a large number of attributes and methods and has many dependencies on other classes in the system. The God class is a typical code smell that negatively affects software development and maintenance. In recent years, many studies have been devoted to detecting or refactoring God classes; however, the detection ability of existing methods is limited, and their precision is not high enough. This study proposes a God class detection approach based on a graph model and the isolation forest algorithm. The approach has two stages: graph structure information analysis and intra-class measurement evaluation. In the first stage, inter-class method call graphs and intra-class structure graphs are built, and the isolation forest algorithm is used to narrow the set of classes that need to be examined as God class candidates. In the second stage, the scale and architecture of the project are taken into account: the project-wide average of each God-class-related metric serves as the benchmark, an experiment determines the scale factors, and the product of each average and its scale factor is used as the detection threshold to obtain the final result. Experimental results on a code smell benchmark data set show that the proposed method improves precision and F1 by 25.8 and 33.39 percentage points, respectively, over an existing God class detection method, while maintaining high recall.
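
The second stage described in the abstract (threshold = project average × scale factor) can be illustrated with a minimal sketch. This is not the authors' implementation. The metric names (`num_methods`, `num_attributes`), the scale factor values, the class names, and the rule that a class must exceed every threshold are all assumptions made for illustration; the abstract does not specify them. The first stage is represented only by its output, a hypothetical set of candidate classes.

```python
from statistics import mean


def detect_god_classes(metrics, candidates, scale_factors):
    """Flag candidate classes whose metrics exceed project-relative thresholds.

    metrics:       {class_name: {metric_name: value}} for every class in the project
    candidates:    class names kept by the first (graph + isolation forest) stage
    scale_factors: {metric_name: factor}; the values here are hypothetical
    """
    # Benchmark: the project-wide average of each metric.
    benchmarks = {
        m: mean(cls[m] for cls in metrics.values()) for m in scale_factors
    }
    # Threshold: average multiplied by the metric's scale factor.
    thresholds = {m: benchmarks[m] * scale_factors[m] for m in scale_factors}
    # Assumption: a class is flagged only if it exceeds ALL thresholds.
    flagged = [
        name
        for name in sorted(candidates)
        if all(metrics[name][m] > thresholds[m] for m in thresholds)
    ]
    return flagged, thresholds


# Hypothetical project with four classes.
metrics = {
    "OrderManager": {"num_methods": 40, "num_attributes": 20},
    "Invoice": {"num_methods": 8, "num_attributes": 5},
    "Customer": {"num_methods": 6, "num_attributes": 4},
    "Util": {"num_methods": 26, "num_attributes": 3},
}
candidates = {"OrderManager", "Util"}  # assumed output of the first stage
scale_factors = {"num_methods": 1.5, "num_attributes": 2.0}  # assumed values

flagged, thresholds = detect_god_classes(metrics, candidates, scale_factors)
print(flagged)     # ['OrderManager']
print(thresholds)  # {'num_methods': 30.0, 'num_attributes': 16.0}
```

Because the thresholds are derived from the project's own averages, the same scale factors yield different absolute cutoffs for small and large projects, which is the motivation the abstract gives for taking project scale and architecture into account.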

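Citation

Liu Y, Wu YJ, Peng X, Yan YD. God class detection approach based on graph model and isolation forest. Ruan Jian Xue Bao/Journal of Software, 2022, 33(11):4046-4060.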

History
  • Received: November 19, 2020
  • Revised: January 29, 2021
  • Online: May 21, 2021
  • Published: November 06, 2022