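God Class Detection Approach Based on Deep Learning
About the authors:

Bu Yifan (卜依凡, born 1995), female, from Houma, Shanxi; master's student; main research area: software refactoring. Li Guangjie (李光杰, born 1980), female; Ph.D. candidate; main research areas: software refactoring and software quality. Liu Hui (刘辉, born 1978), male; Ph.D., professor, and doctoral supervisor; CCF senior member; main research areas: software refactoring, software quality, software testing, and intelligent software development environments.

Corresponding author:

Liu Hui, E-mail: liuhui08@bit.edu.cn

Fund project:

National Key Research and Development Program of China (2016YFB1000801); National Natural Science Foundation of China (61690205, 61772071, 61472034)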

    Abstract:

    A god class is a class that takes on multiple responsibilities that should be handled by several separate classes. God classes violate the basic idea of divide and conquer and the single responsibility principle, and they seriously harm the maintainability and understandability of software. They are nonetheless a common code smell, so the detection and refactoring of god classes has long been an active topic in refactoring research. This paper proposes a god class detection approach based on a deep neural network. The approach uses not only common software metrics but also the textual information in source code, with the aim of revealing the main role each class plays by mining text semantics. In addition, to supply the large volume of labeled data that supervised deep learning requires, an approach is proposed for constructing labeled data from open-source code. Finally, the proposed approach is evaluated on an open-source data set. The results show that it outperforms existing god class detection approaches, most notably in recall, which improves by 35.58%.
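For readers less familiar with the headline metric, recall here is the fraction of true god classes that a detector actually flags, and precision is the fraction of flagged classes that are true god classes. The following is a minimal sketch of that calculation, not the paper's evaluation code; the function name, class names, and labels are all hypothetical and chosen only for illustration.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for two sets of class names.

    actual:    classes that truly are god classes (hypothetical ground truth).
    predicted: classes a detector flags as god classes.
    """
    true_positives = len(actual & predicted)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Illustrative data only; these names and labels do not come from the paper.
actual_god_classes = {"OrderManager", "AppController", "Utils", "MainFrame"}
flagged = {"OrderManager", "Utils", "Logger"}

precision, recall = precision_recall(actual_god_classes, flagged)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy case two of the four true god classes are flagged, so recall is 0.5, and two of the three flagged classes are correct, so precision is 2/3.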

Cite this article

Bu YF, Liu H, Li GJ. God class detection approach based on deep learning. Journal of Software (软件学报), 2019,30(5):1359-1374 (in Chinese).

History
  • Received: 2018-08-31
  • Last revised: 2018-10-31
  • Published online: 2019-05-08