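Fine-grained Software Bug Location Approach at Method Level
Authors: Zhang Wen, Li Ziqiang, Du Yuhang, Yang Ye
Author biographies:

Zhang Wen (1981-), male, born in Honghu, Hubei, Ph.D., professor, doctoral supervisor, CCF professional member; main research areas: software engineering and data mining. Li Ziqiang (1994-), male, master's student; main research areas: software engineering and data mining. Du Yuhang (1994-), male, master's student; main research areas: software engineering and data mining. Yang Ye (1977-), female, Ph.D., professor, doctoral supervisor; main research area: software engineering.

Corresponding author:

Zhang Wen, E-mail: zhangwen@mail.buct.edu.cn

Fund project:

National Natural Science Foundation of China (61379046, 61432001); Science and Technology Project of Xi'an Municipality (2016CXWL21)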

    Abstract:

    Once a bug report in the tracking system has been assigned to a developer, the developer must locate the bug in the source code based on the report and make the corresponding code changes to fix it. Bug localization takes up a large share of the developer's time during bug resolution. This study proposes MethodLocator, a fine-grained bug localization approach at the method level, to improve the efficiency of bug resolution. First, MethodLocator represents the bug report and each source code method body as vectors by combining word vectors (word2vec) with TF-IDF. Second, it augments each method body based on the similarities among the method bodies in the source code files. Finally, it computes the cosine similarity between the bug report and each augmented method body and ranks the methods to locate those that must be changed to fix the bug. Experimental results on four open source software projects, ArgoUML, Ant, Maven, and Kylin, show that MethodLocator outperforms existing bug localization approaches and can effectively locate bugs at the method level of the source code.
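The three steps named in the abstract (vector representation, method body augmentation, cosine ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 2-d `EMBEDDINGS` table stands in for trained word2vec vectors, the smoothed IDF formula, the `threshold` and `damping` parameters, and the rule of borrowing damped term counts from similar methods in the same file are all assumptions made for illustration, and the method names and tokens are hypothetical.

```python
import math
from collections import Counter

# Toy 2-d vectors standing in for trained word2vec embeddings (hypothetical values).
EMBEDDINGS = {
    "parse": (1.0, 0.0), "xml": (0.9, 0.1), "file": (0.8, 0.2), "read": (0.7, 0.3),
    "draw": (0.0, 1.0), "diagram": (0.1, 0.9), "canvas": (0.2, 0.8),
}


def idf_table(docs):
    """Smoothed inverse document frequency over a list of token lists (assumed formula)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def embed(term_freqs, idf):
    """TF-IDF-weighted average of word vectors; tokens without a vector are skipped."""
    dim = len(next(iter(EMBEDDINGS.values())))
    total, weight_sum = [0.0] * dim, 0.0
    for tok, tf in term_freqs.items():
        if tok not in EMBEDDINGS:
            continue
        w = tf * idf.get(tok, 1.0)  # assumption: tokens unseen in the corpus get IDF 1.0
        weight_sum += w
        for i, x in enumerate(EMBEDDINGS[tok]):
            total[i] += w * x
    return [x / weight_sum for x in total] if weight_sum else total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment(methods, idf, threshold=0.8, damping=0.5):
    """Expand each method with damped term counts of similar methods in the same file."""
    base = {name: Counter(toks) for name, (_, toks) in methods.items()}
    vecs = {name: embed(tf, idf) for name, tf in base.items()}
    augmented = {}
    for name, (path, _) in methods.items():
        counts = dict(base[name])
        for other, (other_path, _) in methods.items():
            if other == name or other_path != path:
                continue
            sim = cosine(vecs[name], vecs[other])
            if sim >= threshold:
                for tok, tf in base[other].items():
                    counts[tok] = counts.get(tok, 0.0) + damping * sim * tf
        augmented[name] = counts
    return augmented


def rank_methods(report_tokens, methods):
    """Rank methods by cosine similarity between the report and augmented method vectors."""
    idf = idf_table([toks for _, toks in methods.values()])
    query = embed(Counter(report_tokens), idf)
    scores = {name: cosine(query, embed(tf, idf))
              for name, tf in augment(methods, idf).items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical methods: name -> (source file, tokens of the method body).
methods = {
    "XmlLoader.parse": ("XmlLoader.java", ["parse", "xml", "file"]),
    "XmlLoader.read": ("XmlLoader.java", ["read", "file"]),
    "Canvas.draw": ("Canvas.java", ["draw", "diagram", "canvas"]),
}
report = ["xml", "file", "parse", "error"]  # "error" has no vector and is skipped

ranking = rank_methods(report, methods)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the two `XmlLoader` methods borrow terms from each other because they share a file and exceed the similarity threshold, while `Canvas.draw` is left unchanged; the damping factor keeps a method's own terms dominant so that augmentation does not make neighboring methods indistinguishable.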

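Cite this article

Zhang W, Li ZQ, Du YH, Yang Y. Fine-grained software bug location approach at method level. Ruan Jian Xue Bao/Journal of Software, 2019,30(2):195-210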

History
  • Received: 2017-09-06
  • Last revised: 2017-10-31
  • Published online: 2018-03-14