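Crowdsourcing Software Development Oriented Fault Localization
Authors: Li Leping (李乐平), Zhang Yuxia (张宇霞), Liu Hui (刘辉)
About the authors: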

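Li Leping (李乐平, b. 1993), male, Ph.D., CCF student member; research interests: software fault localization and repair. Zhang Yuxia (张宇霞, b. 1992), female, Ph.D., assistant professor, CCF professional member; research interests: software repository mining and open-source software ecosystems. Liu Hui (刘辉, b. 1978), male, Ph.D., professor, doctoral supervisor, CCF distinguished member; research interests: software refactoring, software evolution and maintenance, and software testing.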

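Corresponding author:

Zhang Yuxia (张宇霞), yuxiazh@bit.edu.cn

Chinese Library Classification (CLC) number:

TP311

Funding:

National Natural Science Foundation of China (61690205, 61772071)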


    Abstract:

    In software development, fault localization is a necessary precondition for repairing defects. To this end, researchers have proposed a series of automated fault localization (AFL) methods. These methods exploit information such as the coverage paths and execution results of test cases, and they substantially reduce the difficulty of locating faulty code. In competitive crowdsourcing software development, a single task often receives multiple competitive implementations (solutions). This study proposes a fault localization approach designed specifically for crowdsourcing software engineering. The key idea is to treat the competitive implementations as reference programs when locating faulty statements. For each statement in the buggy program, the approach searches the reference programs for reference statements and uses them to calculate the statement's suspicious score. Given a buggy program and its test cases, the approach first runs the test cases and computes an initial suspicious score for each statement with a widely used spectrum-based fault localization (SBFL) approach. It then adjusts each score according to the similarity between the statement and its reference statements. The approach is evaluated on 118 real-world buggy programs that are accompanied by competitive implementations. The results suggest that, compared with SBFL approaches, it reduces the cost of fault localization by more than 25%.
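The two-step scoring described in the abstract can be sketched as follows. The abstract names neither the SBFL formula nor the adjustment rule, so everything concrete here is an assumption made for illustration: Ochiai stands in as one widely used SBFL formula, `difflib.SequenceMatcher` stands in for statement similarity, and the multiplicative discount in `adjusted_score` is a hypothetical rule, not the authors' method. The spectrum counts and statements are invented toy data.

```python
import math
from difflib import SequenceMatcher


def ochiai(ef: int, ep: int, nf: int) -> float:
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep)).

    ef / ep: number of failing / passing tests that execute the statement.
    nf: number of failing tests that do not execute it.
    """
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0


def best_similarity(stmt: str, reference_stmts: list[str]) -> float:
    """Highest textual similarity (0..1) between stmt and any reference statement."""
    return max(
        (SequenceMatcher(None, stmt, ref).ratio() for ref in reference_stmts),
        default=0.0,
    )


def adjusted_score(initial: float, similarity: float) -> float:
    """Hypothetical rule (assumption): a statement that closely matches a
    statement in a competing implementation is treated as less suspicious."""
    return initial * (1.0 - similarity)


# Toy buggy program meant to compute max(a, b); values are (ef, ep, nf).
spectrum = {
    "m = a": (2, 3, 0),
    "if b < m:": (2, 3, 0),  # seeded fault: the condition should be "b > m"
    "m = b": (2, 1, 0),
    "return m": (2, 3, 0),
}
# Statements taken from a (hypothetical) competing implementation.
reference_stmts = ["m = a", "if b > m:", "m = b", "return m"]

ranking = sorted(
    (
        (stmt, adjusted_score(ochiai(*counts), best_similarity(stmt, reference_stmts)))
        for stmt, counts in spectrum.items()
    ),
    key=lambda item: item[1],
    reverse=True,
)
for stmt, score in ranking:
    print(f"{score:.3f}  {stmt}")
```

On this toy data, plain Ochiai ranks the correct statement `m = b` highest, whereas the adjusted ranking puts the seeded faulty condition first, because every other statement has an exact counterpart in the reference implementation. A real setting would need a more robust notion of statement similarity than raw text matching.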

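Cite this article

Li Leping, Zhang Yuxia, Liu Hui. Crowdsourcing Software Development Oriented Fault Localization. Journal of Software (软件学报), 2023, 34(6): 2690-2707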

History
  • Received: 2020-12-07
  • Last revised: 2021-03-25
  • Published online: 2022-11-24
  • Publication date: 2023-06-06