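Test Case Prioritization Technique in Continuous Integration Based on Reinforcement Learning
Authors: Zhao Yifan, Hao Dan
About the authors:

Zhao Yifan (born 1999), male, Ph.D. student, CCF student member; main research interest: software testing. Hao Dan (born 1979), female, Ph.D., associate professor, doctoral supervisor, CCF distinguished member; main research interest: software testing.

Corresponding author:

Hao Dan, haodan@pku.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (61872008)
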
    Abstract:

    As software delivery increasingly emphasizes speed and reliability, continuous integration (CI) has attracted growing attention. Developers continually integrate working copies into the mainline to evolve the software, and each integration runs an automated build and tests to verify whether the update introduces faults. However, as software grows, test suites contain more and more test cases, and the coverage and fault detection ability of test cases change across CI cycles, so traditional test case prioritization techniques are hard to apply. Techniques based on reinforcement learning can adjust their prioritization strategy dynamically according to test feedback, but the existing ones do not consider the information in the test suite as a whole when prioritizing, which limits their effectiveness. This study proposes a new reinforcement-learning-based test case prioritization method for CI, called the pointer ranking method. The method takes features such as the history information of test cases as input. In each CI cycle, the agent uses a pointer attention mechanism to compute how much attention to pay to every candidate test case and derives the prioritization result from these attention values. After test execution, it obtains the direction of its policy update from the feedback. By repeating the cycle of prioritization, test execution, and feedback, it keeps adjusting its strategy and eventually achieves good prioritization performance. The study evaluates the method on five large-scale datasets and examines the impact of history length on its performance, its effectiveness on datasets that contain only regression test cases, and its execution efficiency. It reaches the following conclusions. (1) Compared with existing techniques, the pointer ranking method can adjust its strategy as the software evolves and effectively improves the fault detection ability of the test sequence in CI. (2) The pointer ranking method is robust to the length of the input history; a small amount of history information is enough for it to reach its best performance. (3) The pointer ranking method handles both regression test cases and newly added test cases well. (4) The pointer ranking method has a small time overhead. Given its better and more stable prioritization performance, it can be considered a very competitive method.
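The ranking step described in the abstract (score every remaining candidate with attention, pick one, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pointer_rank`, the additive attention form, the weight matrices `w_ref`, `w_query`, `v`, the zero start query, and the rule that the last selected test becomes the next query are all assumptions made for illustration. The feedback-driven policy update is omitted.

```python
import math
import random


def pointer_rank(features, w_ref, w_query, v, sample=False, rng=None):
    """Order test cases by repeatedly "pointing" at one remaining candidate.

    features: list of per-test feature vectors (e.g. recent verdicts, duration).
    w_ref, w_query: d x h weight matrices (lists of rows); v: length-h vector.
    All weights are hypothetical stand-ins for what a trained agent would hold.
    """
    rng = rng or random.Random(0)
    dim = len(features[0])
    query = [0.0] * dim                      # assumed start state: zero query
    remaining = list(range(len(features)))
    order = []
    while remaining:
        # Additive attention score: u_i = v . tanh(W_ref^T x_i + W_query^T q)
        scores = []
        for i in remaining:
            hidden = [
                math.tanh(
                    sum(features[i][k] * w_ref[k][j] for k in range(dim))
                    + sum(query[k] * w_query[k][j] for k in range(dim))
                )
                for j in range(len(v))
            ]
            scores.append(sum(vj * hj for vj, hj in zip(v, hidden)))
        # Softmax over the remaining candidates only; chosen tests are masked
        # out simply by no longer being in `remaining`.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        if sample:
            # Stochastic choice, as an exploring agent might do during training
            pick = rng.choices(range(len(remaining)), weights=probs)[0]
        else:
            # Greedy choice: the candidate with the highest attention
            pick = max(range(len(remaining)), key=probs.__getitem__)
        chosen = remaining.pop(pick)
        order.append(chosen)
        query = features[chosen]             # simplification: last pick is next query
    return order


# Toy usage: with these weights the score reduces to tanh(first feature),
# so tests are ordered by their first feature, highest first.
features = [[0.0, 0.2], [1.0, 0.1], [0.5, 0.9]]
w_ref = [[1.0, 0.0], [0.0, 1.0]]
w_query = [[0.0, 0.0], [0.0, 0.0]]
v = [1.0, 0.0]
order = pointer_rank(features, w_ref, w_query, v)
```

Because each step scores all remaining candidates together, the choice for one position depends on the whole candidate set, which is the property the abstract contrasts with earlier reinforcement learning approaches.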

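Cite this article

Zhao Yifan, Hao Dan. Test case prioritization technique in continuous integration based on reinforcement learning. Ruan Jian Xue Bao/Journal of Software, 2023, 34(6): 2708-2726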

History
  • Received: 2021-07-13
  • Last revised: 2021-09-08
  • Published online: 2022-11-16
  • Published: 2023-06-06