Compiler Fuzzing Test Case Generation with Feed-forward Neural Network
Authors: 徐浩然, 王勇军, 黄志坚, 解培岱, 范书珲
Author biographies:

徐浩然 (b. 1993), male, Ph.D. candidate, whose research interests include software security; 解培岱 (b. 1985), male, Ph.D., associate professor, whose research interests include software security and malware detection; 王勇军 (b. 1971), male, Ph.D., professor, doctoral supervisor, CCF senior member, whose research interests include software security analysis and network threat behavior analysis and detection; 范书珲 (b. 1993), female, Ph.D. candidate, whose research interests include blockchain security and software security; 黄志坚 (b. 1989), male, Ph.D., engineer, whose research interests include software testing and network security.

Corresponding author:

王勇军, E-mail: wwyyjj1971@126.com

CLC number:

TP311

Funding:

National Natural Science Foundation of China (61472439); National Key Research and Development Program of China (2018YFB0204301)


Abstract:

Compiler fuzzing is one of the commonly used techniques for testing the functionality and security of compilers. A fuzzer produces syntactically valid test cases so that the deeper logic of the compiler can be exercised. Recently, deep learning models based on recurrent neural networks have been introduced into the generation of compiler fuzzing test cases. To address the low syntactic validity rate and low generation efficiency of the test cases produced by existing methods, this study proposes a compiler fuzzing test case generation method based on feed-forward neural networks and implements the prototype tool FAIR. Unlike existing approaches that learn from token sequences, FAIR extracts code fragments from abstract syntax trees and uses a self-attention-based feed-forward neural network to capture the syntactic associations between code fragments. By learning a generative model of the programming language, FAIR automatically produces diverse test cases. Experimental results show that FAIR outperforms state-of-the-art methods of the same kind in both the parsing pass rate of the generated test cases and the generation efficiency. The method markedly improves the ability to detect compiler defects: it has found 20 bugs in GCC and LLVM. It is also easy to port; after a simple port, FAIR-JS has detected two bugs in a JavaScript engine.
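The abstract describes the pipeline only at a high level. As an illustration of the code-fragment representation, the sketch below (not the authors' code; pycparser and the depth-1 fragment definition are illustrative assumptions) parses a small C function and emits one fragment per AST node, consisting of the node's type together with the types of its direct children.

```python
# Illustrative sketch only: FAIR's actual fragment definition and tooling are
# not given in the abstract. Here a "fragment" is an AST node type plus the
# types of its immediate children, extracted with pycparser.
from pycparser import c_parser

def fragments(node):
    """Yield a depth-1 fragment for every node of a pycparser AST."""
    children = [child for _, child in node.children()]
    yield (type(node).__name__, tuple(type(c).__name__ for c in children))
    for child in children:
        yield from fragments(child)

source = "int add(int a, int b) { return a + b; }"   # pycparser expects preprocessed C
ast = c_parser.CParser().parse(source)
for frag in fragments(ast):
    print(frag)    # e.g. ('FuncDef', ('Decl', 'Compound'))
```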

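The self-attention-based feed-forward network itself is only named in the abstract. The following is a minimal hypothetical model in PyTorch, not the published architecture (all layer sizes are made-up assumptions), showing how a causal self-attention layer followed by a feed-forward block can predict the next code fragment from the fragments seen so far, i.e. play the generative-model role the abstract refers to.

```python
# Hypothetical next-fragment predictor; dimensions and layout are assumptions,
# not the architecture published in the paper.
import torch
import torch.nn as nn

class FragmentLM(nn.Module):
    def __init__(self, vocab_size, d_model=256, n_heads=4, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)      # fragment embeddings
        self.pos = nn.Embedding(max_len, d_model)         # learned positions
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.out = nn.Linear(d_model, vocab_size)         # next-fragment logits

    def forward(self, ids):                                # ids: (batch, seq)
        seq = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(seq, device=ids.device))
        causal = torch.triu(torch.ones(seq, seq, device=ids.device),
                            diagonal=1).bool()             # hide future fragments
        a, _ = self.attn(x, x, x, attn_mask=causal)
        x = self.norm1(x + a)
        x = self.norm2(x + self.ff(x))
        return self.out(x)

# Training would minimize cross-entropy on the next fragment; generation samples
# fragments autoregressively and assembles them back into a C test program.
model = FragmentLM(vocab_size=10_000)
logits = model(torch.randint(0, 10_000, (2, 16)))
print(logits.shape)                                        # (2, 16, 10000)
```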
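How the generated programs are turned into bug reports is not stated in the abstract. A common oracle in compiler fuzzing, sketched below purely as an assumption, is differential testing: compile each test case with GCC and Clang at several optimization levels, flag internal compiler errors, and flag disagreements between the resulting binaries.

```python
# Assumed differential-testing harness; the paper's actual oracle is not
# described in the abstract. Requires gcc and clang on PATH.
import os, subprocess, sys, tempfile

COMPILERS = [("gcc", "-O0"), ("gcc", "-O2"), ("clang", "-O0"), ("clang", "-O2")]

def run(cmd, timeout=10):
    try:
        return subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None

def check(c_file):
    outputs = {}
    for cc, opt in COMPILERS:
        exe = tempfile.mktemp()
        build = run([cc, opt, "-w", c_file, "-o", exe])
        if build is None:
            continue
        if b"internal compiler error" in build.stderr.lower():
            print(f"compiler crash: {cc} {opt} on {c_file}")
            continue
        if build.returncode != 0:
            continue                      # program rejected; not a compiler bug
        result = run([exe])
        outputs[(cc, opt)] = None if result is None else result.stdout
        os.remove(exe)
    if len(set(outputs.values())) > 1:    # compiled binaries disagree
        print(f"possible miscompilation on {c_file}: {outputs}")

if __name__ == "__main__":
    check(sys.argv[1])
```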
Cite this article:

徐浩然, 王勇军, 黄志坚, 解培岱, 范书珲. Compiler fuzzing test case generation with feed-forward neural network. Ruan Jian Xue Bao/Journal of Software, 2022, 33(6): 1996-2011 (in Chinese with English abstract).

History:
  • Received: 2021-09-05
  • Revised: 2021-10-15
  • Published online: 2022-01-28
  • Published: 2022-06-06