Robustness Verification Method for Artificial Intelligence Systems Based on Source Code Processing
Authors: YANG Yanjing, MAO Runfeng, TAN Rui, SHEN Haifeng, RONG Guoping
Author biographies:

YANG Yanjing (1999-), male, Ph.D. candidate, CCF student member. His research interests include robustness of AI systems and AI system security. MAO Runfeng (1996-), male, Ph.D. candidate. His research interests include DevSecOps, software security, and vulnerability prediction. TAN Rui (2001-), female, master's student. Her research interests include machine learning and software defect prediction. SHEN Haifeng (1971-), male, Ph.D., professor, doctoral supervisor. His research interests include software engineering, human-computer interaction, human-centered AI, and simulation and visualization. RONG Guoping (1977-), male, Ph.D., associate researcher, CCF professional member. His research interests include software process, DevOps, and AIOps.

Corresponding author:

MAO Runfeng, mrf@smail.nju.edu.cn

Funding:

National Natural Science Foundation of China (62072227, 62202219); National Key Research and Development Program of China (2019YFE0105500); Key Research and Development Program of Jiangsu Province (BE2021002-2); Innovation Project of the State Key Laboratory for Novel Software Technology at Nanjing University (ZZKT2022A25); Overseas Open Fund (KFKT2022A09)



    Abstract:

    The development of artificial intelligence (AI) technology provides strong support for AI systems based on source code processing. Compared with natural language, source code is special in its semantic space, so machine learning tasks related to source code processing usually employ abstract syntax trees, data dependency graphs, and control flow graphs to obtain the structured information of code and extract features. Through in-depth analysis of source code structures and flexible application of classifiers, existing studies have achieved excellent results in experimental scenarios. However, in real application scenarios where source code structures are more complex, most AI systems related to source code processing suffer a sharp drop in performance and are difficult to deploy in industry, which has prompted practitioners to consider the robustness of AI systems. Since AI-based systems are generally data-driven black-box systems, it is difficult to measure their robustness directly. With the rise of adversarial attack techniques, scholars in natural language processing have designed adversarial attacks for different tasks to verify the robustness of models and have conducted large-scale empirical studies. To address the instability of AI systems based on source code processing in complex code scenarios, this study proposes robustness verification by Metropolis-Hastings attack method (RVMHM). First, a code preprocessing tool based on abstract syntax trees extracts the variable pool of the model; then, the MHM source code attack algorithm substitutes variables to perturb the model's predictions. By interfering with the interaction between the data and the model, RVMHM measures the robustness of an AI system from the change in robustness verification metrics before and after the attack. Taking vulnerability prediction as a typical binary classification scenario of source code processing, this study verifies the robustness of 12 groups of AI vulnerability prediction models on the datasets of three open-source projects to illustrate the effectiveness of RVMHM for the robustness verification of AI systems based on source code processing.
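
    To make the three steps of RVMHM concrete (extracting a variable pool from the abstract syntax tree, MHM-style variable substitution, and comparing a verification metric before and after the attack), the following is a minimal sketch in a toy Python setting. It is illustrative only: the paper does not publish this API, so every name here (extract_variable_pool, rename_variable, mhm_attack, robustness_delta, and a model object with predict/predict_proba methods) is a hypothetical stand-in, and Python's built-in ast module stands in for the paper's AST-based preprocessing tool.

    # Hypothetical sketch of the RVMHM workflow; none of these names come from the paper.
    import ast
    import random
    import re

    def extract_variable_pool(source: str) -> list[str]:
        """Step 1: collect identifiers from the abstract syntax tree."""
        tree = ast.parse(source)
        return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})

    def rename_variable(source: str, old: str, new: str) -> str:
        """Naive token-level renaming; a real tool would rewrite the AST."""
        return re.sub(rf"\b{re.escape(old)}\b", new, source)

    def mhm_attack(model, source: str, label: int,
                   candidates: list[str], n_iter: int = 100) -> str:
        """Step 2: Metropolis-Hastings-style variable substitution,
        loosely following the MHM attack of Zhang et al. Each iteration
        proposes one random renaming and accepts it with a probability
        that favors lowering the model's confidence in the true label."""
        current = source
        p_cur = model.predict_proba(current)[label]
        for _ in range(n_iter):
            pool = extract_variable_pool(current)
            if not pool:
                break
            proposal = rename_variable(current, random.choice(pool),
                                       random.choice(candidates))
            p_new = model.predict_proba(proposal)[label]
            # MH acceptance ratio: proposals that reduce the true-label
            # probability are always accepted, others only sometimes.
            alpha = min(1.0, (1.0 - p_new) / max(1e-12, 1.0 - p_cur))
            if random.random() < alpha:
                current, p_cur = proposal, p_new
            if model.predict(current) != label:
                break  # prediction flipped: adversarial example found
        return current

    def robustness_delta(model, dataset, candidates) -> float:
        """Step 3: robustness indicator as the metric drop after the attack."""
        def acc(samples):
            return sum(model.predict(s) == y for s, y in samples) / len(samples)
        attacked = [(mhm_attack(model, s, y, candidates), y) for s, y in dataset]
        return acc(dataset) - acc(attacked)

    # Usage (hypothetical): delta = robustness_delta(model,
    #     [(code_str, 1), ...], candidates=["tmp_a", "idx_b"])

    In the paper's setting, the binary task is vulnerability prediction, and the robustness verification metrics go beyond plain accuracy; the accuracy drop above is only the simplest instance of the before/after comparison that RVMHM performs.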

Cite this article:

Yang YJ, Mao RF, Tan R, Shen HF, Rong GP. Robustness verification method for artificial intelligence systems based on source code processing. Ruan Jian Xue Bao/Journal of Software, 2023, 34(9): 4018-4036 (in Chinese with English abstract).

History:
  • Received: 2022-09-05
  • Revised: 2022-10-13
  • Available online: 2023-01-13
  • Published: 2023-09-06