Multi-class Vulnerability Detection with Structure-aware Graph Neural Network

doi:10.13328/j.cnki.jos.007375

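Author:

CAO Si-Cong (曹思聪)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China
SUN Xiao-Bing (孙小兵)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China
BO Li-Li (薄莉莉)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China
WU Xiao-Xue (吴潇雪)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China
LI Bin (李斌)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China
CHEN Ting (陈厅)
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
LUO Xia-Pu (罗夏朴)
Department of Computing, Hong Kong Polytechnic University, Hong Kong 999077, China
ZHANG Tao (张涛)
Faculty of Information Technology, Macau University of Science and Technology, Macao 999078, China
LIU Wei (刘维)
College of Information Engineering, Yangzhou University, Yangzhou 225127, China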
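CLC number: TP311
Fund Project: National Natural Science Foundation of China (62202414); Jiangsu Province "Six Talent Peaks" High-level Talent Project (RJFW-053); Jiangsu Province "333" Project for Young and Middle-aged Science and Technology Leaders; Open Fund of the Yunnan Key Laboratory of Software Engineering (2023SE201)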



Abstract:

Software vulnerabilities threaten the security of real-world systems. In recent years, learning-based vulnerability detection methods, especially those based on deep learning, have been widely studied because they can mine implicit vulnerability features from large numbers of vulnerability samples. However, because features differ across vulnerability types and the data distribution is imbalanced, existing deep learning-based methods struggle to identify the specific type of a vulnerability accurately. To address this issue, this study proposes MulVD, a deep learning-based multi-class vulnerability detection method. MulVD builds a novel structure-aware graph neural network (SA-GNN) that adaptively extracts local, representative vulnerability patterns for different vulnerability types and rebalances the data distribution without introducing noise. The method is evaluated on both binary and multi-class vulnerability detection tasks. Experimental results show that MulVD significantly improves on the performance of existing deep learning-based vulnerability detection techniques.

Key words: vulnerability detection; attention mechanism; graph neural network (GNN); multi-class classification
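The imbalance problem raised in the abstract can be made concrete with a minimal sketch. This is not taken from the paper and does not reproduce MulVD or its evaluation: the labels `frequent` and `rare`, the 90/10 split, and the helper names are all made up for illustration. The sketch only shows why overall accuracy can look good on an imbalanced multi-class task while a rare vulnerability type is never identified, which is what a macro-averaged F1 score exposes.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels: 90 samples of a frequent type, 10 of a rare one.
y_true = ["frequent"] * 90 + ["rare"] * 10
# A degenerate classifier that always predicts the frequent type.
y_pred = ["frequent"] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("per-class F1:", per_class_f1(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

In this toy setting the always-`frequent` classifier is right on 90 of 100 samples, yet its F1 on the `rare` class is zero, so the macro average drops to roughly half. Per-class and macro-averaged metrics are therefore a common choice when judging multi-class detectors on skewed data.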


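Cite this article

CAO Si-Cong, SUN Xiao-Bing, BO Li-Li, WU Xiao-Xue, LI Bin, CHEN Ting, LUO Xia-Pu, ZHANG Tao, LIU Wei. Multi-class Vulnerability Detection with Structure-aware Graph Neural Network. Ruan Jian Xue Bao/Journal of Software, , (): 1-17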

Article metrics

Views: 746
Downloads: 143
HTML views: 0
Citations: 0

History

Received: 2023-07-03
Last revised: 2023-11-03
Published online: 2025-04-23
