Java程序资源泄露缺陷检测: 传统模型和语言模型的有效性分析
作者: 刘天阳, 叶嘉威, 计卫星, 刘辉

通讯作者: 计卫星, E-mail: jwx@bnu.edu.cn

中图分类号: TP311

基金项目: 国家自然科学基金重点项目(62232003)


Detection of Resource Leaks in Java Programs: Effectiveness Analysis of Traditional Models and Language Models
Author: LIU Tian-Yang, YE Jia-Wei, JI Wei-Xing, LIU Hui
    摘要:

    资源泄露是由于有限的系统资源未能及时正确关闭所导致的缺陷, 广泛存在于各种语言编写的程序中, 且具有一定的隐蔽性. 传统的缺陷检测方法通常基于规则和启发式搜索预测软件中的资源泄露. 近年来, 基于深度学习的缺陷检测方法通过不同的代码表征形式, 并使用循环神经网络、图神经网络等技术捕获代码中的语义信息. 最近的研究显示, 语言模型在代码理解和生成等任务中表现出色. 然而, 语言模型在资源泄露检测这一特定任务上的优势和局限性尚未得到充分评估. 本文研究基于传统模型、小模型和大模型的检测方法在资源泄露检测任务中的有效性, 并探究小样本学习、微调以及静态分析与大模型相结合等多种改进方式. 具体而言, 以JLeaks和DroidLeaks数据集为实验对象, 从资源泄露根本原因、资源种类、代码复杂度等多个维度分析不同模型的表现. 实验结果表明, 微调技术能够显著提升大模型在资源泄露检测领域的检测效果. 然而, 大部分模型在识别第三方库引发的资源泄露上仍需改进. 此外, 代码复杂度对基于传统模型的检测方法的影响更大.

    Abstract:

    Resource leaks are defects caused by the failure to close limited system resources in a timely and correct manner. They are widely present in programs written in various languages and can easily go unnoticed. Traditional defect detection methods usually predict resource leaks in software based on rules and heuristic search. In recent years, deep learning-based defect detection methods have captured semantic information in code through different code representations and techniques such as recurrent neural networks and graph neural networks. Recent studies show that language models perform remarkably well in tasks such as code understanding and generation. However, the advantages and limitations of large language models (LLMs) in the specific task of resource leak detection have not been fully evaluated. This study evaluates the effectiveness of detection methods based on traditional models, small models, and LLMs on the resource leak detection task, and explores several improvement strategies, including few-shot learning, fine-tuning, and the combination of static analysis with LLMs. Specifically, taking the JLeaks and DroidLeaks datasets as experimental subjects, the performance of different models is analyzed along multiple dimensions, such as the root causes of resource leaks, resource types, and code complexity. The experimental results show that fine-tuning significantly improves the detection performance of LLMs on resource leak detection. However, most models still struggle to identify resource leaks caused by third-party libraries. In addition, code complexity has a greater impact on detection methods based on traditional models.
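
    To make the defect class concrete, the following is a minimal, illustrative Java sketch (not drawn from JLeaks or DroidLeaks; the class and method names are hypothetical) showing a typical resource leak on an exception path and its try-with-resources fix:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LeakExample {
        // Leaky version: if readLine() throws, close() is never reached,
        // so the underlying file handle (a limited system resource) leaks.
        static String firstLineLeaky(String path) throws IOException {
            BufferedReader reader = new BufferedReader(new FileReader(path));
            String line = reader.readLine(); // may throw before close()
            reader.close();                  // skipped on the exception path
            return line;
        }

        // Fixed version: try-with-resources closes the reader on all paths,
        // including exceptional ones.
        static String firstLineFixed(String path) throws IOException {
            try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
                return reader.readLine();
            }
        }
    }

    Patterns like firstLineLeaky, where a close() call exists but is not guaranteed on every control-flow path, are the kind of defect the evaluated detectors are expected to flag.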

    参考文献
    [1] Ghanavati M, Costa D, Seboek J, Lo D, Andrzejak A. Memory and resource leak defects and their repairs in Java projects. Empirical Software Engineering, 2020, 25(1): 678–718.
    [2] Liu TY, Ji WX, Dong XH, Yao WH, Wang YZ, Liu H, Peng HY, Wang YX. JLeaks: A featured resource leak repository collected from hundreds of open-source Java projects. In: Proc. of the 46th IEEE/ACM Int’l Conf. on Software Engineering. Lisbon: IEEE, 2024. 1–13.
    [3] Kellogg M, Shadab N, Sridharan M, Ernst MD. Lightweight and modular resource leak verification. In: Proc. of the 29th ACM Joint Meeting on European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. Athens: ACM, 2021. 181–192. [doi: 10.1145/3468264.3468576]
    [4] Utture A, Palsberg J. From leaks to fixes: Automated repairs for resource leak warnings. In: Proc. of the 31st ACM Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. San Francisco: ACM, 2023. 159–171. [doi: 10.1145/3611643.3616267]
    [5] Wang C, Liu JN, Peng X, Liu Y, Lou YL. Boosting static resource leak detection via LLM-based resource-oriented intention inference. arXiv:2311.04448, 2023.
    [6] Lo D, Nagappan N, Zimmermann T. How practitioners perceive the relevance of software engineering research. In: Proc. of the 10th Joint Meeting on Foundations of Software Engineering. Bergamo: ACM, 2015. 415–425. [doi: 10.1145/2786805.2786809]
    [7] Wang C, Lou YL, Peng X, Liu JN, Zou BH. Mining resource-operation knowledge to support resource leak detection. In: Proc. of the 31st ACM Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. San Francisco: ACM, 2023. 986–998. [doi: 10.1145/3611643.361631]
    [8] Li W, Cai HP, Sui YL, Manz D. PCA: Memory leak detection using partial call-path analysis. In: Proc. of the 28th ACM Joint Meeting on European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. New York: ACM, 2020. 1621–1625. [doi: 10.1145/3368089.3417923]
    [9] PMD source code analyzer. 2002. https://pmd.github.io
    [10] Wang S, Chollak D, Movshovitz-Attias D, Tan L. Bugram: Bug detection with n-gram language models. In: Proc. of the 31st IEEE/ACM Int’l Conf. on Automated Software Engineering. Singapore: IEEE, 2016. 708–719.
    [11] Pradel M, Sen K. DeepBugs: A learning approach to name-based bug detection. Proc. of the ACM on Programming Languages, 2018, 2: 147. [doi: 10.1145/3276517]
    [12] Zhang J, Wang X, Zhang HY, Sun HL, Liu XD, Hu CM, Liu Y. Detecting condition-related bugs with control flow graph neural network. In: Proc. of the 32nd ACM SIGSOFT Int’l Symp. on Software Testing and Analysis. Seattle: ACM, 2023. 1370–1382. [doi: 10.1145/3597926.3598142]
    [13] Zou DQ, Wang SJ, Xu SH, Li Z, Jin H. μVulDeePecker: A deep learning-based system for multiclass vulnerability detection. IEEE Trans. on Dependable and Secure Computing, 2021, 18(5): 2224–2236.
    [14] Zhou YQ, Liu SQ, Siow J, Du XN, Liu Y. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. In: Proc. of the 33rd Int’l Conf. on Neural Information Processing Systems. Vancouver: Curran Associates Inc., 2019. 915.
    [15] Nguyen VA, Nguyen DQ, Nguyen V, Le T, Tran QH, Phung D. ReGVD: Revisiting graph neural networks for vulnerability detection. In: Proc. of the 44th IEEE/ACM Int’l Conf. on Software Engineering: Companion Proc. Pittsburgh: IEEE, 2022. 178–182. [doi: 10.1145/3510454.3516865]
    [16] 李韵, 黄辰林, 王中锋, 袁露, 王晓川. 基于机器学习的软件漏洞挖掘方法综述. 软件学报, 2020, 31(7): 2040–2061. http://www.jos.org.cn/1000-9825/6055.htm
    Li Y, Huang CL, Wang ZF, Yuan L, Wang XC. Survey of software vulnerability mining methods based on machine learning. Ruan Jian Xue Bao/Journal of Software, 2020, 31(7): 2040–2061 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6055.htm
    [17] Fu M, Tantithamthavorn C. LineVul: A Transformer-based line-level vulnerability prediction. In: Proc. of the 19th IEEE/ACM Int’l Conf. on Mining Software Repositories. Pittsburgh: IEEE, 2022. 608–620. [doi: 10.1145/3524842.3528452]
    [18] Liu JY, Ai J, Lu MY, Wang J, Shi HX. Semantic feature learning for software defect prediction from source code and external knowledge. Journal of Systems and Software, 2023, 204: 111753.
    [19] Lu GL, Ju XL, Chen X, Pei WL, Cai ZL. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning. Journal of Systems and Software, 2024, 212: 112031.
    [20] Chen TY, Li L, Zhu LC, Li ZY, Liu XQ, Liang GT, Wang QX, Xie T. VulLibGen: Generating names of vulnerability-affected packages via a large language model. In: Proc. of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok: ACL, 2024. 9767–9780. [doi: 10.18653/v1/2024.acl-long.527]
    [21] Li HN, Hao Y, Zhai YZ, Qian ZY. Enhancing static analysis for practical bug detection: An LLM-integrated approach. Proc. of the ACM on Programming Languages, 2024, 8: 111. [doi: 10.1145/3649828]
    [22] Sun YQ, Wu DY, Xue Y, Liu H, Wang HJ, Xu ZZ, Xie XF, Liu Y. GPTScan: Detecting logic vulnerabilities in smart contracts by combining GPT with program analysis. In: Proc. of the 46th IEEE/ACM Int’l Conf. on Software Engineering. Lisbon: ACM, 2024. 166. [doi: 10.1145/3597503.3639117]
    [23] Yu JX, Liang P, Fu YJ, Tahir A, Shahin M, Wang C, Cai YX. An insight into security code review with LLMs: Capabilities, obstacles and influential factors. arXiv:2401.16310, 2024.
    [24] Sun YQ, Wu DY, Xue Y, Liu H, Ma W, Zhang LY, Liu Y, Li YJ. LLM4Vuln: A unified evaluation framework for decoupling and enhancing LLMs’ vulnerability reasoning. arXiv:2401.16185, 2024.
    [25] Liu YP, Wang J, Wei LL, Xu C, Cheung SC, Wu TY, Yan J, Zhang J. DroidLeaks: A comprehensive database of resource leaks in Android apps. Empirical Software Engineering, 2019, 24(6): 3435–3483.
    [26] TableInputFormat/TableRecordReaderImpl leaks HTable. 2014. https://github.com/apache/hbase/commit/e04009c9894b1ace20759c5f97f30126f3129aa3
    [27] HBase. TableInputFormat/TableRecordReaderImpl leaks HTable. 2014. https://issues.apache.org/jira/browse/HBASE-10330
    [28] 陈翔, 顾庆, 刘望舒, 刘树龙, 倪超. 静态软件缺陷预测方法研究. 软件学报, 2016, 27(1): 1–25. http://www.jos.org.cn/1000-9825/4923.htm
    Chen X, Gu Q, Liu WS, Liu SL, Ni C. Survey of static software defect prediction. Ruan Jian Xue Bao/Journal of Software, 2016, 27(1): 1–25 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4923.htm
    [29] Younis A, Malaiya Y, Anderson C, Ray I. To fear or not to fear that is the question: Code characteristics of a vulnerable function with an existing exploit. In: Proc. of the 6th ACM Conf. on Data and Application Security and Privacy. New Orleans: ACM, 2016. 97–104. [doi: 10.1145/2857705.2857750]
    [30] Shin Y, Williams L. Is complexity really the enemy of software security? In: Proc. of the 4th ACM Workshop on Quality of Protection. Alexandria: ACM, 2008. 47–50. [doi: 10.1145/1456362.1456372]
    [31] Li RH, Feng C, Zhang X, Tang CJ. A lightweight assisted vulnerability discovery method using deep neural networks. IEEE Access, 2019, 7: 80079–80092.
    [32] JavaParser. https://javaparser.org/
    [33] ANTLR. https://www.antlr.org/
    [34] Anbiya DR, Purwarianti A, Asnar Y. Vulnerability detection in PHP Web application using lexical analysis approach with machine learning. In: Proc. of the 5th Int’l Conf. on Data and Software Engineering. Mataram: IEEE, 2018. 1–6. [doi: 10.1109/ICODSE.2018.8705809]
    [35] Zhang J, Wang X, Zhang HY, Sun HL, Wang KX, Liu XD. A novel neural source code representation based on abstract syntax tree. In: Proc. of the 41st IEEE/ACM Int’l Conf. on Software Engineering. Montreal: IEEE, 2019. 783–794. [doi: 10.1109/ICSE.2019.00086]
    [36] Yamaguchi F, Golde N, Arp D, Rieck K. Modeling and discovering vulnerabilities with code property graphs. In: Proc. of the 2014 IEEE Symp. on Security and Privacy. Berkeley: IEEE, 2014. 590–604. [doi: 10.1109/SP.2014.44]
    [37] Wang HT, Ye GX, Tang ZY, Tan SH, Huang SF, Fang DY, Feng YS, Bian LZ, Wang Z. Combining graph-based learning with automated data collection for code vulnerability detection. IEEE Trans. on Information Forensics and Security, 2021, 16: 1943–1958.
    [38] Cheng X, Wang HY, Hua JY, Xu GA, Sui YL. DeepWukong: Statically detecting software vulnerabilities using deep graph neural network. ACM Trans. on Software Engineering and Methodology (TOSEM), 2021, 30(3): 38.
    [39] Han K, Xiao A, Wu EH, Guo JY, Xu CJ, Wang YH. Transformer in Transformer. In: Proc. of the 35th Int’l Conf. on Neural Information Processing Systems. Curran Associates Inc., 2021. 1217.
    [40] Ding YRB, Fu YJ, Ibrahim O, Sitawarin C, Chen XY, Alomair B, Wagner D, Ray B, Chen YZ. Vulnerability detection with code language models: How far are we? arXiv:2403.18624, 2024.
    [41] Chen XP, Hu X, Huang Y, Jiang H, Ji WX, Jiang YJ, Jiang YY, Liu B, Liu H, Li XC, Lian XL, Meng GZ, Peng GZ, Peng X, Sun HL, Shi L, Wang B, Wang C, Wang JY, Wang TT, Xuan JF, Xia X, Yang YB, Yang YX, Zhang L, Zhou YM, Zhang L. Deep learning-based software engineering: Progress, challenges, and opportunities. SCIENCE CHINA Information Sciences, 2025, 68(1): 111102.
    [42] LangChain. Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications. 2023. https://blog.langchain.dev/announcing-langsmith/
    [43] Khare A, Dutta S, Li ZY, Solko-Breslin A, Alur R, Naik M. Understanding the effectiveness of large language models in detecting security vulnerabilities. arXiv:2311.16169, 2023.
    [44] GPT-3.5 Turbo fine-tuning and API updates. 2023. https://openai.com/index/gpt-3-5-turbo-fine-tuning-and-api-updates/
    [45] GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses. 2024. https://openai.com/index/gpt-4/
    [46] Gemini Pro. 2024. https://deepmind.google/technologies/gemini/pro/
引用本文

刘天阳, 叶嘉威, 计卫星, 刘辉. Java程序资源泄露缺陷检测: 传统模型和语言模型的有效性分析. 软件学报, 2025, 36(6): 2432–2452.

历史
  • 收稿日期:2024-08-26
  • 最后修改日期:2024-10-14
  • 在线发布日期: 2024-12-10