Java程序资源泄露缺陷检测: 传统模型和语言模型的有效性分析
作者: 刘天阳, 叶嘉威, 计卫星, 刘辉

通讯作者: 计卫星, E-mail: jwx@bnu.edu.cn

中图分类号: TP311

基金项目: 国家自然科学基金重点项目(62232003)


Detection of Resource Leaks in Java Programs: Effectiveness Analysis of Traditional Models and Language Models
Author: LIU Tian-Yang, YE Jia-Wei, JI Wei-Xing, LIU Hui
    摘要:

    资源泄露是由于有限的系统资源未能及时正确关闭所导致的缺陷, 广泛存在于各种语言编写的程序中, 且具有一定的隐蔽性. 传统的缺陷检测方法通常基于规则和启发式搜索预测软件中的资源泄露. 近年来, 基于深度学习的缺陷检测方法通过不同的代码表征形式, 并使用循环神经网络、图神经网络等技术捕获代码中的语义信息. 最近的研究显示, 语言模型在代码理解和生成等任务中表现出色. 然而, 语言模型在资源泄露检测这一特定任务上的优势和局限性尚未得到充分评估. 本文研究基于传统模型、小模型和大模型的检测方法在资源泄露检测任务中的有效性, 并探究小样本学习、微调以及静态分析与大模型相结合等多种改进方式. 具体而言, 以JLeaks和DroidLeaks数据集为实验对象, 从资源泄露根本原因、资源种类、代码复杂度等多个维度分析不同模型的表现. 实验结果表明, 微调技术能够显著提升大模型在资源泄露检测领域的检测效果. 然而, 大部分模型在识别第三方库引发的资源泄露上仍需改进. 此外, 代码复杂度对基于传统模型的检测方法的影响更大.

    Abstract:

    Resource leaks are defects caused by the failure to close limited system resources in a timely and correct manner. They are widely present in programs written in various languages and can easily go unnoticed. Traditional defect detection methods usually predict resource leaks in software based on rules and heuristic search. In recent years, deep learning-based defect detection methods have captured semantic information in code through different code representations and techniques such as recurrent neural networks and graph neural networks. Recent studies show that language models perform remarkably well in tasks such as code understanding and generation. However, the advantages and limitations of large language models (LLMs) in the specific task of resource leak detection have not been fully evaluated. This study evaluates the effectiveness of detection methods based on traditional models, small models, and LLMs on the resource leak detection task, and explores several improvement strategies, including few-shot learning, fine-tuning, and the combination of static analysis with LLMs. Specifically, taking the JLeaks and DroidLeaks datasets as experimental subjects, the performance of different models is analyzed along multiple dimensions, such as the root causes of resource leaks, resource types, and code complexity. The experimental results show that fine-tuning significantly improves the detection performance of LLMs on resource leak detection. However, most models still struggle to identify resource leaks caused by third-party libraries. In addition, code complexity has a greater impact on detection methods based on traditional models.
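
    To make the defect class concrete, the following is a minimal, illustrative Java sketch (not drawn from JLeaks or DroidLeaks; the class and method names are hypothetical) showing a typical resource leak on an exception path and its try-with-resources fix:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LeakExample {
        // Leaky version: if readLine() throws, close() is never reached,
        // so the underlying file handle (a limited system resource) leaks.
        static String firstLineLeaky(String path) throws IOException {
            BufferedReader reader = new BufferedReader(new FileReader(path));
            String line = reader.readLine(); // may throw before close()
            reader.close();                  // skipped on the exception path
            return line;
        }

        // Fixed version: try-with-resources closes the reader on all paths,
        // including exceptional ones.
        static String firstLineFixed(String path) throws IOException {
            try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
                return reader.readLine();
            }
        }
    }

    Patterns like firstLineLeaky, where a close() call exists but is not guaranteed on every control-flow path, are the kind of defect the evaluated detectors are expected to flag.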

    参考文献
    [1] Ghanavati M, Costa D, Seboek J, Lo D, Andrzejak A. Memory and resource leak defects and their repairs in Java projects. Empirical Software Engineering, 2020, 25(1): 678–718.
    [2] Liu TY, Ji WX, Dong XH, Yao WH, Wang YZ, Liu H, Peng HY, Wang YX. JLeaks: A featured resource leak repository collected from hundreds of open-source Java projects. In: Proc. of the 46th IEEE/ACM Int’l Conf. on Software Engineering. Lisbon: IEEE, 2024. 1–13.
    [3] Kellogg M, Shadab N, Sridharan M, Ernst MD. Lightweight and modular resource leak verification. In: Proc. of the 29th ACM Joint Meeting on European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. Athens: ACM, 2021. 181–192. [doi: 10.1145/3468264.3468576]
    [4] Utture A, Palsberg J. From leaks to fixes: Automated repairs for resource leak warnings. In: Proc. of the 31st ACM Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. San Francisco: ACM, 2023. 159–171. [doi: 10.1145/3611643.3616267]
    [5] Wang C, Liu JN, Peng X, Liu Y, Lou YL. Boosting static resource leak detection via LLM-based resource-oriented intention inference. arXiv:2311.04448, 2023.
    [6] Lo D, Nagappan N, Zimmermann T. How practitioners perceive the relevance of software engineering research. In: Proc. of the 10th Joint Meeting on Foundations of Software Engineering. Bergamo: ACM, 2015. 415–425. [doi: 10.1145/2786805.2786809]
    [7] Wang C, Lou YL, Peng X, Liu JN, Zou BH. Mining resource-operation knowledge to support resource leak detection. In: Proc. of the 31st ACM Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. San Francisco: ACM, 2023. 986–998. [doi: 10.1145/3611643.361631]
    [8] Li W, Cai HP, Sui YL, Manz D. PCA: Memory leak detection using partial call-path analysis. In: Proc. of the 28th ACM Joint Meeting on European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. New York: ACM, 2020. 1621–1625. [doi: 10.1145/3368089.3417923]
    [9] PMD source code analyzer. 2002. https://pmd.github.io
    [10] Wang S, Chollak D, Movshovitz-Attias D, Tan L. Bugram: Bug detection with n-gram language models. In: Proc. of the 31st IEEE/ACM Int’l Conf. on Automated Software Engineering. Singapore: IEEE, 2016. 708–719.
    [11] Pradel M, Sen K. DeepBugs: A learning approach to name-based bug detection. Proc. of the ACM on Programming Languages, 2018, 2: 147. [doi: 10.1145/3276517]
    [12] Zhang J, Wang X, Zhang HY, Sun HL, Liu XD, Hu CM, Liu Y. Detecting condition-related bugs with control flow graph neural network. In: Proc. of the 32nd ACM SIGSOFT Int’l Symp. on Software Testing and Analysis. Seattle: ACM, 2023. 1370–1382. [doi: 10.1145/3597926.3598142]
    [13] Zou DQ, Wang SJ, Xu SH, Li Z, Jin H. μVulDeePecker: A deep learning-based system for multiclass vulnerability detection. IEEE Trans. on Dependable and Secure Computing, 2021, 18(5): 2224–2236.
    [14] Zhou YQ, Liu SQ, Siow J, Du XN, Liu Y. Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. In: Proc. of the 33rd Int’l Conf. on Neural Information Processing Systems. Vancouver: Curran Associates Inc., 2019. 915.
    [15] Nguyen VA, Nguyen DQ, Nguyen V, Le T, Tran QH, Phung D. ReGVD: Revisiting graph neural networks for vulnerability detection. In: Proc. of the 44th IEEE/ACM Int’l Conf. on Software Engineering: Companion Proc. Pittsburgh: IEEE, 2022. 178–182. [doi: 10.1145/3510454.3516865]
    [16] 李韵, 黄辰林, 王中锋, 袁露, 王晓川. 基于机器学习的软件漏洞挖掘方法综述. 软件学报, 2020, 31(7): 2040–2061. http://www.jos.org.cn/1000-9825/6055.htm
    Li Y, Huang CL, Wang ZF, Yuan L, Wang XC. Survey of software vulnerability mining methods based on machine learning. Ruan Jian Xue Bao/Journal of Software, 2020, 31(7): 2040–2061 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/6055.htm
    [17] Fu M, Tantithamthavorn C. LineVul: A Transformer-based line-level vulnerability prediction. In: Proc. of the 19th IEEE/ACM Int’l Conf. on Mining Software Repositories. Pittsburgh: IEEE, 2022. 608–620. [doi: 10.1145/3524842.3528452]
    [18] Liu JY, Ai J, Lu MY, Wang J, Shi HX. Semantic feature learning for software defect prediction from source code and external knowledge. Journal of Systems and Software, 2023, 204: 111753.
    [19] Lu GL, Ju XL, Chen X, Pei WL, Cai ZL. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning. Journal of Systems and Software, 2024, 212: 112031.
    [20] Chen TY, Li L, Zhu LC, Li ZY, Liu XQ, Liang GT, Wang QX, Xie T. VulLibGen: Generating names of vulnerability-affected packages via a large language model. In: Proc. of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok: ACL, 2024. 9767–9780. [doi: 10.18653/v1/2024.acl-long.527]
    [21] Li HN, Hao Y, Zhai YZ, Qian ZY. Enhancing static analysis for practical bug detection: An LLM-integrated approach. Proc. of the ACM on Programming Languages, 2024, 8: 111. [doi: 10.1145/3649828]
    [22] Sun YQ, Wu DY, Xue Y, Liu H, Wang HJ, Xu ZZ, Xie XF, Liu Y. GPTScan: Detecting logic vulnerabilities in smart contracts by combining GPT with program analysis. In: Proc. of the 46th IEEE/ACM Int’l Conf. on Software Engineering. Lisbon: ACM, 2024. 166. [doi: 10.1145/3597503.3639117]
    [23] Yu JX, Liang P, Fu YJ, Tahir A, Shahin M, Wang C, Cai YX. An insight into security code review with LLMs: Capabilities, obstacles and influential factors. arXiv:2401.16310, 2024.
    [24] Sun YQ, Wu DY, Xue Y, Liu H, Ma W, Zhang LY, Liu Y, Li YJ. LLM4Vuln: A unified evaluation framework for decoupling and enhancing LLMs’ vulnerability reasoning. arXiv:2401.16185, 2024.
    [25] Liu YP, Wang J, Wei LL, Xu C, Cheung SC, Wu TY, Yan J, Zhang J. DroidLeaks: A comprehensive database of resource leaks in Android apps. Empirical Software Engineering, 2019, 24(6): 3435–3483.
    [26] TableInputFormat/TableRecordReaderImpl leaks HTable. 2014. https://github.com/apache/hbase/commit/e04009c9894b1ace20759c5f97f30126f3129aa3
    [27] HBase. TableInputFormat/TableRecordReaderImpl leaks HTable. 2014. https://issues.apache.org/jira/browse/HBASE-10330
    [28] 陈翔, 顾庆, 刘望舒, 刘树龙, 倪超. 静态软件缺陷预测方法研究. 软件学报, 2016, 27(1): 1–25. http://www.jos.org.cn/1000-9825/4923.htm
    Chen X, Gu Q, Liu WS, Liu SL, Ni C. Survey of static software defect prediction. Ruan Jian Xue Bao/Journal of Software, 2016, 27(1): 1–25 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4923.htm
    [29] Younis A, Malaiya Y, Anderson C, Ray I. To fear or not to fear that is the question: Code characteristics of a vulnerable function with an existing exploit. In: Proc. of the 6th ACM Conf. on Data and Application Security and Privacy. New Orleans: ACM, 2016. 97–104. [doi: 10.1145/2857705.2857750]
    [30] Shin Y, Williams L. Is complexity really the enemy of software security? In: Proc. of the 4th ACM Workshop on Quality of Protection. Alexandria: ACM, 2008. 47–50. [doi: 10.1145/1456362.1456372]
    [31] Li RH, Feng C, Zhang X, Tang CJ. A lightweight assisted vulnerability discovery method using deep neural networks. IEEE Access, 2019, 7: 80079–80092.
    [32] JavaParser. https://javaparser.org/
    [33] ANTLR. https://www.antlr.org/
    [34] Anbiya DR, Purwarianti A, Asnar Y. Vulnerability detection in PHP Web application using lexical analysis approach with machine learning. In: Proc. of the 5th Int’l Conf. on Data and Software Engineering. Mataram: IEEE, 2018. 1–6. [doi: 10.1109/ICODSE.2018.8705809]
    [35] Zhang J, Wang X, Zhang HY, Sun HL, Wang KX, Liu XD. A novel neural source code representation based on abstract syntax tree. In: Proc. of the 41st IEEE/ACM Int’l Conf. on Software Engineering. Montreal: IEEE, 2019. 783–794. [doi: 10.1109/ICSE.2019.00086]
    [36] Yamaguchi F, Golde N, Arp D, Rieck K. Modeling and discovering vulnerabilities with code property graphs. In: Proc. of the 2014 IEEE Symp. on Security and Privacy. Berkeley: IEEE, 2014. 590–604. [doi: 10.1109/SP.2014.44]
    [37] Wang HT, Ye GX, Tang ZY, Tan SH, Huang SF, Fang DY, Feng YS, Bian LZ, Wang Z. Combining graph-based learning with automated data collection for code vulnerability detection. IEEE Trans. on Information Forensics and Security, 2021, 16: 1943–1958.
    [38] Cheng X, Wang HY, Hua JY, Xu GA, Sui YL. DeepWukong: Statically detecting software vulnerabilities using deep graph neural network. ACM Trans. on Software Engineering and Methodology (TOSEM), 2021, 30(3): 38.
    [39] Han K, Xiao A, Wu EH, Guo JY, Xu CJ, Wang YH. Transformer in Transformer. In: Proc. of the 35th Int’l Conf. on Neural Information Processing Systems. Curran Associates Inc., 2021. 1217.
    [40] Ding YRB, Fu YJ, Ibrahim O, Sitawarin C, Chen XY, Alomair B, Wagner D, Ray B, Chen YZ. Vulnerability detection with code language models: How far are we? arXiv:2403.18624, 2024.
    [41] Chen XP, Hu X, Huang Y, Jiang H, Ji WX, Jiang YJ, Jiang YY, Liu B, Liu H, Li XC, Lian XL, Meng GZ, Peng GZ, Peng X, Sun HL, Shi L, Wang B, Wang C, Wang JY, Wang TT, Xuan JF, Xia X, Yang YB, Yang YX, Zhang L, Zhou YM, Zhang L. Deep learning-based software engineering: Progress, challenges, and opportunities. SCIENCE CHINA Information Sciences, 2025, 68(1): 111102.
    [42] LangChain. Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications. 2023. https://blog.langchain.dev/announcing-langsmith/
    [43] Khare A, Dutta S, Li ZY, Solko-Breslin A, Alur R, Naik M. Understanding the effectiveness of large language models in detecting security vulnerabilities. arXiv:2311.16169, 2023.
    [44] GPT-3.5 Turbo fine-tuning and API updates. 2023. https://openai.com/index/gpt-3-5-turbo-fine-tuning-and-api-updates/
    [45] GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses. 2024. https://openai.com/index/gpt-4/
    [46] Gemini Pro. 2024. https://deepmind.google/technologies/gemini/pro/
引用本文

刘天阳, 叶嘉威, 计卫星, 刘辉. Java程序资源泄露缺陷检测: 传统模型和语言模型的有效性分析. 软件学报, 2025, 36(6): 2432–2452.

历史
  • 收稿日期:2024-08-26
  • 最后修改日期:2024-10-14
  • 在线发布日期: 2024-12-10