Abstract: Resource leaks are defects caused by failing to close limited system resources in a timely and proper manner; they are widespread in programs written in many languages and are often hard to detect. Traditional defect detection methods typically identify resource leaks using rules and heuristic search. In recent years, deep-learning-based detection methods have captured semantic information in code through different code representations, using techniques such as recurrent neural networks and graph neural networks. Recent studies show that large language models (LLMs) perform remarkably well on tasks such as code understanding and generation, but their advantages and limitations on the specific task of resource leak detection have not been fully evaluated. This study evaluates the effectiveness of detection methods based on traditional models, small models, and LLMs for resource leak detection, and explores several improvement strategies, including few-shot learning, fine-tuning, and combining static analysis with LLMs. Specifically, using the JLeaks and DroidLeaks datasets, the performance of different models is analyzed along multiple dimensions, such as the root causes of resource leaks, resource types, and code complexity. The experimental results show that fine-tuning significantly improves the detection performance of LLMs on resource leaks, yet most models still struggle to identify leaks caused by third-party libraries. In addition, code complexity has a greater impact on resource leak detection methods based on traditional models.
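For illustration, the following is a minimal Java sketch of the kind of defect this study targets (a hypothetical example, not drawn from the JLeaks or DroidLeaks datasets): a file stream that is not closed on the exceptional path leaks a file descriptor, while try-with-resources closes it on all paths.

import java.io.FileInputStream;
import java.io.IOException;

public class LeakExample {
    // Leaky: if read() throws, close() is never reached and the
    // file descriptor (a limited system resource) is leaked.
    static int firstByteLeaky(String path) throws IOException {
        FileInputStream in = new FileInputStream(path);
        int b = in.read();
        in.close(); // skipped when read() throws
        return b;
    }

    // Safe: try-with-resources closes the stream on every path,
    // including the exceptional one.
    static int firstByteSafe(String path) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            return in.read();
        }
    }
}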