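[Keywords (Chinese)]
[Abstract (translated from Chinese)]
In the development and maintenance of large software projects, locating defects among a large number of code files is time-consuming and laborious; effective automatic software defect localization can greatly reduce development cost. Software defect reports usually contain a great deal of information about previously undiscovered defects, so accurately finding the code files associated with a defect report is important for reducing maintenance cost. Some existing defect localization techniques based on deep neural networks improve on traditional methods, but most of this work focuses on the design of the network structure and gives little study to the loss function used in training, even though the loss function can greatly affect performance on prediction tasks. Against this background, a cost-sensitive margin distribution optimization (CSMDO) loss function is proposed, and a cost-sensitive margin distribution optimization layer is applied to a deep convolutional neural network; this handles the imbalance of software defect data well and further improves the accuracy of defect localization.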
[Keywords]
[Abstract]
Identifying bugs among the numerous source code files of a large software project is costly, so locating bugs automatically and effectively is a worthwhile problem. Bug reports are one of the most valuable sources of bug descriptions, and precisely locating the source code files linked to a bug report can help reduce software development cost. Currently, most research on bug localization based on deep neural networks focuses on the design of network structures and pays little attention to the loss function, which significantly affects performance in prediction tasks. In this paper, a cost-sensitive margin distribution optimization (CSMDO) loss function is proposed and applied to deep neural networks. The new method is capable of handling the imbalance of software defect data sets and significantly improves localization accuracy.
[CLC number]
[Funding]
National Natural Science Foundation of China (61422304, 61272217)