Bug Prediction Method for Fine-Grained Source Code Changes

Fund Project:

National Natural Science Foundation of China (90718018); National High Technology Research and Development Program of China (863 Program) (2007AA010302)

    Abstract:

    Software changes continually throughout its lifecycle to adapt to changing requirements and environments. To predict in a timely manner whether each change introduces a bug, researchers have proposed bug prediction methods for source code changes. Existing methods, however, have three deficiencies: (1) prediction is limited to coarse granularities (transaction-level and file-level changes); (2) changes are represented only with a vector space model, so the program structure, natural language semantics, and history information stored in software repositories are not fully mined; (3) only short-term prediction is explored, without considering the concept drift that new requirements, team restructuring, or other external factors cause over long-term software evolution. To address these deficiencies, a bug prediction method for source code changes is proposed. The method takes fine-grained (statement-level) changes as the prediction target, which effectively reduces quality assurance cost. It mines software repositories in depth by combining static program analysis with natural language semantic topic inference, and constructs feature sets from four aspects of a change (context, content, time, and developer), revealing the factors that make a change prone to introducing bugs. The characteristics of concept drift during software evolution are analyzed with a feature entropy difference matrix, and a dynamic window learning mechanism with concept reviewing achieves stable long-term prediction. Experiments on six well-known open source projects demonstrate the effectiveness of the method.
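
The abstract does not define the feature entropy difference matrix, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that changes are grouped into chronologically ordered windows, that every feature has been discretized, and that each matrix entry is the difference in Shannon entropy of one feature between two consecutive windows. The function names, the feature names, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_difference_matrix(windows, feature_names):
    """Assumed definition: rows are features, columns are consecutive window pairs.

    Entry [i][j] is H(feature i in window j+1) - H(feature i in window j).
    """
    entropies = [
        [shannon_entropy([change[name] for change in window]) for window in windows]
        for name in feature_names
    ]
    return [
        [row[j + 1] - row[j] for j in range(len(row) - 1)]
        for row in entropies
    ]


# Hypothetical toy data: each change is a dict of discretized feature values.
windows = [
    [{"developer": "a", "hour": "day"}, {"developer": "a", "hour": "day"}],
    [{"developer": "a", "hour": "day"}, {"developer": "b", "hour": "day"}],
]
matrix = entropy_difference_matrix(windows, ["developer", "hour"])
```

Under this reading, a large absolute entry marks a feature whose distribution shifted between two windows (here the `developer` feature, because a second developer appears), which is one way a drift in the underlying concept could be made visible.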

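Cite this article

Yuan Zi (原子), Yu Lili (于莉莉), Liu Chao (刘超). Bug prediction method for fine-grained source code changes. Journal of Software (软件学报), 2014,25(11):2499-2517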

History
  • Received: 2013-10-28
  • Last revised: 2013-12-26
  • Published online: 2014-11-05