[关键词]
[摘要]
面向对象软件度量是理解和保证面向对象软件质量的重要手段之一. 通过将面向对象软件的度量值与其阈值比较, 可简单直观评价其是否有可能包含缺陷. 确定度量阈值方法主要有基于数据分布特征的无监督学习方法和基于缺陷相关性的有监督学习方法. 两类方法各有利弊: 无监督学习方法无需标签信息而易于实现, 但所得阈值的缺陷预测性能通常较差; 有监督学习方法通过机器学习算法提升所得阈值的缺陷预测性能, 但标签信息在实际过程中不易获得且度量与缺陷链接技术复杂. 近年来, 两类方法的研究者不断探索并取得较大进展. 同时, 面向对象软件度量阈值确定方法研究仍存在一些亟待解决的挑战. 对近年来国内外学者在该领域的研究成果进行系统性的总结. 首先, 阐述面向对象软件度量阈值确定方法的研究问题. 其次, 分别从无监督学习方法和有监督学习方法总结相关研究进展, 并梳理具体的理论和实现的技术路径. 然后, 简要介绍面向对象软件度量阈值的其他相关技术. 最后, 总结当前该领域研究过程面临的挑战并给出建议的研究方向.
[Keywords]
[Abstract]
Object-oriented software metrics are an important means of understanding and guaranteeing the quality of object-oriented software. By comparing the metric values of object-oriented software with their thresholds, one can simply and intuitively assess whether the software is likely to contain defects. Methods for deriving metric thresholds fall mainly into two groups: unsupervised learning methods based on the distribution of metric data, and supervised learning methods based on the relationship between metrics and defect-proneness. Each group has its own advantages and disadvantages. Unsupervised methods require no label information and are easy to implement, but the resulting thresholds usually perform poorly in defect prediction. Supervised methods use machine learning algorithms to improve the defect prediction performance of the resulting thresholds, but the label information they need is hard to obtain in practice, and the techniques for linking metrics to defects are complex. In recent years, researchers working on both types of methods have continued to explore the problem and have made considerable progress. Nevertheless, deriving thresholds for object-oriented software metrics still poses challenges that need to be addressed. This paper presents a systematic survey of recent research achievements in this field. First, the research problem of deriving thresholds for object-oriented software metrics is introduced. Second, the research progress is summarized from two perspectives, unsupervised and supervised learning methods, and the underlying theory and technical approaches are outlined. Then, other techniques related to object-oriented software metric thresholds are briefly discussed. Finally, the challenges currently facing the field are summarized and future research directions are suggested.
[Chinese Library Classification number]
[Funding]
National Natural Science Foundation of China (62172205)