Cross-project Issue Recommendation Method for Open-source Software Defects

CLC number: TP311

Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0112901); National Natural Science Foundation of China (62177003)

    Abstract:

    GitHub is a well-known open-source software development community in which developers use each project's issue tracking system to handle issues. While discussing a defect issue, developers may point to related issues in other projects (called cross-project related issues), which provide reference information for fixing the defect. However, GitHub hosts more than 200 million open-source projects and 1.2 billion issues, so identifying and retrieving cross-project related issues manually is extremely time-consuming. This paper proposes CPIRecom, a method that automatically recommends cross-project related issues for defect issues. First, to build a pre-selection set, the method filters issues by the number of historical related issue pairs between projects and by the time interval between issue reports. Second, for accurate recommendation, it extracts textual features with the pre-trained BERT model and analyzes project features. It then uses a random forest classifier to compute the probability that each pre-selected issue is related to the defect issue, and ranks the issues by that probability to obtain the recommendation list. In an experiment that simulates the use of CPIRecom on the GitHub platform, the method achieves a mean reciprocal rank of 0.603 and a Recall@5 of 0.715 on the simulated test set.
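    The two reported metrics, mean reciprocal rank and Recall@5, can be illustrated with a minimal sketch. It assumes the standard definitions (reciprocal rank of the first related issue, and the fraction of related issues found in the top k); the abstract does not spell out the exact formulas used in the paper. The function names, the issue identifiers, and the toy data are all hypothetical.

```python
from typing import Hashable, Iterable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Return the fraction of relevant items that appear in the top k.

    Assumes the ranked list contains no duplicate items.
    """
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def mean(values: Iterable[float]) -> float:
    """Arithmetic mean that returns 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical toy data: each entry pairs a ranked recommendation list
# with the set of issues that are actually related to the defect issue.
queries = [
    (["a/x#1", "b/y#7", "c/z#3"], {"b/y#7"}),           # first hit at rank 2
    (["d/w#4", "e/v#9", "f/u#2"], {"d/w#4", "f/u#2"}),  # first hit at rank 1
]

mrr = mean(reciprocal_rank(ranked, rel) for ranked, rel in queries)
recall_at_2 = mean(recall_at_k(ranked, rel, 2) for ranked, rel in queries)
```

    With real data, `k` would be 5 to match Recall@5, and each ranked list would come from sorting the pre-selected issues by the classifier's predicted probability in descending order.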

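Cite this article

Liu Baochuan (刘宝川), Zhang Li (张莉), Liu Zhenwei (刘桢炜), Jiang Jing (蒋竞). Cross-project Issue Recommendation Method for Open-source Software Defects. Journal of Software (软件学报),,():1-19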

History
  • Received: 2022-11-03
  • Last revised: 2023-01-07
  • Published online: 2023-10-25