基于知识图谱的跨项目安全缺陷报告预测方法
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

TP311

基金项目:

国家自然科学基金(62202414,62141208);国家重点研发计划(2020YFC0833105Z1)


Cross-project Prediction Method of Security Bug Reports Based on Knowledge Graph
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    安全缺陷报告可以描述软件产品中的安全关键漏洞.为了消除软件产品的安全攻击风险,安全缺陷报告(security bug report,SBR)预测越来越受到研究人员的关注.但在实际软件开发场景中,需要进行软件安全漏洞预测的项目可能是来自新公司或属于新启动的项目,没有足够的已标记安全缺陷报告供在实践中构建此软件安全漏洞预测模型.一种简单的解决方案就是使用迁移模型,即利用其他项目已经标记过的数据来构建预测模型.受到该领域最近的两项研究工作的启发,以安全关键字过滤为思路提出一种融合知识图谱的跨项目安全缺陷报告预测方法KG-SBRP (knowledge graph of security bug report prediction).使用安全缺陷报告中的文本信息域结合CWE (common weakness enumeration)与CVE Details (common vulnerabilities and exposures)共同构建三元组规则实体,以三元组规则实体构建安全漏洞知识图谱,在图谱中结合实体及其关系识别安全缺陷报告.将数据分为训练集和测试集进行模型拟合和性能评估.所构建的模型在7个不同规模的安全缺陷报告数据集上展开实证研究,研究结果表明,所提方法与当前主流方法FARSEC和Keyword matrix相比,在跨项目安全缺陷报告预测场景下,性能指标F1-score值可以平均提高11%,除此之外,在项目内安全缺陷报告预测场景下,F1-score值同样可以平均提高30%.

    Abstract:

    Security bug reports (SBRs) can describe critical security vulnerabilities in software products. SBR prediction has attracted the increasing attention of researchers to eliminate security attack risks of software products. However, in actual software development scenarios, a new company or new project may need software security bug prediction, without enough marked SBRs for building SBR prediction models in practice. A simple solution is employing the migration model, which means that marked data of other projects can be adopted to build the prediction model. Inspired by two recent studies in this field, this study puts forward a cross-project SBR prediction method integrating knowledge graphs, i.e., knowledge graph of security bug report prediction (KG-SBRP), based on the idea of security keyword filtering. The text information field in SBR is combined with common weakness enumeration (CWE) and common vulnerabilities and exposures (CVE) Details to build a triple rule entity. Then the entity is utilized to build a knowledge graph of security bugs and identify SBRs by combining the entity and relationship recognition. Finally, the data is divided into training sets and test sets for model fitting and performance evaluation. The built model conducts empirical research on seven SBR datasets with different scales. The results show that compared with the current main methods FARSEC and Keyword matrix, the proposed method can increase the performance index F1-score by an average of 11% under cross-project SBR prediction scenarios. In addition, the F1-score value can also grow by an average of 30% in SBR prediction scenarios within a project.

    参考文献
    相似文献
    引证文献
引用本文

郑炜,刘程远,吴潇雪,陈翔,成婧源,孙小兵,孙瑞阳.基于知识图谱的跨项目安全缺陷报告预测方法.软件学报,2024,35(3):1257-1279

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2022-01-06
  • 最后修改日期:2022-06-26
  • 录用日期:
  • 在线发布日期: 2023-07-05
  • 出版日期:
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号