National Natural Science Foundation of China (62102014, 62177003); Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2021ZX-10)
In large-scale, complex software systems, requirements are analyzed and generated through top-down, level-by-level decomposition, and building traceability links between cross-level requirements is essential for project management, development, and evolution. The loosely coupled contribution model of open-source systems requires every participant to readily understand the context and status of each requirement, which depends on cross-level requirement tracing. Issue description logs are a common way of presenting requirements in open-source systems. They follow no fixed template, their content is diverse (text, code, debugging information, and so on), terminology is used freely, and the gap in abstraction level between cross-level requirements is large, all of which make automatic tracing very challenging. This paper proposes a relevance feedback method oriented to key feature dimensions. Through static analysis of the project's code structure, code-related terms and the strength of the associations among them are extracted to build a code vocabulary base, which alleviates the abstraction gap and the inconsistent terminology between cross-level requirements. The importance of each term to a requirement description is then measured and used to select key feature dimensions, so that the query can be optimized in a targeted way, effectively reducing the noise introduced by the length and content form of requirement descriptions. Experiments in two scenarios on requirement sets from three open-source systems show that the proposed method outperforms the baseline methods in cross-level requirement tracing: compared with the vector space model (VSM), Standard Rocchio, and Trace BERT, the F2 score improves by up to 29.01%, 7.75%, and 59.21%, respectively.
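To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows a textbook Rocchio query update followed by a top-k filter on term weights, and the standard F-beta score with beta = 2. The function names, the default values of alpha, beta, gamma, and k, and the use of the largest updated weights as a stand-in for "key feature dimensions" are all assumptions made for illustration. The code vocabulary base built by static analysis is not modeled here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio_key_dims(query, relevant, non_relevant,
                     k=5, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update, then keep only the k heaviest terms.

    query, and each element of relevant / non_relevant, is a sparse
    vector (dict: term -> weight). Keeping the top-k terms is an
    illustrative stand-in for key-feature-dimension selection.
    """
    terms = set(query)
    for doc in relevant + non_relevant:
        terms |= set(doc)

    updated = {}
    for t in terms:
        # Centroid weight of the term in the relevant / non-relevant sets.
        pos = sum(d.get(t, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(t, 0.0) for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        weight = alpha * query.get(t, 0.0) + beta * pos - gamma * neg
        if weight > 0:  # negative weights are dropped, as is common practice
            updated[t] = weight

    # Sort by descending weight; break ties alphabetically for determinism.
    top = sorted(updated.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return dict(top)


def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta = 2 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0
```

In such a sketch, the refined query returned by `rocchio_key_dims` would be compared against candidate lower-level requirements with `cosine`, and the resulting links scored with `f_beta`. F2 is a natural choice for traceability tasks because a missed link is usually costlier than a false candidate that an analyst can reject.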