Automatic Generation of Large-Granularity Pull Request Description

doi:10.13328/j.cnki.jos.006239

Journal of Software (软件学报), 2021, Vol. 32, No. 6, pp. 1597-1611

Authors:

KUANG Li (邝砾), School of Computer Science and Engineering, Central South University, Changsha 410083, China
SHI Ru-Yi (施如意), School of Computer Science and Engineering, Central South University, Changsha 410083, China
ZHAO Lei-Hao (赵雷浩), School of Computer Science and Engineering, Central South University, Changsha 410083, China
ZHANG Huan (张欢), School of Computer Science and Engineering, Central South University, Changsha 410083, China
GAO Hong-Hao (高洪皓), School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China

Author biographies:

KUANG Li (1982-), female, Ph.D., professor, doctoral supervisor, CCF professional member. Research interests: service computing, crowd-intelligence software ecosystems, machine learning.
ZHANG Huan (1996-), female, master's degree. Research interests: machine learning, data mining, crowd-intelligence software ecosystems.
SHI Ru-Yi (1996-), female, master's degree. Research interests: service computing, crowd-intelligence software ecosystems.
GAO Hong-Hao (1985-), male, Ph.D., associate professor, CCF senior member. Research interests: formal verification of software, collaborative service computing, wireless networks and the industrial Internet of Things, intelligent medical image processing.
ZHAO Lei-Hao (1998-), male, bachelor's degree. Research interests: service computing.

Corresponding author: GAO Hong-Hao, gaohonghao@shu.edu.cn

Fund Project: National Key R&D Program of China (2018YFB1003800); National Natural Science Foundation of China (61772560)

Abstract:

On GitHub, many contributors omit the description when submitting a pull request (PR), which makes the PR more likely to be overlooked or rejected by reviewers. Automatically generating PR descriptions can therefore help contributors raise their PR acceptance rate. However, the performance of existing PR description generation methods depends on PR granularity, and they cannot effectively generate descriptions for large-granularity PRs. This work therefore focuses on automatically generating descriptions for large-granularity PRs. First, the text in a PR is preprocessed and a word-sentence heterogeneous graph is constructed, with words serving as auxiliary nodes, to establish connections between the PR's sentences. Next, features are extracted from the heterogeneous graph and fed into a graph neural network for graph representation learning; through message passing between nodes, the sentence nodes learn richer content information. Finally, the sentences carrying key information are selected to form the PR description. In addition, because the PR dataset lacks manually annotated ground-truth labels, supervised learning is not possible, so reinforcement learning is used to guide description generation. The model is trained to minimize the negative expected reward, a process that requires no labels and directly improves the quality of the generated results. Experiments on a real-world dataset show that the proposed method outperforms existing methods in F1 score and readability.

Key words: Pull Request description; heterogeneous graph neural network; reinforcement learning; unstructured document; summary generation
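
The word-sentence heterogeneous graph named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the TF-IDF edge weighting, the sample sentences, and the function names `tokenize`, `build_word_sentence_graph`, and `sentence_neighbors` are all assumptions made for illustration. It shows only how word nodes can connect sentences that share vocabulary, which is the structure over which a graph neural network would then pass messages.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "to", "of", "in", "and", "for", "is", "on"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def build_word_sentence_graph(sentences):
    """Return {(word, sentence_index): weight} edges of a bipartite graph.

    Words act as auxiliary nodes; an edge links a word to every sentence
    that contains it, weighted by TF-IDF (an assumed weighting scheme).
    """
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    doc_freq = Counter(w for toks in tokens for w in set(toks))
    edges = {}
    for i, toks in enumerate(tokens):
        for word, tf in Counter(toks).items():
            idf = math.log((1 + n) / (1 + doc_freq[word])) + 1  # smoothed IDF
            edges[(word, i)] = tf / len(toks) * idf
    return edges


def sentence_neighbors(edges):
    """Map each sentence index to the sentences it reaches through a shared word."""
    by_word = defaultdict(set)
    for word, i in edges:
        by_word[word].add(i)
    neighbors = defaultdict(set)
    for members in by_word.values():
        for i in members:
            neighbors[i] |= members - {i}
    return dict(neighbors)


# Made-up PR sentences (e.g. commit messages), not taken from the paper's dataset.
pr_sentences = [
    "Fix crash in the config parser",
    "Add tests for the config parser",
    "Update README badges",
]
graph = build_word_sentence_graph(pr_sentences)
links = sentence_neighbors(graph)
```

In this toy input the first two sentences become two-hop neighbors through the shared word nodes `config` and `parser`, while the third sentence stays isolated. A real system would attach learned feature vectors to both node types rather than relying on the edge weights alone.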

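Cite this article:

KUANG Li, SHI Ru-Yi, ZHAO Lei-Hao, ZHANG Huan, GAO Hong-Hao. Automatic generation of large-granularity Pull Request description. Journal of Software (软件学报), 2021, 32(6): 1597-1611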

History

Received: 2020-08-09
Last revised: 2020-10-26
Published online: 2021-02-07
Publication date: 2021-06-06
