Empirical Study on Pull-request Revisions in Open Source Software Community of TensorFlow

doi:10.13328/j.cnki.jos.006873

Journal of Software (软件学报), 2023, Vol. 34, No. 9, pp. 4056-4068.

Authors: LI Zhi-Xing, YU Yue, WANG Tao, CAI Meng-Luan, WANG Huai-Min
Affiliation: College of Computer, National University of Defense Technology, Changsha 410073, China
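Author biographies: LI Zhi-Xing (1992-), male, Ph.D., assistant research fellow; research interests: crowd collaboration, open source software, and empirical software engineering. YU Yue (1988-), male, Ph.D., associate research fellow, CCF senior member; research interests: empirical software engineering, crowd-based development, and open source software ecosystems. WANG Tao (1984-), male, Ph.D., associate research fellow, CCF senior member; research interests: mining software repositories, crowd-intelligence-based software engineering, and open source software ecosystems. CAI Meng-Luan (1998-), male, master's student, CCF student member; research interests: intelligent software engineering and data mining. WANG Huai-Min (1962-), male, Ph.D., professor, CCF fellow; research interests: trusted computing, crowd intelligence, and distributed computing.
Corresponding author: YU Yue, E-mail: yuyue@nudt.edu.cn
Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0112900); National Natural Science Foundation of China (62141209)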


Abstract:

The recent boom in artificial intelligence (AI) benefits from the open collaboration of the open source software (OSS) community, and a growing number of developers contribute to AI projects by submitting pull requests (PRs). However, the quality of PRs submitted by external contributors varies, so project management teams have to review PRs and ask contributors to revise them according to the review feedback. Because the revision process affects the review efficiency and acceptance of PRs, and ultimately the quality of AI open source software, a more comprehensive and in-depth empirical understanding of PR revisions is needed. This study is based on a set of PRs and their revision histories collected from the TensorFlow project. First, commit messages and review comments from a sample of PRs are analyzed qualitatively to construct a taxonomy of revision types. Then, a large set of PR revisions is manually labeled according to this taxonomy, and the labeled dataset is used to analyze quantitatively the frequency distribution, the order distribution, and the correlations among different revision types. The findings show that PRs in the TensorFlow community undergo 11 types of revisions, which fall into three categories. Evolvability revisions occur most frequently, and functional revisions are more likely than evolvability revisions and other revisions to occur in the early updates of a PR. Structure-related revisions are more likely to co-occur with, or occur adjacent to, other types of revisions, while configuration-related revisions and rebasing revisions are more likely to appear in succession. These results can help AI open source practitioners and researchers better understand the PR revision process, and in particular can help guide PR review and revision behavior and improve the efficiency of open source collaboration.

Key words: artificial intelligence (AI); open source software (OSS); pull-request (PR); code contribution; code review; code revision
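The quantitative analyses named in the abstract (frequency, order, co-occurrence, and adjacent occurrence of revision types) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' tooling: the data layout (each PR as an ordered list of updates, each update as a set of labels), the toy labels, the function names, and the reading of "co-occur" as "labeled on the same update" and "adjacent" as "labeled on consecutive updates".

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each PR is an ordered list of updates, and each
# update is the set of revision types an annotator assigned to it.
prs = [
    [{"structure", "configuration"}, {"configuration"}, {"rebase"}],
    [{"structure"}, {"rebase"}, {"rebase"}],
]

def type_frequency(prs):
    """Count the number of updates in which each revision type appears."""
    return Counter(t for pr in prs for update in pr for t in update)

def co_occurrence(prs):
    """Count unordered pairs of distinct types labeled on the same update."""
    pairs = Counter()
    for pr in prs:
        for update in pr:
            pairs.update(combinations(sorted(update), 2))
    return pairs

def adjacent_occurrence(prs):
    """Count ordered (earlier, later) type pairs over consecutive updates."""
    pairs = Counter()
    for pr in prs:
        for earlier, later in zip(pr, pr[1:]):
            pairs.update((a, b) for a in earlier for b in later)
    return pairs

def mean_relative_position(prs):
    """Average position of each type within its PR, scaled to [0, 1]."""
    positions = defaultdict(list)
    for pr in prs:
        last = len(pr) - 1
        for i, update in enumerate(pr):
            rel = i / last if last > 0 else 0.0
            for t in update:
                positions[t].append(rel)
    return {t: sum(v) / len(v) for t, v in positions.items()}

if __name__ == "__main__":
    print(type_frequency(prs))
    print(co_occurrence(prs))
    print(adjacent_occurrence(prs))
    print(mean_relative_position(prs))
```

Under these assumptions, a low mean relative position would indicate a type that tends to appear in early updates, and a high count for a pair such as ("rebase", "rebase") in the adjacency table would indicate a type that tends to appear in succession.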

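Cite this article

LI Zhi-Xing, YU Yue, WANG Tao, CAI Meng-Luan, WANG Huai-Min. Empirical Study on Pull-request Revisions in Open Source Software Community of TensorFlow. Journal of Software (软件学报), 2023, 34(9): 4056-4068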

History

Received: 2022-09-04
Revised: 2022-10-13
Published online: 2023-01-13
Published: 2023-09-06
