Multi-Feature Based Personal Recommendation Approach for Open Source Project
Authors: YANG Cheng, FAN Qiang, WANG Tao, YIN Gang, WANG Huaimin
Fund Project: National Natural Science Foundation of China (61432020, 61472430, 61502512); National Key Research and Development Program (2016YFB1000805)

    Abstract:

    With the deep integration of collaborative software development and social networking, social coding has become an important paradigm for software creation and production. The flexibility and openness of this development model attract a large number of peripheral contributors to open source communities, where they form a substantial productive force. These contributors are widely distributed and collaborate in a loosely organized way, so each of them must spend considerable time and effort finding projects of genuine interest among a massive number of open source projects before making long-term contributions. To improve the efficiency of large-scale crowd collaboration, this paper proposes RepoLike, a multi-feature based approach for personalized recommendation of open source projects. RepoLike measures the correlation between a developer and a project along three dimensions: the popularity of the project itself, the technical relatedness among associated projects, and the social connections among contributors. It then builds the recommendation model using a linear combination and learning to rank. Large-scale experiments show that RepoLike achieves a hit ratio of over 25% when recommending 20 candidate projects, which indicates that it can provide valuable recommendations to developers.
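
    The scoring idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The weights, feature values, project names, and function names below are all hypothetical, and the hit-ratio definition (a developer counts as a hit if at least one relevant project appears in the recommended list) is one common convention assumed here, since the abstract does not define it. In the paper the combination is tuned with learning to rank; this sketch simply uses fixed weights.

```python
from typing import Dict, List, Set

# Hypothetical weights for the three feature dimensions named in the abstract.
# The abstract gives no values; a learning-to-rank step would normally fit them.
WEIGHTS: Dict[str, float] = {"popularity": 0.2, "technical": 0.5, "social": 0.3}


def score(features: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Linear combination of per-dimension scores for one (developer, project) pair."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


def recommend(candidates: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Return the k candidate project names with the highest combined score."""
    ranked = sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
    return ranked[:k]


def hit_ratio(recommended: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of developers whose list contains at least one relevant project
    (an assumed definition, not taken from the source)."""
    if not recommended:
        return 0.0
    hits = sum(
        1 for dev, projects in recommended.items()
        if relevant.get(dev, set()) & set(projects)
    )
    return hits / len(recommended)


# Made-up feature values for three made-up projects, all scaled to [0, 1].
candidates = {
    "repo_a": {"popularity": 0.9, "technical": 0.1, "social": 0.0},
    "repo_b": {"popularity": 0.3, "technical": 0.8, "social": 0.6},
    "repo_c": {"popularity": 0.5, "technical": 0.4, "social": 0.2},
}
top2 = recommend(candidates, k=2)
```

    With these made-up numbers, a project that is only popular (`repo_a`) ranks below projects that are technically and socially closer to the developer, which is the behavior a personalized recommender is meant to capture.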

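Cite this article

Yang C, Fan Q, Wang T, Yin G, Wang HM. Multi-feature based personal recommendation approach for open source project. Journal of Software, 2017, 28(6):1357-1372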

History
  • Received: 2016-10-17
  • Last revised: 2016-10-26
  • Published online: 2017-02-21