Survey on Construction of Code Knowledge Graph and Intelligent Software Development

doi:10.13328/j.cnki.jos.005893

Ruan Jian Xue Bao/Journal of Software, 2020, Vol. 31, No. 1, pp. 47-66

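Authors:
WANG Fei, School of Computer Science, Wuhan University, Wuhan 430072, China
LIU Jing-Ping, School of Computer Science, Fudan University, Shanghai 201203, China
LIU Bin, School of Computer Science, Wuhan University, Wuhan 430072, China
QIAN Tie-Yun, School of Computer Science, Wuhan University, Wuhan 430072, China
XIAO Yang-Hua, School of Computer Science, Fudan University, Shanghai 201203, China
PENG Zhi-Yong, School of Computer Science, Wuhan University, Wuhan 430072, China

Author biographies: WANG Fei (1989-), male, born in Lianyungang, Jiangsu, Ph.D. candidate; research interests: knowledge graph, commonsense mining, recommender systems. LIU Jing-Ping (1991-), male, Ph.D. candidate; research interests: knowledge graph, commonsense mining, recommender systems. LIU Bin (1975-), male, Ph.D., lecturer, CCF professional member; research interests: complex data management, data mining. QIAN Tie-Yun (1970-), female, Ph.D., professor, doctoral supervisor, CCF professional member; research interests: Web mining, data management, natural language processing. XIAO Yang-Hua (1980-), male, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: big data management and mining, graph databases, knowledge graph. PENG Zhi-Yong (1963-), male, Ph.D., professor, doctoral supervisor, CCF fellow; research interests: complex data management, trusted data management, Web data management.
Corresponding authors: QIAN Tie-Yun, E-mail: qty@whu.edu.cn; XIAO Yang-Hua, E-mail: shawyh@fudan.edu.cn; PENG Zhi-Yong, E-mail: peng@whu.edu.cn
CLC number: TP311
Funding: National Key Research and Development Program of China (2018YFB1003400); National Natural Science Foundation of China (61572376); Fundamental Research Funds for the Central Universities (2042019k10278)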

Abstract:

Intelligent software development is shifting from simple code retrieval to semantically empowered automatic code generation. Traditional semantic representations cannot effectively support semantic interaction among people, machines, and code, so a machine-understandable semantic representation mechanism is urgently needed. This paper first points out that the code knowledge graph is the foundation for intelligent software development, and then analyzes the new features of intelligent software development in the era of big data and the new challenges of building it on a code knowledge graph. Next, the research progress in both intelligent software development and code knowledge graphs is reviewed. The review notes that current research on intelligent software development is still at a preliminary stage, and that existing knowledge graph studies mainly target open-domain knowledge graphs and cannot be applied directly to the code domain. Therefore, new research trends for code knowledge graphs are discussed in detail from five aspects, namely modeling and representation, construction and refinement, storage and evolution management, query semantic understanding, and intelligent applications, in order to better meet the needs of intelligent software development based on code knowledge graphs.

Key words: intelligent software development; knowledge graph; big code

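Cite this article:

WANG Fei, LIU Jing-Ping, LIU Bin, QIAN Tie-Yun, XIAO Yang-Hua, PENG Zhi-Yong. Survey on construction of code knowledge graph and intelligent software development. Ruan Jian Xue Bao/Journal of Software, 2020,31(1):47-66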

History:

Received: 2019-01-14
Last revised: 2019-06-24
Published online: 2019-11-07
Published: 2020-01-06
