Method of API Completion Based on Object Type

doi:10.13328/j.cnki.jos.006559

Journal of Software (软件学报), 2022, Vol. 33, No. 5, pp. 1736-1757

Authors: TANG Ze (唐泽), LI Chuan-Yi (李传艺), GE Ji-Dong (葛季栋), LUO Bin (骆斌)
Affiliation: State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, China
Corresponding author: LI Chuan-Yi, E-mail: lcy@nju.edu.cn
CLC number: TP311
Funding: National Natural Science Foundation of China (61802167, 61972197, 61802095); Natural Science Foundation of Jiangsu Province (BK20201250); sub-project of the cooperation agreement of the Huawei-Nanjing University Next Generation Programming Innovation Lab

Abstract:

In recent years, as software technology has spread into more industries and fields and as software architecture, services computing, and related technologies have matured, the software industry has produced a large number of feature-rich third-party APIs and libraries, and developers increasingly rely on them when implementing software functions. However, learning how to use these APIs is difficult and time-consuming, for two main reasons: 1) documentation is missing or wrong; 2) sample code showing API usage is scarce. Automatic API completion methods that help developers use APIs correctly and quickly therefore have great application value. Most existing API completion methods, however, treat the code segment to be completed as plain text and ignore the influence of the type of the object to which an API belongs. This study explores the role of object types in API completion and, inspired by the object state diagram, designs and implements a concrete completion method that uses the types of the objects to which APIs belong as a feature. Specifically, subsequences of the same object type are first extracted from the API call sequence, and a deep learning model encodes the state of each object. The object states are then used to generate a state representation of the entire method block, which is used for completion. To evaluate the proposed method, experiments are conducted on six popular Java projects. The experimental results show that the proposed API completion method achieves significantly higher prediction accuracy than the baseline approaches.

Key words: API completion; object type; plug-in
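
The first step named in the abstract, extracting subsequences of the same object type from an API call sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the `(object_type, api_name)` tuple format, the function name `split_by_object_type`, and the sample calls are all assumptions made for illustration, and the deep learning encoder that follows this step is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical input format (an assumption, not taken from the paper):
# each call is (object_type, api_name), listed in source order.
ApiCall = Tuple[str, str]


def split_by_object_type(calls: List[ApiCall]) -> Dict[str, List[str]]:
    """Group an API call sequence into one subsequence per object type.

    The relative order of calls inside each subsequence is preserved,
    so each subsequence reads like a trace of that type's usage.
    """
    groups: Dict[str, List[str]] = {}
    for object_type, api_name in calls:
        groups.setdefault(object_type, []).append(api_name)
    return groups


# Illustrative call sequence for a method body that reads a file.
calls = [
    ("File", "new"),
    ("FileReader", "new"),
    ("BufferedReader", "new"),
    ("BufferedReader", "readLine"),
    ("BufferedReader", "close"),
    ("FileReader", "close"),
]
subsequences = split_by_object_type(calls)
```

In a pipeline like the one the abstract outlines, each subsequence would then be passed to a sequence model to obtain a per-object state, and those states would be combined into a representation of the whole method block.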

Cite this article:

TANG Ze, LI Chuan-Yi, GE Ji-Dong, LUO Bin. Method of API Completion Based on Object Type. Journal of Software (软件学报), 2022, 33(5): 1736-1757

History:

Received: 2021-08-11
Last revised: 2021-10-09
Published online: 2022-01-28
Publication date: 2022-05-06
