Program Comprehension: Present and Future
Author:
Affiliation:

Fund Project:

National Basic Research Program of China (973) (2015CB352201); National Natural Science Foundation of China (61620106007, 61751210)

    Abstract:

    Program comprehension is a key activity in software engineering and plays an important role in software development, maintenance, and reuse. Since the advent of software engineering, program comprehension has been a prominent research topic in the field. As software grows in complexity and becomes more pervasive, the demands on program comprehension have changed, and program self-understanding and self-awareness have gradually become new focuses. It is therefore desirable to re-examine the purposes, tasks, and techniques of program comprehension. This paper first discusses program comprehension from three perspectives: engineering, learning and cognition, and techniques. It then shows the degree of research attention through a literature analysis. Next, it discusses research progress from three aspects: the cognitive process, the methods and techniques, and the software engineering tasks. Finally, it discusses development trends and challenges.

Get Citation

Jin Z, Liu F, Li G. Program comprehension: Present and future. Journal of Software, 2019,30(1):110-126

History
  • Received: August 8, 2018
  • Revised: August 30, 2018
  • Online: November 22, 2018