面向顺序存储结构的数据流分析
作者:
作者简介:

王淑栋(1973-),女,山东青岛人,博士,教授,博士生导师,主要研究领域为生物计算,软件工程;尹文静(1995-),女,硕士生,CCF学生会员,主要研究领域为数据流分析,缺陷检测;董玉坤(1981-),男,博士,讲师,CCF专业会员,主要研究领域为缺陷检测,程序自动修复;张莉(1994-),女,硕士生,CCF学生会员,主要研究领域为程序自动修复,软件工程;刘浩(1992-),男,硕士生,主要研究领域为缺陷警报关联,数据流分析.

通讯作者:

董玉坤,E-mail:dongyk@upc.edu.cn

基金项目:

中央高校基本科研业务费专项资金(19CX02028A);国家自然科学基金(61873281)


Data Flow Analysis for Sequential Storage Structures
Author:
Fund Project:

Fundamental Research Funds for the Central Universities (19CX02028A); National Natural Science Foundation of China (61873281)

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [37]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    C程序中数组、malloc动态分配后的连续内存等顺序存储结构被大量使用,但大多数传统的数据流分析方法未能充分描述其结构及其上的操作,特别是在利用指针访问顺序存储结构时,传统的分析方法只关注了指针的指向关系,而未讨论指针可能发生偏移的数值信息,且未考虑发生偏移时可能存在越界的不安全问题,导致了对顺序存储结构分析不精确.针对以上不足,首先对顺序存储结构进行抽象建模,并对顺序存储结构与指针结合使用时的指向关系与偏移量进行有效表示,建立了用于顺序存储结构的抽象内存模型SeqMM;其次,归纳总结C程序中顺序存储结构涉及的指针相关迁移操作、谓词操作及遍历顺序存储结构的循环操作,提出了安全范围判别保证操作安全性;之后,针对函数调用时形参指针引用顺序存储结构与实参的映射过程进行过程间推导规则设计;最后,基于上述分析,提出了一种内存泄漏缺陷检测算法,对5个开源C工程的内存泄漏缺陷进行检测.实验结果表明,所提出的SeqMM能够有效地刻画C程序中的顺序存储结构及其涉及的各种操作,其数据流分析结果能够用于内存泄漏的检测工作,同时在效率和精度之间取得合理的权衡.

    Abstract:

    Sequential storage structures such as array and continuous memory block allocated dynamically by malloc are widely used in C programs. But traditional data flow analysis fails to adequately describe their structures and operations. When pointers are used to access the sequential storage structures in C programs, existing data flow analysis methods basically pay attention to only points-to information and do not consider the numerical properties offset. In addition, it does not consider the unsafe problem caused by out of bounds when offset occurs, which leads to inaccurate analysis for sequential storage structure. To improve the precision for analyzing sequential storage structures, an abstract memory model SeqMM is proposed to describe sequential storage structures, which can effectively describe points-to relationships and offset. Then there are three operations are summarized, such as the pointer-related transfer operation, predicate operation, and loop operation traversing sequential storage structures, and it is also considered that whether the index is out of bounds to ensure the security of operation execution when analyzing these operations. After that, mapping rules are introduced for parameters referencing sequential storage structure to corresponding arguments. Finally, a memory leak detection algorithm is proposed to detect memory leak in 5 open-source projects. The experimental results indicate that SeqMM can effectively describe sequential storage structure and various operations in C programs, and the results of data flow analysis can be used to detect memory leaks when a reasonable balance between accuracy and efficiency occurs.

    参考文献
    [1] Dong YK, Jin DH, Gong YZ, Xing Y. Static analysis of C programs via region-based memory model. Ruan Jian Xue Bao/Journal of Software, 2014,25(2):357-372(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4532.htm [doi: 10.13328/ j.cnki.jos.004532]
    [2] James CK. Symbolic execution and program testing. Communications of the ACM, 1976,19(7):385-394.
    [3] Xu ZX, Kremenek T, Zhang J. A memory model for static analysis of C programs. In: Proc. of the Int’l Conf. on Leveraging Applications of Formal Methods. 2010. 535-548.
    [4] Zhang J. Symbolic execution of program paths involving pointer structure variables. In: Proc. of the Int’l Conf. on Quality Software. 2004. 87-92.
    [5] Hackett B, Rugina R. Region-based shape analysis with tracked locations. ACM SIGPLAN Notices, 2005,40(1):310-323. [doi: 10.1145/1040305.1040331]
    [6] Dong LM, Wang J, Chen LQ, Liu JC. Field-sensitive memory model for memory safety of heap-manipulating programs. Computer Science, 2012,39(9):109-114(in Chinese with English abstract). [doi: CNKI:SUN:JSJA.0.2012-09-028]
    [7] Zhao YS, Wang YW, Gong YZ, Chen HH, Xiao Q, Yang ZH. STVL: Improve the precision of static defect detection with symbolic three-values logic. In: Proc. of the 18th Asia Pacific Software Engineering Conf. 2011. 179-186. [doi: 10.1109/APSEC.2011.23]
    [8] Yin BH, Chen LQ, Wang J. Analysis of program with pointer arithmetic by combining points to and numer. Computer Science, 2015,42(7):32-37(in Chinese with English abstract). [doi: 10.11896/j.issn.1002-137X.2015.7.008]
    [9] Steensgaard B. Points-to analysis in almost linear time. In: Proc. of the ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages. 1996. 32-41. [doi: 10.1145/237721.237727]
    [10] Andersen LO. Program analysis and specialization for the C programming language. University of Cophenhagen, 1994.
    [11] Hackett B, Aiken A. How is aliasing used in systems software? In: Proc. of the 14th ACM SIGSOFT Int’l Symp. on Foundations of Software Engineering. 2006. 69-80. [doi: 10.1145/1181775.1181785]
    [12] Kang HG, Han T. A bottom-up pointer analysis using the update history. Information and Software Technology, 2009,51(4): 691-707. [doi: 10.1016/j.infsof.2008.11.003]
    [13] Yu HT, Xue JL, Huo W, Feng XB, Zhang ZQ. Level by level: Making flow- and context-sensitive pointer analysis scalable for millions of lines of code. In: Proc. of the 8th Int’l Symp. on Code Generation and Optimization. 2010. 218-229. [doi: 10.1145/ 1772954.1772985]
    [14] De A, D’Souza D. Scalable flow-sensitive pointer analysis for java with strong updates. In: Proc. of the 26th European Conf. on Object-oriented Programming. 2012. 665-687. [doi: 10.1007/978-3-642-31057-7-29]
    [15] Feng Y, Wang XY, Dillig I, Dillig T. Bottom-up context-sensitive pointer analysis for Java. In: Proc. of the 13th Asian Symp.on Programming Languages and Systems. 2015. 465-484. [doi: 10.1007/978-3-319-26529-2_25]
    [16] Sui YL, Xue JL. On-demand strong update analysis via value-flow refinement. In: Proc. of the 24th ACM SIGSOFT Int’l Symp. on Foundations of Software Engineering. 2016. 460-473. [doi: 10.1145/2950290.2950296]
    [17] Bush WR, Pincus JD, Sielaff DJ. A static analyzer for finding dynamic programming errors. Software Practice and Experience, 2001,30(7):775-802. [doi: 10.1002/(SICI)1097-024X(200006)30:7<775::AID-SPE309>3.0.CO]
    [18] Dillig I, Dillig T, Aiken A. Sound, complete and scalable path-sensitive analysis. ACM SIGPLAN Notices, 2008,43(6):270-280. [doi: 10.1145/1379022.1375615]
    [19] Gutzmann T, Lundberg J, Lowe W. Towards path-sensitive points-to analysis. In: Proc. of the 7th IEEE Int’l Working Conf. on Source Code Analysis and Manipulation. 2007. 59-68. [doi: 10.1109/SCAM.2007.26]
    [20] Sui YL, Ye S, Xue JL, Yew PC. SPAS: Scalable path-sensitive pointer analysis on full-sparse SSA. In: Proc. of the 9th Asian Symp. on Programming Languages and Systems. 2011. 155-171. [doi: 10.1007/978-3-642-25318-8_14]
    [21] Pathade K, Khedker UP. Computing partially path-sensitive MFP solution in data flow analysis. In: Proc. of the 27th Int’l Conf. on Compiler Construction. 2018. 37-47. [doi: 10.1145/3178372.3179497]
    [22] Litvak S, Dor N, Bodik R, Rinetzky N, Sagiv M. Field-sensitive program dependence analysis. In: Proc. of the 18th ACM SIGSOFT Int’l Symp. on Foundations of Software Engineering. 2010. 287-296. [doi: 10.1145/1882291.1882334]
    [23] Cousot P, Cousot R. Comparing the Galois connection and widening/narrowing approaches to abstract interpretation. In: Proc. of the 4th Int’l Symp. on Programming Language Implementation and Logic Programming. 1992. 269-296. [doi: 10.1007/3-540- 55844-6_142]
    [24] Šor V. Statistical approach for memory leak dectection in Java applications [Ph.D. Thesis]. Tartu: University of Tartu, 2015.
    [25] Šor V, Oü P, Treier T, Srirama SN. Improving statistical approach for memory leak detection using machine learning. In: Proc. of the 29th IEEE Int’l Conf. on Software Maintenance. 2013. 544-547. [doi: 10.1109/ICSM.2013.92]
    [26] Šor V, Srirama SN, Salnikov-Tarnovski N. Memory leak detection in Plumbr. Software: Practice and Experience, 2015,45(10): 1307-1330. [doi: 10.1002/spe.2275]
    [27] Andrzejak A, Eichler F, Ghanavati M. Defect of memory leaks in C/C++ code via machine learning. In: Proc. of the 28th IEEE Int’l Symp. on Software Reliability Engineering Workshops. 2017. 252-258. [doi: 10.1109/ISSREW.2017.72]
    [28] Zhu YW, Zuo ZQ, Wang LZ, Li XD. Memory leak intelligent detection method for C programs. Ruan Jian Xue Bao/Journal of Software, 2019,30(5):1330-1341(in Chainexe with English abstract). http://www.jos.org.cn/1000-9825/5715.htm [doi: 10.13328/j. cnki.jos.005715]
    [29] Li X, Zhou Y, Li MC, Chen YJ, Xu GQ, Wang LZ, Li XD. Automatically validating static memory leak warnings for C/C++ programs. Ruan Jian Xue Bao/Journal of Software, 2017,28(4):827-844(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5189htm [doi: 10.13328/j.cnki.jos.005189]
    [30] Yang ZH, Gong YZ, Xiao Q, Wang YW. A defect model based testing system. Journal of Beijing University of Posts and Telecommunications, 2008,31(5):1-4(in Chinese with English abstract).
    附中文参考文献:
    [1] 董玉坤,金大海,宫云战,邢颖.基于区域内存模型的C程序静态分析.软件学报,2014,25(2):357-372. http://www.jos.org.cn/1000-9825/4532.htm [doi: 10.13328/j.cnki.jos.004532]
    [6] 董龙明,王戟,陈立前,刘江潮.一种面向堆操作程序内存安全性的域敏感内存模型.计算机科学,2012,39(9):109-114. [doi: CNKI: SUN:JSJA.0.2012-09-028]
    [8] 尹帮虎,陈立前,王戟.基于指向与数值抽象的带指针算术程序的分析方法.计算机科学,2015,42(7):32-37. [doi: 10.11896/j.issn. 1002-137X.2015.7.008]
    [28] 朱亚伟,左志强,王林章,李宣东.C程序内存泄漏智能化检测方法.软件学报,2019,30(5):1330-1341. http://www.jos.org.cn/1000-9825/5715.htm [doi: 10.13328/j.cnki.jos.005715]
    [29] 李筱,周严,李孟宸,陈园军,Xu GQ,王林章,李宣东.C/C++程序静态内存泄漏警报自动确认方法.软件学报,2017,28(4): 827-844. http://www.jos.org.cn/1000-9825/5189.htm [doi: 10.13328/j.cnki.jos.005189]
    [30] 杨朝红,宫云战,肖庆,王雅文.基于软件缺陷模型的测试系统.北京邮电大学学报,2008,31(5):1-4.
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

王淑栋,尹文静,董玉坤,张莉,刘浩.面向顺序存储结构的数据流分析.软件学报,2020,31(5):1276-1293

复制
分享
文章指标
  • 点击次数:2586
  • 下载次数: 6070
  • HTML阅读次数: 3262
  • 引用次数: 0
历史
  • 收稿日期:2019-08-31
  • 最后修改日期:2019-10-24
  • 在线发布日期: 2020-04-09
  • 出版日期: 2020-05-06
文章二维码
您是第19921105位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号