基于污点分析的数组越界缺陷的静态检测方法
作者:
作者简介:

高凤娟(1991-),女,学士,主要研究领域为软件工程,程序分析,软件测试,软件安全.
王豫(1991-),男,学士,主要研究领域为软件工程,程序分析,软件测试,软件安全.
陈天骄(1992-),男,硕士,主要研究领域为软件工程.
司徒凌云(1988-),男,博士,助理研究员,CCF学生会员,主要研究领域为软件工程,信息安全,静态分析,模糊测试.
王林章(1973-),男,博士,教授,博士生导师,CCF杰出会员,主要研究领域为软件工程,软件测试,软件安全.
李宣东(1963-),男,博士,博士生导师,CCF会士,主要研究领域为复杂软件建模与分析,软件测试与验证.

通讯作者:

王林章,E-mail:lzwang@nju.edu.cn

基金项目:

国家重点研发计划(2017YFA0700604);南京大学优秀博士研究生创新能力提升计划B;江苏省研究生科研与实践创新计划


Static Checking of Array Index out of Bounds Defects in C Programs Based on Taint Analysis
Author:
Fund Project:

National Key Research and Development Program of China (2017YFA0700604); Program B for Outstanding PhD Candidate of Nanjing University; Postgraduate Research & Practice Innovation Program of Jiangsu Province of China

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [63]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    随着移动计算、物联网、云计算、人工智能等领域的飞速发展,也涌现出了很多新的编程语言和编译器,但是C/C++语言依旧是最受欢迎的编程语言之一,而数组是C语言最重要的数据结构之一.当在程序中通过数组下标访问数组元素时,必须确保该下标在该数组的边界之内,否则就会导致数组越界.程序中的数组越界缺陷会使得程序在运行时导致系统崩溃,甚至使攻击者可以截取控制流以执行任意恶意代码.当前针对数组越界的静态检查方法无法达到高精度的分析,尤其是无法处理复杂约束和表达式,过多的误报额外增加了开发者的负担.因此,提出了一种基于污点分析的数组越界的静态检测方法.首先,提出流敏感、上下文敏感的按需指针分析方法,实现数组长度区间分析.然后,提出按需污点分析方法,实现数组下标和数组长度污染情况的计算.最后,定义数组越界缺陷判定规则,提出使用后向数据流分析方法,检测数组下标是否越界.在进行数组越界检测的过程中,为了处理程序中的复杂约束和表达式,在分析过程中将调用约束求解器来判断约束的可满足性.如果没有发现相应的语句,则报告数组越界缺陷警报.同时,实现了自动静态分析工具Carraybound,并通过实验展示了方法的有效性.

    Abstract:

    During the rapid development of mobile computing, IoT, cloud computing, artificial intelligence, etc, many new programming languages and compilers are emerging. Even so, C/C++ language is still one of the most popular languages. And array is one of the most important data structures of C language. It is necessary to check whether the index is within the boundary of the array when using it to access the element of an array in a program. Otherwise, array index out-of-bounds will happen unexpectedly. When there are array index out-of-bounds defects existing in programs, some serious errors may occur during execution, such as system crash. It is even worse that array index out-of-bounds defects open the doors for attackers to take control of the server and execute arbitrary malicious code by carefully constructing input and intercepting the control flow of the programs. Existing static methods for array boundary checking cannot achieve high accuracy and deal with complex constraints and expressions, which lead to too many false positives. And it will increase the burden of developers. In this study, a static checking method is proposed based on taint analysis. First, a flow-sensitive, context-sensitive, and on-demand pointer analysis is proposed to analyze the range of array length. Then, an on-demand taint analysis is performed for all array indices and array length expressions. Finally, the rules are defined for checking array index out of bounds defects and the checking is realized based on backward data flow analysis. During the analysis, in order to deal with complex constraints and expressions, it is proposed to check the satisfiability of the conditions by invoking the constraint solver. If none statement for avoiding array index out-of-bound is found in the program, an array index out-of-bound warning will be reported. An automatic static analysis tool, Carray bound have been implemented, and the experimental results show that Carraybound can work effectively and efficiently.

    参考文献
    [1] CWE. Improper validation of array index. https://cwe.mitre.org/data/definitions/129.html
    [2] Cowan C, Pu C, Maier D, Walpole J, Bakke P, Beattie S, Grier A, Wagle P, Zhang Q, Hinton H. Stackguard:Automatic adaptive detection and prevention of buffer-overflow attacks. In:Proc. of the USENIX Security Symp. 1998,98:63-78.
    [3] CVE. http://www.cvedetails.com/vulnerabilities-by-types.php
    [4] Ye T, Zhang L, Wang L, Li X. An empirical study on detecting and fixing buffer overflow bugs. In:Proc. of the IEEE Int'l Conf. on Software Testing, Verification and Validation (ICST). IEEE, 2016. 91-101.
    [5] Gao F, Wang L, Li X. BovInspector:Automatic inspection and repair of buffer overflow vulnerabilities. In:Proc. of the 31st IEEE/ACM Int'l Conf. on Automated Software Engineering (ASE). IEEE, 2016. 786-791.
    [6] Bao T, Gao F, Zhou Y, Li Y, Wang L, Li X. Automatically validating static buffer overflow warnings based on guided symbolic execution. Journal of Cyber Security, 2016,(2):46-60(in Chinese with English abstract).
    [7] Wang L, Li F, Li L, Feng XB. Principle and practice of taint analysis. Ruan Jian Xue Bao/Journal of Software, 2017,28(4):860-882(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5190.htm[doi:10.13328/j.cnki.jos.005190]
    [8] Chimdyalwar B. Survey of array out of bound access checkers for C code. In:Proc. of the 5th India Software Engineering Conf. ACM, 2012. 45-48.
    [9] Ming J, Wu D, Xiao G, Wang J, Liu P. TaintPipe:Pipelined symbolic taint analysis. In:Proc. of the 24th {USENIX} Security Symp. ({USENIX} Security 15). 2015. 65-80.
    [10] Newsome J, Song DX. Dynamic taint analysis for automatic detection, analysis, and signature generation of exploits on commodity software. In:Proc. of the Network and Distributed System Security Symp. (NDSS). 2005,5:3-4.
    [11] Khedker U, Sanyal A, Sathe B. Data Flow Analysis:Theory and Practice. CRC Press, 2009.
    [12] Kildall GA. A unified approach to global program optimization. In:Proc. of the 1st Annual ACM SIGACT-SIGPLAN Symp. on Principles of Programming Languages. ACM, 1973. 194-206.
    [13] Galler SJ, Aichernig BK. Survey on test data generation tools. Int'l Journal on Software Tools for Technology Transfer, 2014,16(6):727-751.
    [14] De Moura L, Bjørner N. Z3:An efficient SMT solver. In:Proc. of the Int'l Conf. on Tools and Algorithms for the Construction and Analysis of Systems. Berlin, Heidelberg:Springer-Verlag, 2008. 337-340.
    [15] Z3 theorem prover. https://z3.codeplex.com/
    [16] Gao F, Chen T, Wang Y, Situ L, Wang L, Li X. Carraybound:Static array bounds checking in C programs based on taint analysis. In:Proc. of the 8th Asia-Pacific Symp. on Internetware. ACM, 2016. 81-90.
    [17] Zhou Y. Extensible framework for static vulnerability detection based on taint analysis[Ph.D. Thesis]. Nanjing:Nanjing University, 2017(in Chinese with English abstract).
    [18] Cppcheck. http://cppcheck.net/
    [19] Checkmarx. https://www.checkmarx.com/
    [20] Fortify static code analyzer. https://www.microfocus.com/en-us/products/static-code-analysis-sast/overview
    [21] Costa M, Crowcroft J, Castro M, Rowstron A, Zhou L, Zhang L, Barham P. Vigilante:End-to-end containment of Internet worms. ACM SIGOPS Operating Systems Review, 2005,39(5):133-147.
    [22] Crandall JR, Su Z, Wu SF, Chong FT. On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits. In:Proc. of the 12th ACM Conf. on Computer and Communications Security (CCS). ACM, 2005. 235-248.
    [23] Suh GE, Lee JW, Zhang D, Devadas S. Secure program execution via dynamic information flow tracking. ACM SIGPLAN Notices, 2004,39(11):85-96.
    [24] Wang X, Jhi YC, Zhu S, Liu P. Still:Exploit code detection via static taint and initialization analyses. In:Proc. of the 2008 Annual Computer Security Applications Conf. (ACSAC). IEEE, 2008. 289-298.
    [25] Ceara D, Mounier L, Potet ML. Taint dependency sequences:A characterization of insecure execution paths based on input-sensitive cause sequences. In:Proc. of the 3rd Int'l Conf. on Software Testing, Verification, and Validation Workshops. IEEE, 2010. 371-380.
    [26] Andersen LO. Program analysis and specialization for the C programming language[Ph.D. Thesis]. University of Cophenhagen, 1994.
    [27] Steensgaard B. Points-to analysis in almost linear time. In:Proc. of the 23rd ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages. ACM, 1996. 32-41.
    [28] Chen C, Huo W, Yu H, Feng X. A survey of optimization technology of inclusion-based pointer analysis. Jisuanji Xuebao/Chinese Journal of Computers, 2011,34(7):1224-1238(in Chinese with English abstract).
    [29] Pang L, Su X, Ma P, Zhao L. Research on flow sensitive demand driven alias analysis. Journal of Computer Research and Development, 2015,52(7):1620-1630(in Chinese with English abstract).
    [30] Emami M, Ghiya R, Hendren LJ. Context-sensitive interprocedural points-to analysis in the presence of function pointers. ACM SIGPLAN Notices, 1994,29(6):242-256.
    [31] Wilson RP, Lam MS. Efficient context-sensitive pointer analysis for C programs. ACM SIGPLAN Notices, 1995,30(6):1-12.
    [32] Chase DR, Wegman MN, Zadeck FK. Analysis of pointers and structures. ACM SIGPLAN Notices, 1990,39(4):343-359.
    [33] Xu Z, Miller BP, Reps T. Safety checking of machine code. ACM SIGPLAN Notices, 2000,35(5):70-82.
    [34] Detlefs DL, Leino KRM, Nelson G, Saxe JB. Extended Static Checking. 1998.[doi:10.1007/978-0-387-35358-6_1]
    [35] Leroy X, Rouaix F. Security properties of typed applets. In:Secure Internet Programming. Berlin, Heidelberg:Springer-Verlag, 1999. 147-182.
    [36] Kellogg M, Dort V, Millstein S, Ernst MD. Lightweight verification of array indexing. In:Proc. of the 27th ACM SIGSOFT Int'l Symp. on Software Testing and Analysis (ISSTA). ACM, 2018. 3-14.
    [37] Bodík R, Gupta R, Sarkar V. ABCD:Eliminating array bounds checks on demand. ACM SIGPLAN Notices, 2000,35(5):321-333.
    [38] Venet A, Brat G. Precise and efficient static array bound checking for large embedded C programs. ACM SIGPLAN Notices, 2004,39(6):231-242.
    [39] Nguyen TVN, Irigoin F. Efficient and effective array bound checking. ACM Trans. on Programming Languages and Systems (TOPLAS), 2005,27(3):527-570.
    [40] Popeea C, Xu DN, Chin WN. A practical and precise inference and specializer for array bound checks elimination. In:Proc. of the 2008 ACM SIGPLAN Symp. on Partial Evaluation and Semantics-based Program Manipulation. ACM, 2008. 177-187.
    [41] Wang W, Lei Y, Liu D, Kung D, Csallner C, Zhang D, Kacker R, Kuhn R. A combinatorial approach to detecting buffer overflow vulnerabilities. In:Proc. of the 41st IEEE/IFIP Int'l Conf. on Dependable Systems & Networks (DSN). IEEE, 2011. 269-278.
    [42] Dhurjati D, Adve V. Backwards-compatible array bounds checking for C with very low overhead. In:Proc. of the 28th Int'l Conf. on Software Engineering (ICSE). ACM, 2006. 162-171.
    [43] Loginov A, Yong SH, Horwitz S, Reps T. Debugging via run-time type checking. In:Proc. of the Int'l Conf. on Fundamental Approaches to Software Engineering. Berlin, Heidelberg:Springer-Verlag, 2001. 217-232.
    [44] Steffen JL. Adding run-time checking to the portable C compiler. Software:Practice and Experience, 1992,22(4):305-316.
    [45] Austin TM, Breach SE, Sohi GS. Efficient detection of all pointer and array access errors. In:Proc. of the ACM SIGPLAN 1994 Conf. on Programming Language Design and Implementation (PLDI). ACM, 1994. 290-301.
    [46] Hicks M, Morrisett G, Grossman D, Jim T. Experience with safe manual memory-management in cyclone. In:Proc. of the 4th Int'l Symp. on Memory Management. ACM, 2004. 73-84.
    [47] Wang Y, Gao F, Situ L, Wang L, Chen B, Liu Y, Zhao J, Li X. DangDone:Eliminating dangling pointers via intermediate pointers. In:Proc. of the 10th Asia-Pacific Symp. on Internetware. ACM, 2018. 6.
    [48] Abadi M, Budiu M, Erlingsson Ú, Ligatti J. Control-flow integrity principles, implementations, and applications. ACM Trans. on Information and System Security (TISSEC), 2009,13(1):4.
    [49] Sutton M, Greene A, Amini P. Fuzzing:Brute Force Vulnerability Discovery. Pearson Education, 2007.
    [50] Godefroid P, Kiezun A, Levin MY. Grammar-based whitebox fuzzing. ACM SIGPLAN Notices, 2008,43(6):206-215.
    [51] Godefroid P, Levin MY, Molnar DA. Automated Whitebox fuzz testing. In:Proc. of the Network and Distributed System Security Symp. (NDSS). 2008,8:151-166.
    [52] McNally R, Yiu K, Grove D, Gerhardy D. Fuzzing:The state of the art. Defence Science and Technology Organisation Edinburgh, 2012. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=15CF9A7FD272D62D76FB5ED26DA3808F?doi=10.1.1.461.4627&rep=rep1&type=pdf
    [53] Cadar C, Ganesh V, Pawlowski PM, Dill DL, Engler DR. EXE:Automatically generating inputs of death. ACM Trans. on Information and System Security (TISSEC), 2008,12(2):1-38.
    [54] Godefroid P, Klarlund N, Sen K. DART:Directed automated random testing. ACM SIGPLAN Notices, 2005,40(6):213-223.
    [55] Xu RG, Godefroid P, Majumdar R. Testing for buffer overflows with length abstraction. In:Proc. of the 2008 Int'l Symp. on Software Testing and Analysis (ISSTA). ACM, 2008. 27-38.
    [56] Stephens N, Grosen J, Salls C, Dutcher A, Wang R, Corbetta J, Shoshitaishvili Y, Kruegel C, Vigna G. Driller:Augmenting fuzzing through selective symbolic execution. In:Proc. of the Network and Distributed System Security Symp. (NDSS). 2016,16(2016):1-16.
    [57] Pak BS. Hybrid fuzz testing:Discovering software bugs via fuzzing and symbolic execution[Ph.D. Thesis]. School of Computer Science, Carnegie Mellon University, 2012.
    附中文参考文献:
    [6] 鲍铁匀,高凤娟,周严,李游,王林章,李宣东.基于目标制导符号执行的静态缓冲区溢出警报自动确认技术.信息安全学报,2016, (2):46-60.
    [7] 王蕾,李丰,李炼,冯晓兵.污点分析技术的原理和实践应用.软件学报,2017,28(4):860-882. http://www.jos.org.cn/1000-9825/5190.htm[doi:10.13328/j.cnki.jos.005190]
    [17] 周严.基于污点分析的静态漏洞检测可扩展框架[博士学位论文].南京:南京大学,2017.
    [28] 陈聪明,霍玮,于洪涛,冯晓兵.基于包含的指针分析优化技术综述.计算机学报,2011,34(7):1224-1238.
    [29] 逄龙,苏小红,马培军,赵玲玲.流敏感按需指针别名分析算法.计算机研究与发展,2015,52(7):1620-1630.
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

高凤娟,王豫,陈天骄,司徒凌云,王林章,李宣东.基于污点分析的数组越界缺陷的静态检测方法.软件学报,2020,31(10):2983-3003

复制
分享
文章指标
  • 点击次数:3980
  • 下载次数: 7364
  • HTML阅读次数: 3940
  • 引用次数: 0
历史
  • 收稿日期:2020-01-02
  • 最后修改日期:2020-04-04
  • 在线发布日期: 2020-06-11
  • 出版日期: 2020-10-06
文章二维码
您是第19528307位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号