Analysis and Testing of Indirect Jump Table Solving Algorithms in Disassembly Tools
Author:
Affiliation:

CLC Number: TP311

    Abstract:

    Disassembling binary code is difficult but necessary for improving the security of binary software. A major source of this difficulty is that compilers, for efficiency, emit many indirect jump tables in binary code. Mainstream disassembly tools use various strategies to resolve the targets of these indirect jump tables, but the implementation details and effectiveness of these strategies have not been well studied. To help researchers better understand how disassembly tools implement these algorithms and how well they perform, this study first systematically summarizes the strategies that disassembly tools use to resolve indirect jump tables. It then builds an automated framework for testing indirect jump table resolution, with which a large-scale test suite (2,410,455 jump tables) can be generated. Finally, it evaluates the tools' performance in resolving indirect jump tables on this test suite and manually analyzes the errors introduced by each strategy. In addition, drawing on this systematic summary of the algorithm implementations, the study finds six bugs in the implementations of the disassembly tools.
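
    As a rough illustration of what resolving the targets of an indirect jump table involves, the sketch below assumes one common layout: a table of signed 32-bit little-endian offsets, each relative to the table's own address, as GCC and Clang often emit for position-independent x86-64 code. The function names, the addresses, and the precision/recall comparison against ground truth are hypothetical choices made for this example. They are not the paper's framework or the algorithm of any particular disassembly tool.

```python
import struct
from typing import Iterable, List, Tuple


def resolve_jump_table(data: bytes, data_base: int,
                       table_addr: int, num_entries: int) -> List[int]:
    """Resolve targets of a table of signed 32-bit offsets relative to table_addr.

    `data` is the raw content of the section holding the table and
    `data_base` is the virtual address of its first byte (both assumed known).
    """
    start = table_addr - data_base
    targets = []
    for i in range(num_entries):
        # Each entry is a little-endian signed 32-bit offset from the table base.
        (offset,) = struct.unpack_from("<i", data, start + 4 * i)
        targets.append(table_addr + offset)
    return targets


def precision_recall(found: Iterable[int],
                     truth: Iterable[int]) -> Tuple[float, float]:
    """Compare resolved targets with ground-truth targets (assumed metric)."""
    found_set, truth_set = set(found), set(truth)
    true_positives = len(found_set & truth_set)
    precision = true_positives / len(found_set) if found_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Hypothetical table at 0x2000 whose three entries point at code near 0x1000.
table_addr = 0x2000
section = struct.pack("<3i", 0x1010 - table_addr,
                      0x1020 - table_addr,
                      0x1030 - table_addr)
targets = resolve_jump_table(section, data_base=0x2000,
                             table_addr=table_addr, num_entries=3)
print([hex(t) for t in targets])
```

    The hard part in practice is what this sketch takes as given: locating the table, recovering its base address, and bounding the number of entries. Reading too few entries lowers recall, and reading past the end of the table lowers precision.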

Get Citation

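Pang CB, Xu XL, Zhang TT, Mao B. Analysis and testing of indirect jump table solving algorithms in disassembly tools. Ruan Jian Xue Bao/Journal of Software, 2024, 35(10): 4623–4641 (in Chinese with English abstract).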

History
  • Received: November 30, 2022
  • Revised: February 2, 2023
  • Online: October 18, 2023
  • Published: October 6, 2024