Data Flow Analysis Method Based on Progressive Dynamic for Binary Programs
Authors: Pan Jiaye, Zhuang Yi, Sun Binglin

CLC Number: TP311

    Abstract:

    Binary program analysis is widely applied in software security testing and in malware analysis and detection. Dynamic analysis is an important technique because it accurately reflects the runtime state of a program. It faces two challenges, however: it imposes a heavy load on the target program while it runs, and it struggles to recover detailed data structure information. This study proposes a data flow analysis method for binary programs based on progressive expansion. Exploiting the capabilities of online data flow analysis, the method first performs fine-grained analysis on part of the program and then progressively expands the analysis range until it covers the entire program. This divide-and-conquer strategy reduces the runtime performance impact on the target program, so that delay-sensitive segments of the target code can still execute. The study also presents a correlation analysis method for function parameters based on memory reference relationships; it detects data flow propagation at the function call level and helps recover the internal data structures of parameters. Experiments on real-world programs indicate that the proposed method is feasible and effective: it reduces the performance impact on the target program without introducing significant additional analysis overhead, and it can be applied to binary program analysis in practice.
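
    The progressive expansion idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' algorithm; it only assumes that each round analyzes a limited region of the program in detail and then enlarges that region with the functions that were seen to receive tracked data. The call graph representation and the names analyze_region and progressive_expansion are hypothetical, introduced only for this illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy model: caller -> [(callee, passes_tracked_data)].
CallGraph = Dict[str, List[Tuple[str, bool]]]


def analyze_region(graph: CallGraph, region: Set[str]) -> Set[str]:
    """Stand-in for one fine-grained analysis run restricted to `region`.

    Returns the functions outside the region that were observed to
    receive tracked data from a function inside it.
    """
    frontier: Set[str] = set()
    for func in region:
        for callee, passes_data in graph.get(func, []):
            if passes_data and callee not in region:
                frontier.add(callee)
    return frontier


def progressive_expansion(graph: CallGraph, seed: Set[str]) -> List[Set[str]]:
    """Grow the analyzed region round by round until nothing new is reached.

    Returns the region that was analyzed in each round.
    """
    region = set(seed)
    rounds = [set(region)]
    while True:
        frontier = analyze_region(graph, region)
        if not frontier:
            return rounds
        region |= frontier
        rounds.append(set(region))


# Example: "log" never receives tracked data, so it is never analyzed in detail.
demo_graph: CallGraph = {
    "main": [("parse", True), ("log", False)],
    "parse": [("decode", True)],
    "decode": [],
    "log": [("decode", False)],
}
rounds = progressive_expansion(demo_graph, {"main"})
for number, analyzed in enumerate(rounds, start=1):
    print(number, sorted(analyzed))
```

    In this toy model only the functions reachable through tracked data are ever analyzed in detail, which is one way to picture how restricting each round to a partial region could keep the per-run load on the target program low.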

Get Citation

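Pan JY, Zhuang Y, Sun BL. Data flow analysis method based on progressive expansion for binary programs. Ruan Jian Xue Bao/Journal of Software, 2022, 33(9): 3249-3270 (in Chinese).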

History
  • Received: April 30, 2019
  • Revised: March 24, 2020
  • Online: July 15, 2022
  • Published: September 06, 2022
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4