DDoop: Incremental Pointer Analysis Framework Based on Differential Datalog Evaluation
Author:
Affiliation:

  • Article
  • | |
  • Metrics
  • |
  • Reference [49]
  • |
  • Related [20]
  • | | |
  • Comments
    Abstract:

    Pointer analysis is a core and fundamental technology for software compiler optimization and bug detection. Existing classic pointer analysis frameworks such as Doop will transform the programs to be analyzed and analysis algorithms into Datalog evaluation problems like too large program size and solve them. As a result, the analysis time overhead of a single solution can be high, and the program analysis overhead can hardly be afforded especially in situations where programs are frequently changed and released. In recent years, as a technology that effectively reemploys existing analysis results and improves analysis efficiency under frequent code changes, incremental analysis has caught increasing attention. However, since current incremental pointer analysis techniques are often designed for specific algorithms, the supported pointer analysis options are limited and their usability is significantly restricted. To this end, this study designs and implements Differential Doop (DDoop), an incremental pointer analysis framework based on Differential Datalog evaluation. DDoop implements incremental input fact generation and automatic rewriting for incremental analysis rules, expressing incremental analysis problems of multi-version programs as Differential Datalog evaluation problems. Finally, a mature Differential Datalog solution engine like DDlog can be fully utilized to achieve end-to-end incremental pointer analysis, maximizing compatibility and reuse of existing pointer analysis implementations in Doop and providing transparent support for incrementalization. Additionally, experimental evaluation of DDoop is conducted on widely adopted real-world programs. The results show that compared to the non-incremental Doop framework, DDoop has a significant performance advantage while highly compatible with a variety of pointer analysis rules existing in Doop.

    Reference
    [1] Cai YD, Yao PS, Zhang C. Canary: Practical static detection of inter-thread value-flow bugs. In: Proc. of the 42nd ACM SIGPLAN Int’l Conf. on Programming Language Design and Implementation. New York: ACM, 2021. 1126–1140.
    [2] Grech N, Smaragdakis Y. P/Taint: Unified points-to and taint analysis. Proc. of the ACM on Programming Languages, 2017, 1(OOPSLA): 102.
    [3] Pradel M, Jaspan C, Aldrich J, Gross TR. Statically checking API protocol conformance with mined multi-object specifications. In: Proc. of the 34th Int’l Conf. on Software Engineering (ICSE). Zurich: IEEE, 2012. 925–935.
    [4] Zhang QR, Lyu MR, Yuan H, Su ZD. Fast algorithms for Dyck-CFL-reachability with applications to alias analysis. In: Proc. of the 34th ACM SIGPLAN Conf. on Programming Language Design and Implementation. New York: ACM, 2013. 435–446.
    [5] Smaragdakis Y, Balatsouras G. Pointer analysis. Foundations and Trends® in Programming Languages, 2015, 2(1): 1–69.
    [6] Tan T, Li Y, Xue JL. Efficient and precise points-to analysis: Modeling the heap by merging equivalent automata. In: Proc. of the 38th ACM SIGPLAN Conf. on Programming Language Design and Implementation. Barcelona: ACM, 2017. 278–291.
    [7] Li Y, Tan T, Møller A, Smaragdakis Y. Scalability-first pointer analysis with self-tuning context-sensitivity. In: Proc. of the 26th ACM Joint Meeting on European Software Engineering Conf. and the Symp. on the Foundations of Software Engineering. Lake Buena Vista: ACM, 2018. 129–140.
    [8] 刘博涵, 张贺, 董黎明. DevOps中国调查研究. 软件学报, 2019, 30(10): 3206–3226. http://www.jos.org.cn/1000-9825/5796.htm
    Liu BH, Zhang H, Dong LM. Survey on state of DevOps in China. Ruan Jian Xue Bao/Journal of Software, 2019, 30(10): 3206–3226 (in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5796.htm
    [9] Liu BZ, Huang J, Rauchwerger L. Rethinking incremental and parallel pointer analysis. ACM Trans. on Programming Languages and Systems, 2019, 41(1): 6.
    [10] Liu BZ, Huang J. SHARP: Fast incremental context-sensitive pointer analysis for Java. Proc. of the ACM on Programming Languages, 2022, 6(OOPSLA1): 88.
    [11] Bravenboer M, Smaragdakis Y. Strictly declarative specification of sophisticated points-to analyses. In: Proc. of the 24th ACM SIGPLAN Conf. on Object Oriented Programming Systems Languages and Applications. Orlando: ACM, 2009. 243–262. [doi: 10.1145/1640089.1640108]
    [12] Li Y, Tan T, Møller A, Smaragdakis Y. Precision-guided context sensitivity for pointer analysis. Proc. of the ACM on Programming Languages, 2018, 2(OOPSLA): 1–29.
    [13] Ma WJ, Yang SY, Tan T, Ma XX, Xu C, Li Y. Context sensitivity without contexts: A cut-shortcut approach to fast and precise pointer analysis. Proc. of the ACM on Programming Languages, 2023, 7(PLDI): 128.
    [14] Ryzhyk L, Budiu M. Differential Datalog. In: Proc. of the 3rd Int’l Workshop on the Resurgence of Datalog in Academia and Industry Co-located with the 15th Int’l Conf. on Logic Programming and Nonmonotonic Reasoning. Philadelphia: CEUR-WS.org, 2019. 56–67.
    [15] Jordan H, Scholz B, Subotić P. Soufflé: On synthesis of program analyzers. In: Proc. of the 28th Int’l Conf. on Computer Aided Verification. Toronto: Springer, 2016. 422–430. [doi: 10.1007/978-3-319-41540-6_23]
    [16] Zhao D, Subotic P, Raghothaman M, Scholz B. Towards elastic incrementalization for datalog. In: Proc. of the 23rd Int’l Symp. on Principles and Practice of Declarative Programming. New York: ACM, 2021. 20. [doi: 10.1145/3479394.3479415]
    [17] Ritsogianni AA. Incremental static analysis with differential Datalog [BS. Thesis]. Athens: University of Athens, 2019.
    [18] 谭添, 马晓星, 许畅, 马春燕, 李樾. Java指针分析综述. 计算机研究与发展, 2023, 60(2): 274–293.
    Tan T, Ma XX, Xu C, Ma CY, Li Y. Survey on Java pointer analysis. Journal of Computer Research and Development, 2023, 60(2): 274–293 (in Chinese with English abstract).
    [19] Whaley J, Avots D, Carbin M, Lam MS. Using Datalog with binary decision diagrams for program analysis. In: Proc. of the 3rd Asian Symp. on Programming Languages and Systems. Tsukuba: Springer, 2005. 97–118. [doi: 10.1007/11575467_8]
    [20] Antoniadis T, Triantafyllou K, Smaragdakis Y. Porting doop to Soufflé: A tale of inter-engine portability for Datalog-based analyses. In: Proc. of the 6th ACM SIGPLAN Int’l Workshop on State of the Art in Program Analysis. Barcelona: ACM, 2017. 25–30. [doi: 10.1145/3088515.3088522]
    [21] Use of Zipper in Doop. 2021. https://bitbucket.org/yanniss/doop/issues/41/use-of-zipper
    [22] Vallée-Rai R, Co P, Gagnon E, Hendren L, Lam P, Sundaresan V. Soot—A Java bytecode optimization framework. In: Proc. of the 1999 Conf. of the Centre for Advanced Studies on Collaborative research. Mississauga: IBM Press, 1999.
    [23] Lhoták O, Hendren L. Scaling Java points-to analysis using SPARK. In: Proc. of the 12th Int’l Conf. on Compiler Construction. Warsaw: Springer, 2003. 153–169. [doi: 10.1007/3-540-36579-6_12]
    [24] Lhoták O, Hendren L. Evaluating the benefits of context-sensitive points-to analysis using a BDD-based implementation. ACM Trans. on Software Engineering and Methodology, 2008, 18(1): 3.
    [25] WALA. Watson Libraries for Analysis (WALA). 2023. https://wala.sourceforge.net/
    [26] He DJ, Lu JB, Xue JL. Qilin: A new framework for supporting fine-grained context-sensitivity in Java pointer analysis. In: Proc. of the 36th European Conf. on Object-oriented Programming (ECOOP 2022). Berlin: Leibniz-Zentrum für Informatik, 2022. 30. [doi: 10.4230/LIPIcs.ECOOP.2022.30]
    [27] Tan T, Li Y. Tai-e: A developer-friendly static analysis framework for Java by harnessing the good designs of classics. In: Proc. of the 32nd ACM SIGSOFT Int’l Symp. on Software Testing and Analysis. Seattle: ACM, 2023. 1093–1105. [doi: 10.1145/3597926.3598120]
    [28] CodeQL. 2024. https://codeql.github.com/
    [29] Szabó T. Incrementalizing production CodeQL analyses. In: Proc. of the 31st ACM Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering. San Francisco: ACM, 2023. 1716–1726. [doi: 10.1145/3611643.3613860]
    [30] Lu Y, Shang L, Xie XW, Xue JL. An incremental points-to analysis with CFL-reachability. In: Proc. of the 22nd Int’l Conf. on Compiler Construction. Rome: Springer, 2013. 61–81.
    [31] Arzt S, Bodden E. Reviser: Efficiently updating IDE-/IFDS-based data-flow analyses in response to incremental program changes. In: Proc. of the 36th Int’l Conf. on Software Engineering. Hyderabad: ACM, 2014. 288–298. [doi: 10.1145/2568225.2568243]
    [32] Pacak A, Erdweg S, Szabó T. A systematic approach to deriving incremental type checkers. Proc. of the ACM on Programming Languages, 2020, 4(OOPSLA): 127.
    [33] Zwaan A, Van Antwerpen H, Visser E. Incremental type-checking for free: Using scope graphs to derive incremental type-checkers. Proc. of the ACM on Programming Languages, 2022, 6(OOPSLA2): 140.
    [34] Zhao ZL, Wang XZ, Xu ZG, Tang ZH, Li YC, Di P. Incremental call graph construction in industrial practice. In: Proc. of the 45th IEEE/ACM Int’l Conf. on Software Engineering: Software Engineering in Practice (ICSE-SEIP). Melbourne: IEEE, 2023. 471–482. [doi: 10.1109/ICSE-SEIP58684.2023.00048]
    [35] Gupta A, Mumick IS, Subrahmanian VS. Maintaining views incrementally. ACM SIGMOD Record, 1993, 22(2): 157–166.
    [36] Szabó T, Erdweg S, Bergmann G. Incremental whole-program analysis in Datalog with lattices. In: Proc. of the 42nd ACM SIGPLAN Int’l Conf. on Programming Language Design and Implementation. New York, ACM, 2021. 1–15.
    [37] Demers A, Reps T, Teitelbaum T. Incremental evaluation for attribute grammars with application to syntax-directed editors. In: Proc. of the 8th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages. Williamsburg: ACM, 1981. 105–116.
    [38] Pugh W, Teitelbaum T. Incremental computation via function caching. In: Proc. of the 16th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages. Austin: ACM, 1989. 315–328. [doi: 10.1145/75277.75305]
    [39] Carlsson M. Monads for incremental computing. ACM SIGPLAN Notices, 2002, 37(9): 26–35.
    [40] Anderson D, Blelloch GE, Baweja A, Acar UA. Efficient parallel self-adjusting computation. In: Proc. of the 33rd ACM Symp. on Parallelism in Algorithms and Architectures. New York: ACM, 2021. 59–70.
    [41] Bhatotia P, Wieder A, Rodrigues R, Acar UA, Pasquin R. Incoop: MapReduce for incremental computations. In: Proc. of the 2nd ACM Symp. on Cloud Computing. Cascais: ACM, 2011. 7. [doi: 10.1145/2038916.2038923]
    [42] McSherry F, Murray DG, Isaacs R, Isard M. Differential dataflow. In: Proc. of the 6th Biennial Conf. on Innovative Data Systems Research. Asilomar: CIDR, 2013.
    [43] Fan WF, Liu MY, Tian C, Xu RQ, Zhou JR. Incrementalization of graph partitioning algorithms. Proc. of the VLDB Endowment, 2020, 13(8): 1261–1274.
    [44] Mariappan M, Vora K. GraphBolt: Dependency-driven synchronous processing of streaming graphs. In: Proc. of the 14th EuroSys Conf. Dresden: ACM, 2019. 25.
    [45] Han WT, Miao YS, Li KW, Wu M, Yang F, Zhou LD, Prabhakaran V, Chen WG, Chen EH. Chronos: A graph engine for temporal graph analysis. In: Proc. of the 9th European Conf. on Computer Systems. Amsterdam: ACM, 2014. 1.
    [46] Ju WY, Li JX, Yu WR, Zhang RC. iGraph: An incremental data processing system for dynamic graph. Frontiers of Computer Science, 2016, 10(3): 462–476.
    [47] Gu R, Zuo ZQ, Jiang X, Yin H, Wang ZK, Wang LZ, Li XD, Huang YH. Towards efficient large-scale interprocedural program static analysis on distributed data-parallel computation. IEEE Trans. on Parallel and Distributed Systems, 2021, 32(4): 867–883.
    Cited by
    Comments
    Comments
    分享到微博
    Submit
Get Citation

沈天琪,王熙灶,宾向荣,卜磊. DDoop: 基于差分式Datalog求解的增量指针分析框架.软件学报,2024,35(6):2608-2630

Copy
Share
Article Metrics
  • Abstract:772
  • PDF: 3092
  • HTML: 1513
  • Cited by: 0
History
  • Received:September 11,2023
  • Revised:October 30,2023
  • Online: January 05,2024
  • Published: June 06,2024
You are the first2032471Visitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063