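Detection of Smart Contract Timestamp Vulnerability Based on Data-flow Path Learning
Authors: Zhang Zhuo, Liu Yepeng, Xue Jianxin, Yan Meng, Chen Jiachi, Mao Xiaoguang
Funding: National Key Research and Development Program of China (2021YFB1714200); China Postdoctoral Science Foundation (2023M732594)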
Abstract:

Smart contracts are decentralized applications deployed in large numbers on blockchain platforms such as Ethereum. Because they carry economic value, vulnerabilities in smart contracts can cause large financial losses and undermine the stability of the Ethereum ecosystem, so it is crucial to detect them before contracts are deployed. Current mainstream detection methods (e.g., Oyente and Securify) rely on manually designed heuristics; they are hard to reuse across application scenarios, time-consuming, and not very accurate. To improve detection effectiveness, this study targets timestamp vulnerabilities and proposes Scruple, a smart contract timestamp vulnerability detection approach based on learning data-flow propagation paths. Scruple first obtains the potential data propagation paths of timestamp vulnerabilities, then prunes these paths and learns them with a pre-trained model that incorporates graph structure, and finally detects whether a smart contract contains a timestamp vulnerability. Compared with existing methods, Scruple has stronger vulnerability capture and generalization abilities. Learning propagation paths is also well targeted: it avoids the problem that arises when learning the whole program dependency graph, where the hierarchy becomes too deep to focus on the vulnerability. To verify the effectiveness of Scruple, this study compares it with 13 state-of-the-art smart contract vulnerability detection methods on a dataset of real-world smart contracts. The experimental results show that Scruple achieves an accuracy of 0.96, a recall of 0.90, and an F1-score of 0.93 in detecting timestamp vulnerabilities, an average relative improvement of 59%, 46%, and 57%, respectively, over the 13 methods.
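The pipeline outlined in the abstract (collect propagation paths that start at the timestamp, prune them, then pass them to a learned model) can be illustrated with a small sketch. This is not Scruple's implementation: the def-use graph, the variable names, the sink set, and the helper `timestamp_paths` are all hypothetical, and the learning step is omitted. It only shows what "data-flow paths from the timestamp to a sensitive operation" could look like as data.

```python
from typing import Dict, List, Set

# Hypothetical def-use edges for a toy contract: "a -> b" means the value of a
# flows into b. None of these names come from the paper.
DATA_FLOW: Dict[str, List[str]] = {
    "block.timestamp": ["seed", "deadline"],
    "seed": ["winner_index"],
    "winner_index": ["transfer"],
    "deadline": ["log"],
    "msg.value": ["pot"],
    "pot": ["transfer"],
}

# Assumed set of sensitive sinks (for example, an Ether transfer).
SINKS: Set[str] = {"transfer"}


def timestamp_paths(graph: Dict[str, List[str]], source: str,
                    sinks: Set[str]) -> List[List[str]]:
    """Enumerate acyclic data-flow paths from `source` and keep only
    those that end at a sink (a simple stand-in for path pruning)."""
    kept: List[List[str]] = []
    stack: List[List[str]] = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node in sinks:
            kept.append(path)
            continue
        for succ in graph.get(node, []):
            if succ not in path:  # skip cycles
                stack.append(path + [succ])
    return kept


paths = timestamp_paths(DATA_FLOW, "block.timestamp", SINKS)
for p in paths:
    print(" -> ".join(p))
```

In this toy graph the path through `deadline` is discarded because it never reaches a sink, while the path through `seed` and `winner_index` is kept as a candidate for a downstream classifier.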

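As a quick consistency check on the reported figures, recall that the F1-score is the harmonic mean of precision and recall. Assuming the reported 0.96 plays the role of precision in that formula (an assumption: the abstract labels it accuracy), the three numbers agree to two decimal places. The function name `f1_score` below is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# 0.96 is treated as precision here (assumption); 0.90 is the reported recall.
f1 = f1_score(0.96, 0.90)
print(round(f1, 2))  # 0.93
```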
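Cite this article

Zhang Z, Liu YP, Xue JX, Yan M, Chen JC, Mao XG. Detection of smart contract timestamp vulnerability based on data-flow path learning. Journal of Software, 2024, 35(5):2325-2339 (in Chinese with English abstract).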

History
  • Received: 2022-07-03
  • Last revised: 2023-02-13
  • Published online: 2023-11-08
  • Publication date: 2024-05-06