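Defect Prediction for Solidity Smart Contracts Based on Software Measurement
Corresponding author:

Cui Zhanqi, E-mail: czq@bistu.edu.cn

CLC number:

TP311

Funding:

Jiangsu Province Frontier Leading Technology Basic Research Project (BK202002001); National Natural Science Foundation of China (61702041); Beijing Information Science and Technology University "Qin Xin Talents" Cultivation Project (QXTCP C201906)
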
    Abstract:

    With the rise of blockchain technology, the security of smart contracts has attracted growing attention from researchers and companies, and several studies on defect detection techniques for smart contracts already exist. Software defect prediction is an effective supplement to defect detection: it can optimize the allocation of testing resources and improve the efficiency of software testing. However, no research has yet addressed software defect prediction for smart contracts. To fill this gap, this study proposes a defect prediction method for Solidity smart contracts. First, it designs a metrics suite (smart contract-Solidity, SC-Sol) that targets the variables, functions, structures, and language features specific to Solidity smart contracts, and combines it with a traditional metrics suite that focuses on object-oriented features (code complexity and features of object-oriented program, COOP) to form the COOP-SC-Sol metrics suite. Then, it extracts the relevant metric information from Solidity smart contract code and combines it with defect detection results to construct a Solidity smart contract defect data set. On this basis, seven regression models and six classification models are applied to predict defects in Solidity smart contracts, in order to compare the performance of different metrics suites and different models in predicting the number of defects and defect proneness. Experimental results show that, compared with COOP, COOP-SC-Sol improves the F1-score of the defect prediction model by 8%. In addition, the class imbalance problem in smart contract defect prediction is studied further. The results show that preprocessing the data set with sampling techniques improves the performance of the defect prediction model, and that random under-sampling improves the F1-score by 9%. When predicting proneness to specific types of defects, model performance is affected by class imbalance in the data set; better performance is achieved on data sets in which defective modules account for more than 10% of all modules.
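
    The random under-sampling step and the F1-score mentioned in the abstract can be illustrated with a short sketch. This is a minimal, standard-library-only illustration and not the authors' implementation: the function names `random_undersample` and `f1_score`, the three example metrics, and all data values are hypothetical and chosen only to show the idea.

```python
import random
from collections import Counter


def random_undersample(features, labels, seed=0):
    """Randomly drop samples until every class has as many samples as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(features, labels):
        by_class.setdefault(label, []).append(row)
    # The smallest class decides how many samples each class keeps.
    target = min(len(rows) for rows in by_class.values())
    sampled = []
    for label, rows in by_class.items():
        for row in rng.sample(rows, target):
            sampled.append((row, label))
    rng.shuffle(sampled)
    return [row for row, _ in sampled], [label for _, label in sampled]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (defective) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-contract metric vectors:
# (lines of code, number of functions, number of payable functions).
features = [(120, 8, 1), (45, 3, 0), (300, 20, 4), (60, 4, 0),
            (80, 5, 0), (500, 31, 6), (30, 2, 0), (75, 6, 1)]
labels = [0, 0, 1, 0, 0, 1, 0, 0]  # 1 = defective, 0 = clean (imbalanced: 2 vs 6)

balanced_features, balanced_labels = random_undersample(features, labels, seed=42)
print(Counter(balanced_labels))

# Hypothetical predictions, used only to exercise the F1 computation.
example_f1 = f1_score([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(round(example_f1, 4))
```

    In practice, under-sampling would be applied only to the training split, so that the F1-score is still measured on data with the original class distribution.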

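Cite this article

Yang HW, Cui ZQ, Chen X, Jia MH, Zheng LW, Liu JB. Defect prediction for Solidity smart contracts based on software measurement. Ruan Jian Xue Bao/Journal of Software, 2022, 33(5): 1587-1611 (in Chinese with English abstract).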

History
  • Received: 2021-08-08
  • Last revised: 2021-10-09
  • Published online: 2022-01-28
  • Publication date: 2022-05-06