2022, 33(5):1587-1611.
DOI: 10.13328/j.cnki.jos.006550
Abstract:
With the rise of blockchain technology, more and more researchers and companies pay attention to the security of smart contracts. Currently, there are some studies on smart contract defect detection and testing techniques. Software defect prediction technology is an effective supplement to the defect detection techniques, which can optimize the allocation of testing resources and improve the efficiency of software testing. However, there is no research on software defect prediction for the smart contract. To address this problem, this study proposes a defect prediction method for Solidity smart contracts. First, it designs a metrics suite (smart contract-Solidity, SC-Sol) which considers the variables, functions, structures, and features of Solidity smart contracts, and SC-Sol is combined with the traditional metrics suite (code complexity and features of object-oriented program, COOP), which consider the object-oriented features, into COOP-SC-Sol metrics suite. Then, it extracts relevant metric meta-information from the Solidity code and performs defect detection to obtain the defects information to construct a Solidity smart contracts defect data set. On this basis, seven regression models and six classification models are applied to predict the defects of Solidity smart contracts to verify the performance differences of different metrics suites and different models for predicting the number and tendency of defects. Experimental results show that compared with the COOP, COOP-SC-Sol can improve the performance of the defect prediction model by 8% in terms of the F1-score. In addition, the problem of class imbalance in smart contract defect prediction is further studied. The result shows that the random under-sampling method can improve the performance of the defect prediction model by 9% in F1-score. In predicting the tendency of specific types of defects, the performance of the model is affected by the imbalance of data sets. Better performance is achieved in predicting the types of defects which the percentage of defect modules is greater than 10%.