Vulnerability Mining Method Based on Code Property Graph and Attention BiLSTM
Author:
Affiliation:

Clc Number:

Fund Project:

National Key Research and Development Program of China (2018YFB0803600); National Natural Science Foundation of China (61772507); Special Promotion of Industrial Technology Innovation Strategic Alliance of Beijing Municipal Science and Technology Commission (Z181100000518032)

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    With the increasingly serious trend of information security, software vulnerability has become one of the main threats to computer security. How to accurately mine vulnerabilities in the program is a key issue in the field of information security. However, existing static vulnerability mining methods have low accuracy when mining vulnerabilities with unobvious vulnerability features. On the one hand, rule-based methods by matching expert-defined code vulnerability patterns in target programs. Its predefined vulnerability pattern is rigid and single, which is unable to cover detailed features and result in problems of low accuracy and high false positives. On the other hand, learning-based methods cannot adequately model the features of the source code and cannot effectively capture the key feature, which makes it fail to accurately mine vulnerabilities with unobvious vulnerability features. To solve this issue, a source code level vulnerability mining method based on code property graph and attention BiLSTM is proposed. It firstly transforms the program source code to code property graph which contains semantic features, and performs program slicing to remove redundant information that is not related to sensitive operations. Then, it encodes the code property graph into the feature tensor with encoding algorithm. After that, a neural network based on BiLSTM and attention mechanism is trained using large-scale feature datasets. Finally, the trained neural network model is used to mine the vulnerabilities in the target program. Experimental results show that the F1 scores of the method reach 82.8%, 77.4%, 82.5%, and 78.0% respectively on the SARD buffer error dataset, SARD resource management error dataset, and their two subsets composed of C programs, which is significantly higher than the rule-based static mining tools Flawfinder and RATS and the learning-based program analysis model TBCNN.

    Reference
    Related
    Cited by
Get Citation

段旭,吴敬征,罗天悦,杨牧天,武延军.基于代码属性图及注意力双向LSTM的漏洞挖掘方法.软件学报,2020,31(11):3404-3420

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:July 08,2019
  • Revised:April 11,2020
  • Adopted:
  • Online: November 07,2020
  • Published: November 06,2020
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063