Abstract: Fuzzing is an automated software testing method that detects potential security vulnerabilities, software defects, and abnormal behavior by feeding large amounts of automatically generated test data into the target software system. However, traditional fuzzing techniques are limited by low automation, low testing efficiency, and low code coverage, and cannot cope with modern large-scale software systems. In recent years, the rapid development of large language models (LLMs) has brought significant breakthroughs to natural language processing and has also opened up new automation solutions for fuzzing. To improve the effectiveness of fuzzing, existing work has proposed multiple fuzzing methods that incorporate LLMs, covering stages such as test input generation, defect detection, and post-fuzzing. However, a systematic study and discussion of LLM-based fuzzing techniques is still lacking. To fill this gap, this article comprehensively analyzes and summarizes the current state of research on LLM-based fuzzing techniques. The main contents include: (1) summarizing the overall fuzzing process and the LLM-related technologies commonly used in fuzzing research; (2) discussing the limitations of deep-learning-based fuzzing methods from before the LLM era; (3) analyzing how LLMs are applied at different stages of fuzzing; and (4) exploring the main challenges and possible future directions for LLM technology in fuzzing.