Abstract: Fuzzing is an automated software testing method that aims to detect potential security vulnerabilities, software defects, or abnormal behaviors by feeding large quantities of automatically generated test data into the target software system. However, traditional fuzzing techniques are limited by a low level of automation, low testing efficiency, and low code coverage, which makes them ill-suited to modern large-scale software systems. In recent years, the rapid development of large language models (LLMs) has not only brought significant breakthroughs to natural language processing but also introduced new automation solutions to fuzzing. To improve the effectiveness of fuzzing, existing works have proposed various fuzzing methods that incorporate LLMs, covering modules such as test input generation, defect detection, and post-fuzzing. Nevertheless, the literature lacks a systematic investigation and discussion of LLM-based fuzzing techniques. To fill this gap, this study comprehensively analyzes and summarizes the current state of research on LLM-based fuzzing. The main contents include: (1) summarizing the overall fuzzing process and the LLM-related technologies commonly used in fuzzing research; (2) discussing the limitations of deep-learning-based fuzzing methods that preceded the LLM era; (3) analyzing how LLMs are applied at different stages of fuzzing; and (4) exploring the main challenges and possible future directions for LLM technology in fuzzing.