Abstract:Deep learning (DL) systems have powerful learning and reasoning capabilities and are widely employed in many fields including unmanned vehicles, speech recognition, intelligent robotics, etc. Due to the dataset limit and dependence on manually labeled data, DL systems are prone to unexpected behaviors. Accordingly, the quality of DL systems has received widespread attention in recent years, especially in safety-critical fields. Fuzz testing with strong fault-detecting ability is utilized to test DL systems, which becomes a research hotspot. This study summarizes existing fuzz testing for DL systems in the aspects of test case generation (including seed queue construction, seed selection, and seed mutation), test result determination, and coverage analysis. Additionally, commonly used datasets and metrics are introduced. Finally, the study prospects for the future development of this field.