Abstract: With the rapid development of deep neural networks (DNNs), their accuracy has become comparable to, or has even surpassed, that of humans in some specific tasks. However, like traditional software, DNNs are inevitably prone to defects. If defective DNN models are deployed in safety-critical fields, they may cause serious accidents. Effective methods for detecting defective DNN models are therefore urgently needed. Traditional differential testing methods compare the outputs of the testing targets on the same test input and use those outputs as the basis for difference analysis. However, even DNN models trained with the same program and dataset may produce different outputs for the same test input, so traditional differential testing is difficult to apply directly to detecting defective DNN models. To address this problem, this study proposes interpretation-analysis-based differential testing (IADT), a differential testing method for DNN models. IADT uses interpretation methods to obtain explanations of the models’ behavior and uses statistical methods to test for significant differences between these behavior interpretations in order to detect defective models. Experiments on real defective models show that introducing interpretation methods makes IADT effective in detecting defective DNN models: the F1-value of IADT in detecting defective models is 0.8%–6.4% higher than that of DeepCrime, and the time consumed by IADT is only 4.0%–5.4% of that consumed by DeepCrime.
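The general idea of comparing behavior interpretations rather than raw outputs can be illustrated with a small sketch. It is not the IADT implementation: the abstract does not name the interpretation method or the statistical test, so the sketch assumes toy linear models, a gradient-times-input attribution, and a one-sided permutation test. All function names and the simulated "training" helper are hypothetical.

```python
import math
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def attribution(weights, x):
    # Assumed interpretation method: gradient-times-input for a linear
    # model f(x) = sum(w_i * x_i). A real DNN would need a real explainer.
    return [w * xi for w, xi in zip(weights, x)]


def interpretation_distances(models, baseline, inputs):
    """Mean L2 distance between each model's attributions and the
    average attribution of the baseline group, over all inputs."""
    reference = [
        [mean(col) for col in zip(*(attribution(w, x) for w in baseline))]
        for x in inputs
    ]
    distances = []
    for w in models:
        per_input = [
            math.sqrt(sum((a - r) ** 2 for a, r in zip(attribution(w, x), ref)))
            for x, ref in zip(inputs, reference)
        ]
        distances.append(mean(per_input))
    return distances


def permutation_p_value(reference, suspect, n_permutations=2000, seed=0):
    """One-sided permutation test: are the suspect distances larger on
    average than the reference distances?"""
    rng = random.Random(seed)
    observed = mean(suspect) - mean(reference)
    pooled = list(reference) + list(suspect)
    k = len(reference)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mean(pooled[k:]) - mean(pooled[:k]) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: repeated "training runs" are simulated as noisy copies
# of one weight vector; the suspect program adds a systematic shift.
data_rng = random.Random(1)
dim = 8
true_w = [data_rng.gauss(0, 1) for _ in range(dim)]


def simulate_training(shift=0.0):
    return [w + shift + data_rng.gauss(0, 0.01) for w in true_w]


baseline = [simulate_training() for _ in range(10)]          # reference program
held_out = [simulate_training() for _ in range(10)]          # reference program
suspect = [simulate_training(shift=1.0) for _ in range(10)]  # suspect program
inputs = [[data_rng.gauss(0, 1) for _ in range(dim)] for _ in range(20)]

reference_d = interpretation_distances(held_out, baseline, inputs)
suspect_d = interpretation_distances(suspect, baseline, inputs)
p_value = permutation_p_value(reference_d, suspect_d)
print("suspect program flagged as defective:", p_value < 0.05)
```

The held-out reference models give an estimate of how much interpretations vary between repeated training runs of the same program, so the test asks whether the suspect models deviate by more than that natural variation. The 0.05 significance threshold is also an assumption of this sketch.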