浙江省重点研发计划(2022C01018); 国家自然科学基金(U21B2001, 62102359)
深度神经网络是人工智能领域的一项重要技术, 它被广泛应用于各种图像分类任务. 但是, 现有的研究表明深度神经网络存在安全漏洞, 容易受到对抗样本的攻击, 而目前并没有研究针对图像对抗样本检测进行体系化分析. 为了提高深度神经网络的安全性, 针对现有的研究工作, 全面地介绍图像分类领域的对抗样本检测方法. 首先根据检测器的构建方式将检测方法分为有监督检测与无监督检测, 然后根据其检测原理进行子类划分. 最后总结对抗样本检测领域存在的问题, 在泛化性和轻量化等方面提出建议与展望, 旨在为人工智能安全研究提供帮助.
As an important technology in the field of artificial intelligence (AI), deep neural networks are widely used in various image classification tasks. However, existing studies have shown that deep neural networks have security vulnerabilities and are vulnerable to adversarial examples. At present, there is no research on the systematic analysis of adversarial example detection of images. To improve the security of deep neural networks, this study, based on the existing research work, comprehensively introduces adversarial example detection methods in the field of image classification. First, the detection methods are divided into supervised detection and unsupervised detection by the construction method of the detector, which are then classified into subclasses according to detection principles. Finally, the study summarizes the problems in adversarial example detection and provides suggestions and an outlook in terms of generalization and lightweight, aiming to assist in AI security research.