Towards Robust Adversarial Training via Dual-label Supervised and Geometry Constraint
Authors:
About the authors:

Cao Liujuan (曹刘娟) (1983-), female, Ph.D., associate professor, CCF professional member; her main research interests include machine learning and pattern recognition;
Zhang Baochang (张宝昌) (1976-), male, Ph.D., research fellow, CCF professional member; his main research interests include visual perception and edge computing;
Kuang Huafeng (匡华峰) (1994-), male, Ph.D. candidate; his main research interests include adversarial learning and machine learning;
Huang Feiyue (黄飞跃) (1979-), male, Ph.D., senior engineer, professional member; his main research interests include machine learning and computer vision;
Liu Hong (刘弘) (1989-), male, Ph.D.; his main research interests include machine learning and hashing-based retrieval;
Wu Yongjian (吴永坚) (1982-), male, Ph.D., research fellow, CCF professional member; his main research interests include machine learning and computer vision;
Wang Yan (王言) (1988-), male, Ph.D., engineer; his main research interests include multimedia content retrieval and machine vision;
Ji Rongrong (纪荣嵘) (1983-), male, Ph.D., professor, doctoral supervisor, CCF professional member; his main research interests include computer vision and machine learning.

Corresponding author:

Kuang Huafeng (匡华峰), E-mail: skykuang@stu.xmu.edu.cn

Funding:

National Science Fund for Distinguished Young Scholars of China (62025603); National Natural Science Foundation of China (U1705262, 62072386, 62072387, 62072389, 62002305, 61772443, 61802324, 61702136); Guangdong Basic and Applied Basic Research Foundation (2019B1515120049); Fundamental Research Funds for the Central Universities of China (20720200077, 20720200090, 20720200091)


Abstract:

Recent studies have shown that adversarial training is an effective method for defending against adversarial example attacks. However, existing adversarial training strategies improve model robustness at the cost of degraded generalization. Moreover, mainstream adversarial training methods treat each training example independently and ignore the relationships between samples, so the model cannot fully exploit the geometric relations among samples to learn a more robust model for defending against adversarial attacks. This work therefore focuses on keeping the geometric structure among samples stable during adversarial training, with the goal of improving model robustness. Specifically, a new geometric structure constraint is designed for adversarial training; its aim is to keep the feature-space distributions of natural and adversarial samples consistent. In addition, a dual-label supervised learning method is proposed, which uses the label of the natural sample (the true label) and the label assigned to the adversarial sample (the wrong label) to jointly supervise model training. The characteristics of the dual-label supervised learning method are analyzed, and a theoretical explanation of the working mechanism of adversarial examples is attempted. Extensive experiments on several benchmark datasets show that, compared with existing methods, the proposed approach effectively improves model robustness while maintaining good generalization accuracy. Code is available at: https://github.com/SkyKuang/DGCAT.
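The abstract describes two mechanisms: a geometric structure constraint that keeps the feature-space distributions of natural and adversarial samples consistent, and a dual-label loss that supervises the adversarial example with both its true label and the wrong label it is assigned. The PyTorch-style sketch below is only a minimal illustration of one way such a training step could look; it is not the authors' DGCAT implementation (see the linked repository for that), and the model API (return_features=True), the PGD settings, the pairwise-distance form of the geometry constraint, and the weights beta and lambda_geo are all assumptions made for illustration.

```python
# Minimal, hypothetical sketch (NOT the authors' DGCAT code) of an
# adversarial training step combining (a) dual-label supervision and
# (b) a geometry-structure consistency constraint, as described above.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Generate L-infinity bounded adversarial examples with PGD."""
    x_adv = torch.clamp(x + torch.empty_like(x).uniform_(-eps, eps), 0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
    return x_adv.detach()


def geometry_consistency(feat_nat, feat_adv):
    """Align the pairwise-distance structure of the natural and adversarial
    feature batches: one concrete way to keep their feature-space
    distributions consistent."""
    return F.mse_loss(torch.cdist(feat_adv, feat_adv), torch.cdist(feat_nat, feat_nat))


def dual_label_loss(logits_adv, y_true, y_wrong, beta=0.7):
    """Supervise the adversarial example with both its true label and the
    (wrong) label the current model assigns to it."""
    return beta * F.cross_entropy(logits_adv, y_true) + \
        (1.0 - beta) * F.cross_entropy(logits_adv, y_wrong)


def train_step(model, optimizer, x, y, lambda_geo=1.0):
    # 1. Craft adversarial examples and record the (possibly wrong) labels
    #    the model currently assigns to them.
    model.eval()
    x_adv = pgd_attack(model, x, y)
    with torch.no_grad():
        y_wrong = model(x_adv).argmax(dim=1)

    # 2. Joint loss: natural cross-entropy + dual-label adversarial loss
    #    + geometry-structure consistency between the two feature batches.
    model.train()
    feat_nat, logits_nat = model(x, return_features=True)   # assumed model API
    feat_adv, logits_adv = model(x_adv, return_features=True)
    loss = (F.cross_entropy(logits_nat, y)
            + dual_label_loss(logits_adv, y, y_wrong)
            + lambda_geo * geometry_consistency(feat_nat, feat_adv))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, beta balances the true-label and wrong-label terms and lambda_geo weighs the geometry constraint; the actual constraint form, loss weighting, and attack schedule used in the paper should be taken from the released repository (https://github.com/SkyKuang/DGCAT).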

Cite this article:

Cao LJ, Kuang HF, Liu H, Wang Y, Zhang BC, Huang FY, Wu YJ, Ji RR. Towards robust adversarial training via dual-label supervised and geometry constraint. Ruan Jian Xue Bao/Journal of Software, 2022, 33(4): 1218-1230 (in Chinese with English abstract).

History
  • Received: 2021-05-30
  • Last revised: 2021-07-16
  • Published online: 2021-10-26
  • Publication date: 2022-04-06