Black-box Adversarial Attack Method Based on Evolution Strategy and Attention Mechanism
Authors:
About the authors:

HUANG Li-Feng (1990-), male, Ph.D. candidate, CCF student member; research interests: adversarial learning, autonomous perception and localization.
LIAO Yong-Xian (1996-), female, M.S. candidate; research interests: adversarial training, computer vision.
ZHUANG Wen-Zi (1997-), male, M.S. candidate; research interests: adversarial training, computer vision.
LIU Ning (1973-), male, Ph.D., professor, Ph.D. supervisor, CCF professional member; research interests: adversarial learning, autonomous perception and localization.

Corresponding author:

LIU Ning, E-mail: liuning2@mail.sysu.edu.cn

CLC number:

TP18

Fund projects:

National Natural Science Foundation of China (61772567); Fundamental Research Funds for the Central Universities (19lgjc11)




    Abstract:

Deep neural networks (DNNs) have achieved state-of-the-art results on many computer vision tasks and serve as basic backbones in a wide range of domains. Nevertheless, recent research has demonstrated that DNNs are vulnerable to adversarial attacks, which threatens the security of DNN-based systems. Compared with white-box adversarial attacks, black-box attacks are closer to realistic scenarios, operating under constraints such as no access to the model internals and a limited query budget. However, existing black-box methods not only require a large number of model queries but also produce perturbations that are perceptible to the human visual system. To address these issues, this study proposes a novel method based on evolution strategy, which improves attack performance by exploiting the inherent distribution of the updated gradient directions; this helps the method sample effective solutions with higher probability and learn better search paths. To make the generated adversarial examples less perceptible and remove the redundant perturbations accumulated during a successful attack, the proposed method introduces an attention mechanism: it uses class activation mapping to group the perturbations and then compresses the noise group by group, while ensuring that the resulting images can still fool the target model. Extensive experiments on seven DNNs with different architectures demonstrate the superiority of the proposed method over state-of-the-art black-box adversarial attack approaches (i.e., AutoZOOM, QL-attack, FD-attack, and D-based attack).
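The evolution-strategy search described above can be illustrated with a minimal sketch. This is a simplified NES-style estimator on a flattened image, not the paper's exact algorithm: `loss_fn` stands in for the black-box model query, and all parameter values are illustrative defaults.

```python
import random

def es_black_box_attack(x, loss_fn, eps=0.1, sigma=0.01, lr=0.01,
                        pop=20, iters=25, seed=0):
    """Evolution-strategy black-box attack sketch on a flattened image x.

    loss_fn(image) -> scalar attack loss, queried as a black box;
    higher loss means the target model is closer to being fooled.
    """
    rng = random.Random(seed)
    x_adv = list(x)
    for _ in range(iters):
        grad = [0.0] * len(x)
        # Antithetic Gaussian sampling: each random direction is queried at
        # +sigma and -sigma, giving a Monte-Carlo estimate of the gradient.
        for _ in range(pop):
            n = [rng.gauss(0.0, 1.0) for _ in x]
            l_pos = loss_fn([a + sigma * b for a, b in zip(x_adv, n)])
            l_neg = loss_fn([a - sigma * b for a, b in zip(x_adv, n)])
            w = (l_pos - l_neg) / (2.0 * sigma * pop)
            grad = [g + w * b for g, b in zip(grad, n)]
        # Signed ascent step, projected back into the eps-ball around x
        # and into the valid pixel range [0, 1].
        x_adv = [min(max(a + lr * (1.0 if g > 0 else -1.0), o - eps), o + eps)
                 for a, g, o in zip(x_adv, grad, x)]
        x_adv = [min(max(a, 0.0), 1.0) for a in x_adv]
    return x_adv
```

Each iteration costs 2 × pop model queries, so iters × 2 × pop bounds the query budget, which is the quantity black-box methods try to minimize.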
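The attention-guided compression stage can likewise be sketched. Here `cam` is a flattened class-activation heat map and `is_adversarial` a black-box query to the target model; both names are illustrative, and the fixed grouping and shrinking schedule is a simplification of the paper's procedure.

```python
def compress_perturbation(x, delta, cam, is_adversarial,
                          n_groups=4, shrink=0.5, rounds=3):
    """Group perturbation pixels by attention and greedily shrink them.

    x, delta, and cam are flattened, equally sized lists;
    is_adversarial(image) -> bool is a black-box model query. Groups with
    the lowest class-activation values are compressed first, and a shrink
    is kept only if the image still fools the model.
    """
    # Sort pixel indices by attention and split into equal-size groups.
    order = sorted(range(len(cam)), key=lambda i: cam[i])
    size = (len(order) + n_groups - 1) // n_groups
    groups = [order[i:i + size] for i in range(0, len(order), size)]
    delta = list(delta)
    for _ in range(rounds):
        for group in groups:  # least-attended pixels first
            trial = list(delta)
            for i in group:
                trial[i] *= shrink
            adv_image = [min(max(a + d, 0.0), 1.0) for a, d in zip(x, trial)]
            if is_adversarial(adv_image):
                delta = trial  # keep the smaller perturbation
    return delta
```

Because every trial shrink is verified with a fresh query, the returned perturbation is guaranteed to remain adversarial while its magnitude only decreases.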

References
    [1] Niu L, Veeraraghavan A, Sabharwal A. Webly supervised learning meets zero-shot learning:A hybrid approach for fine-grained classification. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2018. 7171-7180.
    [2] Huang JP, Shi YH, Gao Y. Multi-scale Faster-RCNN algorithm for small object detection. Journal of Computer Research and Development, 2019,56(2):319-327(in Chinese with English abstract).
    [3] Huang L,Yang Y, Wang QJ, Guo F, Gao Y. Indoor scene segmentation based on fully convolutional neural networks. Journal of Image and Graphics, 2019,24(1):64-72(in Chinese with English abstract).
[4] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2017,39(4):640-651.
    [5] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 2012,25:1097-1105.
[6] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In:Proc. of the Int'l Conf. on Learning Representations. 2015. 1-14.
    [7] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2016. 2818-2826.
    [8] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2016. 770-778.
    [9] Song M, Zhong K, Zhang J, et al. In-situ AI:Towards autonomous and incremental deep learning for IoT systems. In:Proc. of the Int'l Symp. on High Performance Computer Architecture (HPCA). IEEE, 2018. 92-103.
    [10] Wang Y, Huang XD, Guo ST. Indoor fingerprint location algorithm based on convolutional neural network. Ruan Jian Xue Bao/Journal of Software, 2018,29:63-72(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/18007.htm
    [11] Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R. Intriguing properties of neural networks. In:Proc. of the Int'l Conf. on Learning Representations. 2014. 1-10.
    [12] Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 2017, 542(7639):115-118.
    [13] Ma YK, Wu LF, Jian M, Liu FH, Yang Z. Algorithm to generate adversarial examples for face-spoofing detection. Ruan Jian Xue Bao/Journal of Software, 2019,30(2):469-480(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5568.htm[doi:10.13328/j.cnki.jos.005568]
    [14] Wang WQ, Wang R, Wang LN, Tang BX. Adversarial examples generation approach for tendency classification on Chinese texts. Ruan Jian Xue Bao/Journal of Software, 2019,30(8):2415-2427(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/5765.htm[doi:10.13328/j.cnki.jos.005765]
    [15] Sharif M, Bhagavatula S, Bauer L, et al. Accessorize to a crime:Real and stealthy attacks on state-of-the-art face recognition. In:Proc. of the ACM SIGSAC Conf. on Computer and Communications Security. ACM, 2016. 1528-1540.
    [16] Athalye A, Engstrom L, Ilyas A, et al. Synthesizing robust adversarial examples. In:Proc. of the Int'l Conf. on Machine Learning. 2018. 284-293.
    [17] Eykholt K, Evtimov I, Fernandes E, et al. Robust physical-world attacks on deep learning visual classification. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2018. 1625-1634.
    [18] Thys S, Van Ranst W, Goedeme T. Fooling automated surveillance cameras:Adversarial patches to attack person detection. In:Proc. of the Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2019. 1-7.
[19] Goodfellow IJ, Shlens J, Szegedy C. Explaining and harnessing adversarial examples. In:Proc. of the Int'l Conf. on Learning Representations. 2015. 1-11.
    [20] Kurakin A, Goodfellow IJ, Bengio S. Adversarial examples in the physical world. In:Proc. of the Artificial Intelligence Safety and Security. Chapman and Hall/CRC, 2018. 99-112.
    [21] Papernot N, McDaniel P, Jha S, Fredrikson M, Celik Z, Swami A. The limitations of deep learning in adversarial settings. In:Proc. of the IEEE European Symp. on Security and Privacy. IEEE, 2016. 372-387.
    [22] Carlini N, Wagner D. Towards evaluating the robustness of neural networks. In:Proc. of the IEEE Symp. on Security and Privacy. IEEE, 2017. 39-57.
    [23] Papernot N, McDaniel P, Goodfellow I, Jha S, Celik Z, Swami A. Practical black-box attacks against machine learning. In:Proc. of the ACM Asia Conf. on Computer and Communications Security. ACM, 2017. 506-519.
    [24] Dong Y, Pang T, Su H, et al. Evading defenses to transferable adversarial examples by translation-invariant attacks. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2019. 4312-4321.
    [25] Zhou W, Hou X, Chen Y, et al. Transferable adversarial perturbations. In:Proc. of the European Conf. on Computer Vision (ECCV). 2018. 452-467.
    [26] Bhagoji AN, He W, Li B, et al. Practical black-box attacks on deep neural networks using efficient query mechanisms. In:Proc. of the European Conf. on Computer Vision. Cham:Springer-Verlag, 2018. 158-174.
    [27] Chen PY, Zhang H, Sharma Y, et al. Zoo:Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In:Proc. of the 10th ACM Workshop on Artificial Intelligence and Security. ACM, 2017. 15-26.
    [28] Tu CC, Ting P, Chen PY, et al. AutoZOOM:Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In:Proc. of the AAAI Conf. on Artificial Intelligence, Vol.33. 2019. 742-749.
[29] Ilyas A, Engstrom L, Athalye A, Lin J. Black-box adversarial attacks with limited queries and information. In:Proc. of the 35th Int'l Conf. on Machine Learning. 2018. 2137-2146.
    [30] Su J, Vargas DV, Sakurai K. One pixel attack for fooling deep neural networks. IEEE Trans. on Evolutionary Computation, 2019, 23(5):828-841.
    [31] Moosavi-Dezfooli SM, Fawzi A, Frossard P. Deepfool:A simple and accurate method to fool deep neural networks. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2016. 2574-2582.
    [32] Moosavi-Dezfooli SM, Fawzi A, Fawzi O, et al. Universal adversarial perturbations. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. 2017. 1765-1773.
    [33] Brendel W, Rauber J, Bethge M. Decision-based adversarial attacks:Reliable attacks against black-box machine learning models. arXiv preprint arXiv:1712.04248, 2017.
    [34] Dong Y, Su H, Wu B, et al. Efficient decision-based black-box adversarial attacks on face recognition. In:Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition. IEEE, 2019. 7714-7722.
    [35] Zhou B, Khosla A, Lapedriza A, et al. Learning deep features for discriminative localization. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2016. 2921-2929.
    [36] Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM:Visual explanations from deep networks via gradient-based localization. In:Proc. of the IEEE Int'l Conf. on Computer Vision, Vol.7. 2017. 618-626.
    [37] Liu A, Liu X, Fan J, et al. Perceptual-sensitive GAN for generating adversarial patches. In:Proc. of the AAAI Conf. on Artificial Intelligence, Vol.33. 2019. 1028-1035.
    [38] Hansen N, Ostermeier A. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 2001,9(2):159-195.
    [39] Dong Y, Liao F, Pang T, et al. Boosting adversarial attacks with momentum. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2018. 9185-9193.
    [40] Shi Y, Wang S, Han Y. Curls & Whey:Boosting black-box adversarial attacks. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2019. 6519-6527.
    [41] Deng J, Dong W, Socher R, et al. ImageNet:A large-scale hierarchical image database. In:Proc. of the Conf. on Computer Vision and Pattern Recognition. IEEE, 2009. 248-255.
    [42] Florian T, Alexey K, Nicolas P, Dan B, Patrick M. Ensemble adversarial training:Attacks and defenses. In:Proc. of the Int'l Conf. on Learning Representations. 2018. 1-12.
    [43] Xie C, Yuille A. Intriguing properties of adversarial training at scale. In:Proc. of the Int'l Conf. on Learning Representations. 2020. 1-14.
    [44] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards deep learning models resistant to adversarial attacks. In:Proc. of the Int'l Conf. on Learning Representations. 2018. 1-28.
Cite this article:

Huang LF, Zhuang WZ, Liao YX, Liu N. Black-box adversarial attack method based on evolution strategy and attention mechanism. Ruan Jian Xue Bao/Journal of Software, 2021,32(11):3512-3529 (in Chinese with English abstract).
History
  • Received: 2019-09-29
  • Revised: 2020-04-02
  • Published online: 2021-11-05
  • Published: 2021-11-06