Adversarial Example Generation Method Based on Sparse Perturbation (基于稀疏扰动的对抗样本生成方法)
Authors: Ji Shunhui, Hu Liming, Zhang Pengcheng, Qi Rongzhi
Affiliation:

About the authors:

Ji Shunhui (1987-), female, Ph.D., associate professor, CCF professional member; her research interests include software modeling, testing, and verification. Hu Liming (1997-), male, master's student, CCF student member; his research interests include software testing for artificial intelligence. Zhang Pengcheng (1981-), male, Ph.D., doctoral supervisor, CCF senior member; his research interests include software testing for artificial intelligence, service computing, and data science. Qi Rongzhi (1980-), male, Ph.D., lecturer, CCF professional member; his research interests include entity recognition, relation extraction, and intelligent software engineering.

Corresponding author:

Zhang Pengcheng, E-mail: pchzhang@hhu.edu.cn

Funding:

National Natural Science Foundation of China (U21B2016, 61702159); Fundamental Research Funds for the Central Universities (B220202072, B210202075); Natural Science Foundation of Jiangsu Province (BK20191297, BK20170893)



Abstract:

In recent years, deep neural networks (DNNs) have made great progress in the image domain. However, studies show that DNNs are susceptible to adversarial examples and exhibit poor robustness. By generating adversarial examples to attack a DNN, its robustness can be evaluated, and corresponding defense methods can then be adopted to improve it. Existing adversarial example generation methods still suffer from defects such as insufficient sparsity and excessive magnitude of the generated perturbations. This study proposes SparseAG (sparse perturbation based adversarial example generation), an adversarial example generation method that produces relatively sparse, small-magnitude perturbations for image examples. Specifically, SparseAG first selects perturbation points iteratively, based on the gradient of the loss function with respect to the input image, to generate an initial adversarial example. In each iteration, the candidate set of new perturbation points is determined in descending order of gradient magnitude, and the perturbation that minimizes the loss function value is added to the image. Second, a perturbation optimization strategy is applied to the initial perturbation scheme to improve the sparsity and realism of the adversarial example: perturbations are refined according to their importance so as to escape local optima, and redundant perturbations as well as redundant perturbation magnitudes are further reduced. The method is evaluated on the CIFAR-10 and ImageNet datasets under both targeted and non-targeted attack scenarios. Experimental results show that SparseAG achieves a 100% attack success rate on both datasets and in both attack scenarios, and that the sparsity and overall magnitude of the generated perturbations are better than those of the comparison methods.
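The two-stage procedure described in the abstract can be illustrated with a short sketch. The following PyTorch code is only a minimal illustration under our own assumptions, not the authors' implementation of SparseAG: the function names (sparse_attack, prune_redundant), the fixed signed step size, the candidate-set size, and the iteration cap are hypothetical choices, perturbations are applied per tensor element rather than per pixel, and the paper's second-stage perturbation optimization is reduced here to a simple pruning pass.

```python
import torch
import torch.nn.functional as F


def sparse_attack(model, x, label, target=None,
                  candidate_size=10, step=0.2, max_points=200):
    """Greedy, gradient-guided point selection (first stage, simplified).

    x      -- input image, shape (1, C, H, W), values in [0, 1]
    label  -- true label, LongTensor of shape (1,)
    target -- optional target label for a targeted attack
    """
    model.eval()
    adv = x.clone().detach()
    perturbed = set()  # flat indices that have already been modified

    def loss_fn(img):
        logits = model(img)
        if target is not None:                  # targeted: drive target-class loss down
            return F.cross_entropy(logits, target)
        return -F.cross_entropy(logits, label)  # non-targeted: drive true-class loss up

    def succeeded(img):
        with torch.no_grad():
            pred = model(img).argmax(dim=1).item()
        return pred == target.item() if target is not None else pred != label.item()

    for _ in range(max_points):
        if succeeded(adv):
            return adv

        # Gradient of the loss with respect to the current adversarial image.
        adv_req = adv.clone().requires_grad_(True)
        grad = torch.autograd.grad(loss_fn(adv_req), adv_req)[0].flatten()

        # Candidate set: not-yet-perturbed positions with the largest |gradient|.
        order = torch.argsort(grad.abs(), descending=True).tolist()
        candidates = [i for i in order if i not in perturbed][:candidate_size]

        # Try a signed step at each candidate; keep the one that lowers the loss most.
        best = None
        for idx in candidates:
            trial = adv.flatten().clone()
            trial[idx] = (trial[idx] - step * grad[idx].sign()).clamp(0.0, 1.0)
            trial = trial.view_as(adv)
            with torch.no_grad():
                loss_val = loss_fn(trial).item()
            if best is None or loss_val < best[0]:
                best = (loss_val, idx, trial)

        _, idx, adv = best
        perturbed.add(idx)

    return adv  # the attack may still have failed after max_points iterations


def prune_redundant(model, x, adv, label, target=None):
    """Simplified stand-in for the second stage: restore a perturbed position
    to its original value whenever the attack still succeeds without it."""
    flat_x, flat_adv = x.flatten(), adv.clone().flatten()
    changed = (flat_x != flat_adv).nonzero().flatten().tolist()
    for idx in changed:
        trial = flat_adv.clone()
        trial[idx] = flat_x[idx]
        with torch.no_grad():
            pred = model(trial.view_as(adv)).argmax(dim=1).item()
        still_adv = pred == target.item() if target is not None else pred != label.item()
        if still_adv:
            flat_adv = trial
    return flat_adv.view_as(adv)
```

Assuming model maps images in [0, 1] to class logits, a call such as adv = prune_redundant(model, x, sparse_attack(model, x, label), label) would produce a non-targeted sparse adversarial example under this sketch's assumptions.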

How to cite this article:

Ji Shunhui, Hu Liming, Zhang Pengcheng, Qi Rongzhi. Adversarial example generation method based on sparse perturbation. Journal of Software (软件学报), 2023, 34(9): 4003-4017 (in Chinese).
History
  • Received: 2022-09-04
  • Revised: 2022-10-13
  • Published online: 2023-01-13
  • Published: 2023-09-06