Dual-threshold Adversarial Example Detection Based on Image Transformation

CLC number:

TP309

Funding:

State-funded Postdoctoral Researcher Program (GZC20230922); China Postdoctoral Science Foundation, 75th General Program (2024M751050); Fundamental Research Funds for the Central Universities, Central China Normal University (CCNU24XJ001); Basic Research Funds of Central China Normal University, Natural Sciences (CCNU24ai010)

    Abstract:

    Existing adversarial example detection methods based on image transformation exploit the observation that image transformations substantially change the feature distribution of adversarial examples but only slightly change that of benign examples. They detect adversarial examples by computing the feature distance of a sample before and after transformation. As research on adversarial attacks has advanced, however, researchers have paid more attention to strengthening the robustness of attacks, so that some attacks are effectively “immune” to image transformation, and existing methods struggle to detect such robust adversarial examples. This paper finds that current adversarial examples are overly robust: under image transformation, the feature distribution distance of a highly robust adversarial example is far smaller than that of a benign example, which violates the distribution pattern that benign examples follow. Based on this key finding, the paper proposes a dual-threshold adversarial example detection method based on image transformation. It adds a lower threshold to the conventional single-threshold scheme to form a dual-threshold detection interval, and any sample whose feature distribution distance falls outside the interval is judged to be adversarial. Extensive experiments are conducted on the VGG19, DenseNet, and ConvNeXt image classification models. The results show that the method retains the detection ability of existing single-threshold schemes while also detecting highly robust adversarial examples effectively.
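
    The dual-threshold decision rule described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the L2 norm as the feature distance, the percentile-based calibration of the two thresholds on benign samples, and the names `feature_distance`, `calibrate_thresholds`, and `is_adversarial`. The `extract` and `transform` arguments are hypothetical stand-ins for a model's feature extractor and an image transformation.

```python
import numpy as np


def feature_distance(extract, transform, x):
    """Distance between the features of x and of its transformed copy.

    Assumption: the L2 norm is used as the distance; the paper may use
    a different measure.
    """
    return float(np.linalg.norm(extract(x) - extract(transform(x))))


def calibrate_thresholds(benign_distances, lower_pct=2.5, upper_pct=97.5):
    """Pick a lower and an upper threshold from benign feature distances.

    Assumption: thresholds are percentiles of the benign distances; the
    percentile values are illustrative defaults, not the paper's settings.
    """
    lower, upper = np.percentile(benign_distances, [lower_pct, upper_pct])
    return float(lower), float(upper)


def is_adversarial(distance, lower, upper):
    """Flag a sample whose distance lies outside the interval [lower, upper].

    distance > upper: the classic single-threshold case (features move a lot).
    distance < lower: the overly robust case (features barely move).
    """
    return distance < lower or distance > upper


if __name__ == "__main__":
    # Toy stand-ins: "features" are the flattened pixels, and the
    # "transformation" is a horizontal flip.
    extract = lambda img: img.ravel()
    transform = lambda img: img[:, ::-1]

    rng = np.random.default_rng(0)
    benign = [rng.random((8, 8)) for _ in range(200)]
    distances = [feature_distance(extract, transform, img) for img in benign]
    lower, upper = calibrate_thresholds(distances)

    # A perfectly flip-symmetric image has distance 0, below the lower bound.
    symmetric = np.ones((8, 8))
    d = feature_distance(extract, transform, symmetric)
    print(is_adversarial(d, lower, upper))
```

    Setting `lower` to negative infinity reduces the rule to a single-threshold detector, which is one way to read the abstract's claim that the method stays compatible with existing single-threshold schemes.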

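Cite this article

Liu Hui (刘会), Wen Fuju (文福举), Du Hongqin (杜红琴), Wang Jinghua (王敬华), Zhao Bo (赵波). Dual-threshold Adversarial Example Detection Based on Image Transformation. Journal of Software (软件学报),,():1-16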

History
  • Received: 2024-05-06
  • Last revised: 2024-07-14
  • Published online: 2025-01-24