DiTing: A Semi-Supervised Adversarial Training Framework for Robust Out-of-Distribution Detection

Affiliation: Institute of Software, Chinese Academy of Sciences

Fund Project: 61972386

    Abstract:

    Detecting out-of-distribution (OOD) samples, i.e., samples that fall outside the training distribution, is crucial for deploying deep neural network (DNN) classifiers in open environments. OOD detection can be viewed as a binary classification problem: each input is classified as either "in-distribution (ID)" or "out-of-distribution". The detector itself, however, can be attacked: an adversary can add malicious perturbations to OOD samples so that they bypass it. OOD samples carrying such perturbations are called adversarial OOD samples, and building robust OOD detectors that catch them is a more challenging task. To learn representations that are separable and robust to adversarial perturbations, existing methods typically train DNNs on adversarial OOD samples drawn from the neighborhood of auxiliary clean OOD samples. However, because clean OOD samples and clean ID samples follow different distributions, training on adversarial OOD samples does not make the in-distribution decision boundary sufficiently robust to adversarial perturbations. Adversarial ID samples, generated within the neighborhood of clean ID samples, are OOD samples that lie closer to the in-distribution region and are therefore effective at improving the robustness of the in-distribution decision boundary. Based on this observation, we propose DiTing, a semi-supervised adversarial training approach for building robust OOD detectors that detect both clean and adversarial OOD samples. DiTing treats adversarial ID samples as auxiliary non-ID samples and trains the DNN on them jointly with auxiliary clean OOD samples and adversarial OOD samples to improve the robustness of the OOD detector. Experiments show that DiTing has a significant advantage in detecting adversarial OOD samples generated by stronger attacks, while maintaining state-of-the-art performance on the original classification task and on detecting clean OOD samples. Code is available at: https://gitee.com/zhiyang3344/diting
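
    The joint training set described in the abstract can be illustrated with a small sketch. The abstract does not specify the loss, the attack, or the labeling scheme, so everything below is an assumption made for illustration: a linear softmax model stands in for the DNN, a single signed-gradient step stands in for the attack, and all non-ID samples are mapped to a hypothetical extra class with index `num_classes`. The function names are hypothetical and are not taken from the released code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, x, y):
    """Mean cross-entropy of a linear softmax model with weights W (d x C)."""
    p = softmax(x @ W)
    return float(-np.log(p[np.arange(len(y)), y] + 1e-12).mean())

def input_gradient(W, x, y):
    """Gradient of each sample's cross-entropy with respect to its input."""
    p = softmax(x @ W)
    p[np.arange(len(y)), y] -= 1.0  # dL/dlogits = softmax - onehot(y)
    return p @ W.T

def one_step_attack(W, x, y, eps):
    """One signed-gradient step that raises the loss for label y.
    The result stays within an L-infinity distance eps of x."""
    return x + eps * np.sign(input_gradient(W, x, y))

def build_joint_batch(W, x_id, y_id, x_ood, num_classes, eps):
    """Stack clean ID, adversarial ID, clean OOD, and adversarial OOD samples.
    Assumption: every non-ID sample gets the extra label `num_classes`."""
    ood_label = num_classes
    y_ood = np.full(len(x_ood), ood_label)
    x_adv_id = one_step_attack(W, x_id, y_id, eps)     # pushed away from the true class
    x_adv_ood = one_step_attack(W, x_ood, y_ood, eps)  # pushed away from the OOD class
    x = np.concatenate([x_id, x_adv_id, x_ood, x_adv_ood])
    y = np.concatenate([y_id, np.full(len(x_id), ood_label), y_ood, y_ood])
    return x, y

# Toy usage with random data (K ID classes plus one extra OOD class).
rng = np.random.default_rng(0)
K, d, eps = 3, 5, 0.1
W = rng.normal(size=(d, K + 1))
x_id = rng.normal(size=(4, d))
y_id = rng.integers(0, K, size=4)
x_ood = rng.normal(size=(6, d))
x, y = build_joint_batch(W, x_id, y_id, x_ood, K, eps)
joint_loss = cross_entropy(W, x, y)  # quantity a training step would minimize
```

    In this sketch only the clean ID samples keep their original labels. The adversarial ID samples sit within `eps` of real ID data yet are labeled as non-ID, which is what pushes the learned boundary tightly around the in-distribution region.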

History
  • Received: 2022-10-19
  • Revised: 2023-02-03
  • Accepted: 2023-03-07