Semantic-modulation-based Weakly Supervised Semantic Segmentation

CLC number: TP391

Funding: National Key Research and Development Program of China (2022YFC2405600); National Natural Science Foundation of China (62272235, 62102364, U21B2044); Zhejiang Provincial Natural Science Foundation of China (LY22F020016)

    Abstract:

    Weakly supervised semantic segmentation with image-level labels usually relies on convolutional neural networks (CNNs) to generate class activation maps (CAMs) that localize target objects. The main challenge is that CNNs perceive global information poorly, so the activated foreground regions are too small. Recent Transformer-based methods use self-attention to capture global dependencies and thereby address this inherent limitation of CNNs. However, the initial CAM generated by a Transformer introduces substantial background noise around the target region, so using it directly gives unsatisfactory results. This study jointly uses the class-to-patch attention and the patch-to-patch attention produced by the Transformer to refine the initial CAM. Because the raw class-to-patch attention contains errors, a semantic modulation strategy is designed that uses the semantic context carried by the patch-to-patch attention to modulate the class-to-patch attention and correct those errors. The resulting CAM accurately covers more of the target region. On this basis, a novel Transformer-based weakly supervised semantic segmentation model is constructed. The proposed method achieves an mIoU of 72.7% and 71.9% on the PASCAL VOC 2012 validation and test sets, respectively, and 42.3% on the MS COCO 2014 validation set, which is competitive with current state-of-the-art weakly supervised semantic segmentation results.
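The abstract names the ingredients of the refinement (an initial class activation map, class-to-patch attention, and patch-to-patch attention) but gives no formulas. The following NumPy sketch shows one plausible way such a refinement could be wired together. The function names `row_normalize` and `refine_cam`, the exact form of the modulation step, and the toy numbers are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def row_normalize(a, eps=1e-8):
    """Scale each row so that it sums to (approximately) 1."""
    return a / (a.sum(axis=-1, keepdims=True) + eps)


def refine_cam(cam, class_to_patch, patch_to_patch, eps=1e-8):
    """Hypothetical CAM refinement; an illustrative assumption only.

    cam:            (C, N) initial class activation map over N patches
    class_to_patch: (C, N) attention from each class token to each patch
    patch_to_patch: (N, N) attention from each patch to every patch
    """
    affinity = row_normalize(patch_to_patch, eps)
    # Assumed "semantic modulation": each patch's class-to-patch score is
    # replaced by the affinity-weighted average over the patches it attends
    # to, so isolated erroneous scores are pulled toward their context.
    modulated = class_to_patch @ affinity.T
    # Assumed joint refinement: gate the CAM with the modulated attention,
    # then propagate activations along the patch affinity.
    refined = (cam * modulated) @ affinity.T
    # Min-max normalize each class map to the range [0, 1].
    lo = refined.min(axis=1, keepdims=True)
    hi = refined.max(axis=1, keepdims=True)
    return (refined - lo) / (hi - lo + eps)


# Toy data (made up): one class, four patches. Patches 0 and 1 belong to
# the object; patches 2 and 3 are background, and patch 2 is noisy in the CAM.
cam = np.array([[0.9, 0.6, 0.5, 0.1]])
class_to_patch = np.array([[0.5, 0.4, 0.05, 0.05]])
patch_to_patch = np.array([
    [0.60, 0.30, 0.05, 0.05],
    [0.30, 0.60, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.30, 0.60],
])

refined = refine_cam(cam, class_to_patch, patch_to_patch)
print(np.round(refined, 3))
```

In this toy setup the noisy background activation at patch 2 is strongly suppressed, while both object patches stay high. That matches the qualitative behavior the abstract describes, under the assumptions stated above.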

Cite this article

李军侠, 苏京峰, 崔滢, 刘青山. Semantic-modulation-based Weakly Supervised Semantic Segmentation. 软件学报 (Journal of Software), ,():1-15

History
  • Received: 2023-09-08
  • Last revised: 2024-01-11
  • Published online: 2025-01-08