Image Captioning Based on Visual Relevance and Context Dual Attention
Author: Liu Maofu, Shi Qi, Nie Liqiang

CLC Number: TP391

    Abstract:

    Image captioning has both theoretical significance and practical value, and it has attracted wide attention in computer vision and natural language processing. Existing attention-based image captioning methods combine the current word with the visual cues of the same time step to generate the target word, but they neglect visual relevance and contextual information, which leads to discrepancies between the generated caption and the ground truth. To address this problem, this paper presents the visual relevance and context dual attention (VRCDA) method. The visual relevance attention incorporates the attention vector of the previous time step into conventional visual attention to ensure visual relevance, and the context attention extracts more complete semantic information from the global context so that the context is better exploited. The final image caption is thus generated from both visual relevance and context information. Experiments on the MSCOCO and Flickr30k benchmark datasets demonstrate that VRCDA can effectively describe image semantics and that, compared with several state-of-the-art image captioning methods, it achieves superior performance on all evaluation metrics.
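
    The abstract does not give the VRCDA equations, so the following is only a minimal sketch of the two ideas it names, not the authors' formulation. It assumes additive attention over image-region features in which the previous step's attention vector enters the scoring function, and a simple bilinear attention over earlier decoder hidden states as a stand-in for the context attention. All names, shapes, and weight matrices (V, h, c_prev, H, W_v, W_h, W_c, w, W) are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def visual_relevance_attention(V, h, c_prev, W_v, W_h, W_c, w):
    """Additive visual attention that also conditions on the previous attention vector.

    Assumed shapes (illustrative only):
      V      (k, d)  features of k image regions
      h      (m,)    current decoder hidden state
      c_prev (d,)    attention vector produced at the previous time step
      W_v (d, a), W_h (m, a), W_c (d, a), w (a,)  projection weights
    """
    # The c_prev term is the assumed "visual relevance" ingredient; dropping it
    # gives ordinary additive attention over the regions.
    scores = np.tanh(V @ W_v + h @ W_h + c_prev @ W_c) @ w  # (k,)
    alpha = softmax(scores)                                  # weights over regions
    c = alpha @ V                                            # (d,) new attention vector
    return c, alpha


def context_attention(H, h, W):
    """Bilinear attention over earlier hidden states H (t, m), queried by h (m,).

    This is an assumed stand-in for the paper's context attention.
    """
    beta = softmax(H @ (W @ h))  # (t,) weights over earlier time steps
    ctx = beta @ H               # (m,) summary of the global context
    return ctx, beta
```

    In a decoder loop, the vector c returned at one step would be passed back as c_prev at the next step, and c and ctx would be combined (for example, concatenated) before predicting the next word. How VRCDA actually fuses the two is not stated in the abstract.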

Get Citation

Liu Maofu, Shi Qi, Nie Liqiang. Image Captioning Based on Visual Relevance and Context Dual Attention. Journal of Software (软件学报), 2022, 33(9): 3210-3222

History
  • Received: June 03, 2021
  • Revised: August 15, 2021
  • Online: February 22, 2022
  • Published: September 06, 2022