Survey on Visual Question Answering

doi:10.13328/j.cnki.jos.006215

Volume 32, Issue 8, 2021, pp. 2522-2544. DOI: 10.13328/j.cnki.jos.006215

Fund Project: National Natural Science Foundation of China (61772534, 61732006)

Abstract:

Visual question answering (VQA) is an interdisciplinary research area spanning computer vision and natural language processing, and it has received extensive attention in recent years. In VQA, an algorithm must answer questions about a given image (or video). Since the first VQA dataset was released in 2014, several large-scale datasets have been released over the following five years, and a large number of algorithms have been proposed based on them. Existing work has focused mainly on the development of VQA. In recent years, however, VQA models have been found to rely heavily on language bias and on the distribution of the datasets; in particular, the accuracy of many models dropped sharply after the VQA-CP dataset was released. This paper introduces the algorithms proposed and the datasets released in recent years, with particular attention to research on strengthening model robustness. It summarizes VQA algorithms and describes their motivations, details, and limitations. Finally, it discusses the challenges and prospects of VQA.

Citation:

Bao Xigang, Zhou Chunlai, Xiao Kejing, Qin Biao. Survey on Visual Question Answering. Journal of Software (软件学报), 2021, 32(8): 2522-2544

History

Received: July 9, 2020
Revised: October 2, 2020
Online: January 15, 2021
Published: August 6, 2021

Copyright: Institute of Software, Chinese Academy of Sciences