Multi-modal Reliability-aware Affective Computing
Author:
Affiliation:

CLC Number: TP18

Fund Project:

Abstract:

Multi-modal affective computing is a fundamental research task in the field of affective computing that uses multi-modal signals to understand the sentiment of user-generated videos. Although existing multi-modal affective computing approaches achieve good performance on benchmark datasets, they generally ignore the problem of modal reliability bias, both when designing complex fusion strategies and when learning modal representations. This study argues that, compared with text, the acoustic and visual modalities usually express sentiment more faithfully: speech and vision are therefore of high reliability for affective computing, while text is of low reliability. Moreover, the feature extraction tools for different modalities differ in learning capacity, so the textual modality ends up with stronger representations than the acoustic and visual modalities (e.g., GPT-3 versus ResNet). This further exacerbates the modal reliability bias and hinders accurate sentiment prediction. To mitigate this bias, this study proposes a model-agnostic multi-modal reliability-aware affective computing approach (MRA) based on cumulative learning. MRA captures the modal reliability bias with a dedicated textual-modality branch and, during training, gradually shifts its focus from the sentiment expressed in the low-reliability textual modality to that expressed in the high-reliability acoustic and visual modalities. In this way, MRA effectively alleviates the inaccurate sentiment predictions caused by the low-reliability textual modality. Extensive comparative experiments on multiple benchmark datasets demonstrate that MRA effectively highlights the importance of the high-reliability acoustic and visual modalities and mitigates the bias of the low-reliability textual modality; being model-agnostic, it significantly improves the performance of multi-modal affective computing, indicating its effectiveness and generality.
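To make the cumulative-learning idea in the abstract concrete, the following is a minimal PyTorch sketch, not the paper's actual implementation: the class and branch names, feature dimensions, and the parabolic alpha schedule are all assumptions. The only mechanism carried over from the abstract is that a text-only branch dominates the prediction early in training and the acoustic-visual branch gradually takes over.

```python
import torch
import torch.nn as nn


class CumulativeFusionHead(nn.Module):
    """Hypothetical sketch of a cumulative-learning fusion head.

    Two classification branches: a low-reliability textual branch and a
    high-reliability acoustic-visual branch. A coefficient alpha blends
    their logits and decays over training, shifting the model's focus
    from text to acoustics/vision.
    """

    def __init__(self, text_dim: int = 768, av_dim: int = 256, num_classes: int = 3):
        super().__init__()
        self.text_head = nn.Linear(text_dim, num_classes)  # textual branch
        self.av_head = nn.Linear(av_dim, num_classes)      # acoustic-visual branch

    def forward(self, text_feat, av_feat, epoch: int, total_epochs: int):
        # Parabolic decay (assumed schedule): alpha = 1 at epoch 0
        # (focus on text), alpha -> 0 near the end (focus on audio/vision).
        alpha = 1.0 - (epoch / total_epochs) ** 2
        return alpha * self.text_head(text_feat) + (1.0 - alpha) * self.av_head(av_feat)


# Toy usage: blended logits early vs. late in training.
head = CumulativeFusionHead()
text_feat, av_feat = torch.randn(4, 768), torch.randn(4, 256)
early = head(text_feat, av_feat, epoch=1, total_epochs=50)   # dominated by text branch
late = head(text_feat, av_feat, epoch=49, total_epochs=50)   # dominated by audio-visual branch
```

The decaying-alpha blend follows the general cumulative-learning recipe (as popularized by bilateral-branch networks for long-tailed recognition); the exact schedule and branch architectures used by MRA would be found in the full paper.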

Get Citation

Luo JM, Wang JJ, Zhou GD. Multi-modal reliability-aware affective computing. Journal of Software, ,():1-17 (in Chinese)

History
  • Received: April 03, 2023
  • Revised: July 06, 2023
  • Accepted:
  • Online: May 08, 2024
  • Published: