Abstract: Previous pre-trained language models (PLMs) have demonstrated excellent performance on numerous natural language understanding (NLU) tasks. However, they generally suffer from shortcut learning, i.e., they learn spurious correlations between non-robust features and labels, which leads to poor generalization in out-of-distribution (OOD) test scenarios. Recently, the outstanding performance of generative large language models (LLMs) on understanding tasks has attracted widespread attention, but the extent to which they are affected by shortcut learning has not been fully studied. In this paper, the shortcut learning effect of generative LLMs is investigated for the first time on three NLU tasks, using the LLaMA-series and FLAN-T5 models as representatives. The results show that the shortcut learning problem persists in generative LLMs. Therefore, a hybrid data augmentation framework based on controllable explanations is proposed as a mitigation strategy for shortcut learning in generative LLMs. The framework is data-centric: it constructs a small-scale mixed dataset composed of model-generated controllable explanation data and a portion of the original prompting data for model fine-tuning. Experimental results on three representative NLU tasks show that the framework effectively mitigates shortcut learning and significantly improves the robustness and generalization of the model in OOD test scenarios, without sacrificing, and in some cases even improving, performance in in-distribution test scenarios. The code is available at https://github.com/Mint9996/HEDA.