Spoken language understanding (SLU) is a key component of task-oriented dialogue systems and consists mainly of two subtasks: slot filling and intent detection. The prevailing approach models the two subtasks jointly. Although joint models perform well on both subtasks, two problems remain: errors propagate through the interaction between intent detection and slot filling, and in multi-intent utterances intent information is matched to the wrong slots. To address these problems, this study proposes WISM, a joint model for multi-intent detection and slot filling based on graph attention networks. WISM first establishes a word-level one-to-one mapping between fine-grained (word-level) intents and slots, which corrects mismatches between multi-intent information and slots. It then constructs a word-intent-slot interaction graph and applies a fine-grained graph attention network to build bidirectional connections between the two tasks, which reduces error propagation during their interaction. Experiments on the MixSNIPS and MixATIS datasets show that WISM improves semantic accuracy over the latest existing models by 2.58% and 3.53%, respectively. In addition to improving semantic accuracy, the model makes the mapping between multi-intent information and slots explicit.
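To make the word-intent-slot interaction graph more concrete, the following is a minimal sketch of one graph attention step over such a graph. It is not the WISM implementation. The edge layout (each word linked to its own intent and slot nodes and to adjacent words), the function names `build_interaction_graph` and `gat_layer`, the feature size `DIM`, and the random features are all assumptions made for illustration. The learned projection matrix and output nonlinearity of a full graph attention layer are omitted for brevity.

```python
import math
import random

DIM = 4  # hypothetical feature size, chosen only for this sketch


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def build_interaction_graph(num_words, window=1):
    """Assumed layout: word i, intent i and slot i are fully connected
    (a one-to-one word-level link), and each word also sees words
    within `window` positions. Every node has a self-loop."""
    neighbors = {}
    for i in range(num_words):
        w, it, sl = ("word", i), ("intent", i), ("slot", i)
        neighbors[w] = [w, it, sl]
        neighbors[it] = [it, w, sl]
        neighbors[sl] = [sl, w, it]
        for j in range(max(0, i - window), min(num_words, i + window + 1)):
            if j != i:
                neighbors[w].append(("word", j))
    return neighbors


def gat_layer(features, neighbors, attn_src, attn_dst):
    """Single-head attention: e_ij = LeakyReLU(a_src.h_i + a_dst.h_j),
    softmax over the neighbors of i, then a weighted sum of h_j."""
    updated = {}
    for node, nbrs in neighbors.items():
        scores = [
            leaky_relu(dot(attn_src, features[node]) + dot(attn_dst, features[n]))
            for n in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        updated[node] = [
            sum(w * features[n][d] for w, n in zip(weights, nbrs))
            for d in range(DIM)
        ]
    return updated


# Toy utterance with two intents; features are random placeholders.
tokens = ["play", "jazz", "then", "check", "weather"]
rng = random.Random(0)
graph = build_interaction_graph(len(tokens))
features = {node: [rng.uniform(-1, 1) for _ in range(DIM)] for node in graph}
attn_src = [rng.uniform(-1, 1) for _ in range(DIM)]
attn_dst = [rng.uniform(-1, 1) for _ in range(DIM)]
updated = gat_layer(features, graph, attn_src, attn_dst)
```

Because intent node i and slot node i attend to each other directly, information flows in both directions between the two tasks at the word level, which is the property the abstract attributes to the fine-grained graph attention network.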